Source: http://www.niehonglei.info/archives/765.html
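I have learned the hard way what it costs to keep missing important exam registration notices. So this code fetches the registration page of the Shanghai Municipal Educational Examinations Authority, parses it, and sends an SMS to the people taking the relevant exams. The script sits on a server and runs on a daily schedule, so I hear about new registration announcements promptly. It touches on a few PHP topics: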
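fsockopen: introduced in the previous post. It can be used to simulate a web service call, and it can also be used to fetch a page.
Regular expressions: preg_match extracts the data from the fetched page.
Encoding conversion with iconv: the fetched page is GBK-encoded and shows up as garbled text in the console unless it is converted. The second use is the SMS interface: the one I use expects GBK data, so the message has to be converted from UTF-8 back to GBK.
PHP file functions: file_exists checks whether a file already exists, fopen opens a file, fgets reads one line, and fputs (an alias of fwrite) writes a string.
Simulating an array push: use the form $arr[] = $something;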
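As a rough illustration of how iconv, preg_match, and the $arr[] push fit together, here is a minimal sketch that runs without a network connection. The sample markup, the class="title" marker, and the function name parse_items are assumptions made for the example. The real page layout is not shown in this post.

```php
<?php
// Hypothetical markup: a title line followed by a link line.
$page = array(
    '<span class="title">英语考试报名</span>',
    '<a href="/appendix/exam.html">详情</a>',
);

// Simulate the wire format: the real page arrives as GBK bytes.
$gbkLines = array_map(function ($line) {
    return iconv('UTF-8', 'GBK', $line);
}, $page);

// Convert each GBK line to UTF-8, then pull out title/URL pairs.
function parse_items(array $gbkLines)
{
    $items = array();
    $count = count($gbkLines);
    for ($i = 0; $i + 1 < $count; $i++) {
        $line = iconv('GBK', 'UTF-8', $gbkLines[$i]);
        // Assumed marker that identifies a title line.
        if (strpos($line, 'class="title"') === false) {
            continue;
        }
        // The text between the first '>' and the next '<' is the title.
        if (!preg_match('/>([^<]*)</', $line, $m)) {
            continue;
        }
        $title = $m[1];
        // The link is expected on the following line.
        $next = iconv('GBK', 'UTF-8', $gbkLines[$i + 1]);
        if (preg_match('/href="([^"]*)"/', $next, $m)) {
            // $arr[] = value appends, like a push.
            $items[] = array('title' => $title, 'url' => $m[1]);
        }
    }
    return $items;
}

$items = parse_items($gbkLines);
print_r($items);
```

Testing strpos with === false rather than > 0 also catches a marker that sits at the very start of a line.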
vim code formatting: gg=G (typed in normal mode, without a leading colon)
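vim batch commenting: add // comments with :10,50s#^#//#g and remove them with :10,50s#^//##g
Scheduling: run crontab -e to open the cron table, then add an entry that sends the notification at 10:00 every day:

    0 10 * * * /path/to/php /path/to/spta.php

spta.php

    <?php
    // Fetch the registration page and return a list of
    // array('title' => ..., 'url' => ...) entries, newest first.
    function get_spta() {
        $items = array();
        $fp = fsockopen('www.spta.gov.cn', 80);
        if (!$fp) {
            return $items;
        }
        // A GET request carries no body, so Content-Type and
        // Content-Length headers are not needed.
        fwrite($fp, "GET /appendix/wsbm.html HTTP/1.0\r\n");
        fwrite($fp, "Host: www.spta.gov.cn\r\n");
        fwrite($fp, "\r\n");
        while (!feof($fp)) {
            $line = fgets($fp);
            if ($line === false) {
                break;
            }
            // The page is GBK-encoded; convert each line to UTF-8.
            $result = iconv('GBK', 'UTF-8', $line);
            // '...' stands for the marker string that identifies a title
            // line; it did not survive in the source post.
            if (strpos($result, '...') !== false) {
                preg_match('/>([^<]*)</', $result, $matches);
                $title = $matches[1];
                // The link sits on the line after the title.
                $url = iconv('GBK', 'UTF-8', fgets($fp));
                preg_match('/href="([^"]*)"/', $url, $matches);
                $url = $matches[1];
                $items[] = array('title' => $title, 'url' => $url);
            }
        }
        fclose($fp);
        return $items;
    }

    // Send one SMS through the gateway, which expects GBK data.
    function sent_sms($mobile, $msg) {
        $vars = 'mobs=' . urlencode($mobile)
              . '&msg=' . urlencode(iconv('UTF-8', 'GBK', $msg));
        $fp = fsockopen('smsserver.com', 80);
        if (!$fp) {
            return;
        }
        fwrite($fp, "GET /sms?$vars HTTP/1.0\r\n");
        fwrite($fp, "Host: smsserver.com\r\n");
        fwrite($fp, "\r\n");
        fclose($fp);
    }

    $items = get_spta();
    $title = '';
    $path = '/path/to/spta';
    // The state file holds the newest title seen on the previous run.
    if (file_exists($path)) {
        $file = fopen($path, 'r');
        $title = fgets($file);
        fclose($file);
    }
    // Notify for every item listed above the last-seen title.
    foreach ($items as $bean) {
        if ($title != $bean['title']) {
            sent_sms('your mobile', $bean['title'] . '[考试院]');
            echo $bean['title'] . "\n";
        } else {
            break;
        }
    }
    // Remember the newest title for the next run.
    if (!empty($items)) {
        $file = fopen($path, 'w');
        fputs($file, $items[0]['title']);
        fclose($file);
    }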
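The comparison at the end of spta.php can be pulled into a small function and tried out on its own. The following is only a sketch: the function name new_titles, the sample titles, and the temporary state file are all made up for illustration, and it assumes the page lists items newest first.

```php
<?php
// Return the titles that appear before $lastSeen in a newest-first list.
// If $lastSeen is empty or not found, every title counts as new.
function new_titles(array $items, $lastSeen)
{
    $fresh = array();
    foreach ($items as $item) {
        if ($item['title'] === $lastSeen) {
            break; // everything from here on was already reported
        }
        $fresh[] = $item['title'];
    }
    return $fresh;
}

// Hypothetical sample data, newest first.
$items = array(
    array('title' => 'Notice D', 'url' => '/d.html'),
    array('title' => 'Notice C', 'url' => '/c.html'),
    array('title' => 'Notice B', 'url' => '/b.html'),
    array('title' => 'Notice A', 'url' => '/a.html'),
);

// Stand-in for the state file written by the previous run.
$state = tempnam(sys_get_temp_dir(), 'spta');
file_put_contents($state, 'Notice B');

// Read the last-seen title back with fopen/fgets, as the script does.
$fh = fopen($state, 'r');
$lastSeen = rtrim((string) fgets($fh));
fclose($fh);
unlink($state);

$fresh = new_titles($items, $lastSeen);
print_r($fresh);
```

With this state, only the two notices listed above "Notice B" are reported. On a first run, when there is no state file yet, passing an empty string reports every item.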