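Sometimes we need to grab a web page's title and content. This is basically a small scraping routine, shared here for anyone who needs it. The code is as follows:

    $c = curl_init();
    $url = 'blog.0735dj.com';
    curl_setopt($c, CURLOPT_URL, $url);
    curl_setopt($c, CURLOPT_RETURNTRANSFER, 1); // return the page as a string instead of printing it
    $data = curl_exec($c);
    curl_close($c);
    // If the page does not mention utf-8 anywhere, assume it is GBK and convert it
    $pos = stripos($data, 'utf-8');
    if ($pos === false) { $data = iconv("gbk", "utf-8", $data); }
    preg_match("/<title>(.*?)<\/title>/is", $data, $stitle);
    // eregi() was removed in PHP 7; preg_match() with the i and s flags does the same job
    preg_match("/<\/head>(.*)<\/body>/is", $data, $sbody);
    $title = $stitle[1]; // the page title
    $slbody = preg_replace(array("'<style(.*?)>(.*?)</style>'is", "'<script(.*?)>(.*?)</script>'is", "/<\/?[^>]+>/i"), '', $sbody[1]); // the plain-text content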
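The same extraction steps can also be wrapped in a small function that works on an HTML string, which makes them easy to try without a network request. This is only a rough sketch: the function name `extract_title_and_text` and the sample HTML are made up for illustration and are not part of the original snippet, and regex-based stripping like this is only suitable for simple pages.

```php
<?php
// Hypothetical helper (not from the original post): applies the same
// regex-based approach to an HTML string that has already been fetched.
function extract_title_and_text(string $html): array
{
    // Title: lazy match so only the first <title>...</title> pair is taken
    $title = '';
    if (preg_match('/<title[^>]*>(.*?)<\/title>/is', $html, $m)) {
        $title = trim($m[1]);
    }

    // Body: everything between </head> and </body>
    $body = '';
    if (preg_match('/<\/head>(.*)<\/body>/is', $html, $m)) {
        $body = $m[1];
    }

    // Drop <style> and <script> blocks first, then any remaining tags
    $text = preg_replace(
        array('/<style.*?>.*?<\/style>/is', '/<script.*?>.*?<\/script>/is', '/<\/?[^>]+>/'),
        '',
        $body
    );

    return array('title' => $title, 'text' => trim($text));
}

// Made-up sample page, for illustration only
$sample = '<html><head><title>Hello</title></head>'
        . '<body><style>p{color:red}</style><p>World</p><script>var a=1;</script></body></html>';
$result = extract_title_and_text($sample);
echo $result['title'], "\n"; // prints Hello
echo $result['text'], "\n";  // prints World
```

Checking the return value of `preg_match()` before reading `$m[1]` avoids an undefined-index notice when a page has no `<title>` or no `</head>` ... `</body>` pair.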