8. Examples

Exercise 1: Scraping a JD.com product page

import requests

url = "https://item.jd.com/5706771.html"
try:
    r = requests.get(url)
    r.raise_for_status()              # raise an exception for 4xx/5xx responses
    r.encoding = r.apparent_encoding  # use the encoding guessed from the page content
    print(r.text[:1000])              # show only the first 1000 characters
except requests.RequestException:
    print("Error")

# Output:
<!DOCTYPE HTML>
<html lang="zh-CN">
<head>
 <!-- shouji -->
 <meta http-equiv="Content-Type" content="text/html; charset=gbk" />
 <title>【華為Mate 10 Pro】華為 HUAWEI Mate 10 Pro 全網通 6GB+64GB 銀鑽灰 移動聯通電信4G手機 雙卡雙待【行情 報價 價格 評測】-京東</title>
 <meta name="keywords" content="HUAWEIMate 10 Pro,華為Mate 10 Pro,華為Mate 10 Pro報價,HUAWEIMate 10 Pro報價"/>
 <meta name="description" content="【華為Mate 10 Pro】京東JD.COM提供華為Mate 10 Pro正品行貨,並包括HUAWEIMate 10 Pro網購指南,以及華為Mate 10 Pro圖片、Mate 10 Pro參數、Mate 10 Pro評論、Mate 10 Pro心得、Mate 10 Pro技巧等信息,網購華為Mate 10 Pro上京東,放心又輕鬆" />
 <meta name="format-detection" content="telephone=no">
 <meta http-equiv="mobile-agent" content="format=xhtml; url=//item.m.jd.com/product/5706771.html">
 <meta http-equiv="mobile-agent" content="format=html5; url=//item.m.jd.com/product/5706771.html">
 <meta http-equiv="X-UA-Compatible" content="IE=Edge">
 <link rel="canonical" href="//item.jd.com/5706771.html"/>
 <link rel="dns-prefetch" href="//misc.360buyimg.com"
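The same steps can be wrapped in a small reusable function. This is only a sketch: the name get_html_text and the 10-second timeout are assumptions for illustration, not part of the original example.

```python
import requests


def get_html_text(url, timeout=10):
    """Return the decoded page text, or None if the request fails.

    The function name and the timeout default are illustrative choices.
    """
    try:
        r = requests.get(url, timeout=timeout)
        r.raise_for_status()              # raise HTTPError for 4xx/5xx responses
        r.encoding = r.apparent_encoding  # guess the encoding from the body
        return r.text
    except requests.RequestException:
        # Covers connection errors, timeouts, bad URLs and HTTP error statuses.
        return None


# A URL without a scheme is rejected before any network traffic is sent,
# so this call returns None even when offline.
result = get_html_text("not-a-url")
print(result)
```

Returning None instead of printing keeps the error handling in one place; the caller decides what to do when a page cannot be fetched.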


Exercise 2: Scraping an Amazon product page

Change the User-Agent header; see the course for details.
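A minimal sketch of how the User-Agent might be overridden with the headers argument of requests. By default requests identifies itself as python-requests/<version>, which some sites may reject. The User-Agent string, the helper names, and the URL below are placeholders chosen for illustration; the original does not specify them.

```python
import requests

# Placeholder browser-like identifier (an assumption, not from the course notes).
BROWSER_UA = "Mozilla/5.0"


def prepare_get(url, user_agent=BROWSER_UA):
    """Build, but do not send, a GET request carrying a custom User-Agent."""
    req = requests.Request("GET", url, headers={"User-Agent": user_agent})
    return req.prepare()


def fetch(url, user_agent=BROWSER_UA, timeout=10):
    """Send the request and return the first 1000 characters of the page."""
    r = requests.get(url, headers={"User-Agent": user_agent}, timeout=timeout)
    r.raise_for_status()
    r.encoding = r.apparent_encoding
    return r.text[:1000]


# Inspect the header that would be sent, without touching the network.
# The URL is a hypothetical stand-in for a real product page.
prepared = prepare_get("https://www.amazon.cn/")
print(prepared.headers["User-Agent"])
```

Calling fetch(url) with a real product URL then follows the same pattern as Exercise 1, with the headers argument as the only difference.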


Exercise 3: Submitting search keywords to Baidu/360 (using params)

Baidu's keyword interface:

http://www.baidu.com/s?wd=keyword

360's keyword interface:

http://www.so.com/s?q=keyword

Example:

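import requests

kv = {"wd": "Python"}
r = requests.get("http://www.baidu.com/s?", params=kv)  # builds a URL in the required format
r.status_code          # check the status code (displayed only in an interactive session)
print(r.request.url)   # the URL that was actually requested
print(len(r.text))     # length of the returned page

# Output:
# http://www.baidu.com/s?wd=Python
# 376980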
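The 360 interface can be handled the same way, with q in place of wd. The sketch below only builds the URL, without sending anything, so the effect of params is easy to see; the helper name build_search_url is an assumption for illustration.

```python
import requests


def build_search_url(base_url, param_name, keyword):
    """Return the URL that requests would generate for the given keyword."""
    req = requests.Request("GET", base_url, params={param_name: keyword})
    return req.prepare().url


baidu_url = build_search_url("http://www.baidu.com/s", "wd", "Python")
so_url = build_search_url("http://www.so.com/s", "q", "Python")
print(baidu_url)  # http://www.baidu.com/s?wd=Python
print(so_url)     # http://www.so.com/s?q=Python
```

To actually submit the search, pass the same dictionary to requests.get, for example requests.get("http://www.so.com/s", params={"q": "Python"}).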
