My web scraper keeps returning garbled Chinese text. Could someone help me fix it, or at least point out where the problem is?

05-24

Version: Python 3.x
Operating system: Windows 7
Editor: PyCharm

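Page being scraped: a Ctrip page (Seoul, South Korea 6-day/5-night semi-guided tour · direct flight + ski resort or Nami Island + Lotte World + 1 free day - [Ctrip Travel])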
Code:

    #!/usr/bin/env python3
    # -*- coding: utf-8 -*-
    from urllib.request import urlopen
    from urllib.error import HTTPError
    from bs4 import BeautifulSoup


    def getComment(url):
        try:
            html = urlopen(url)
        except HTTPError as e:
            # The page was not found on the server: return None
            return None
        try:
            soup = BeautifulSoup(html.read(), "lxml")
            comment = soup.body.find("ul", {"class": "detail_comment_list"}).find("li")
        except AttributeError as e:
            # A tag was missing, so one of the lookups above returned None
            return None
        return comment


    comment = getComment("http://vacations.ctrip.com/grouptravel/p11504202s32.html#ctm_ref=va_hom_s32_prd_p1_l2_2_img")
    if comment is None:
        print("comment could not be found")
    else:
        comment1 = comment.get_text()
        print(comment1)
Output:

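I have read a lot of material online and tried decoding and encoding back and forth with no luck, probably because my fundamentals are weak. Could the experts on Zhihu help me solve this, or at least show me where the problem is?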

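You posted a screenshot instead of the source code as text. Do you expect people to retype it from the image before they can debug it for you?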

Paste the source code as text and I'll reply tomorrow morning.

Got it working in Python 3 by switching to requests.
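A minimal sketch of what the switch to requests might look like. The helper names `fix_encoding` and `fetch_text` are hypothetical, and the sketch assumes the garbling comes from a missing or wrong charset in the response headers, which has not been verified against the Ctrip page.

```python
import requests


def fix_encoding(resp):
    """Pick a better text encoding when the server did not declare one."""
    # For text/* responses without a charset, requests falls back to
    # ISO-8859-1, which garbles Chinese. In that case, guess from the body.
    if resp.encoding is None or resp.encoding.lower() == "iso-8859-1":
        resp.encoding = resp.apparent_encoding
    return resp


def fetch_text(url, timeout=10):
    """Download a page and return its body as str (hypothetical helper)."""
    resp = requests.get(url, timeout=timeout)
    resp.raise_for_status()  # raise on 4xx/5xx instead of parsing an error page
    return fix_encoding(resp).text
```

The string returned by `fetch_text` can then be passed to `BeautifulSoup(text, "lxml")` in place of `html.read()`.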

With the Python 3 standard library, it goes like this:
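A minimal sketch of a standard-library-only approach, assuming the garbling comes from decoding the response bytes with the wrong charset. The helper names `decode_body` and `fetch_text` and the GB18030 fallback are assumptions for illustration, not something confirmed for this page.

```python
from urllib.request import urlopen


def decode_body(raw, declared=None, fallbacks=("utf-8", "gb18030")):
    """Decode response bytes, trying the declared charset first."""
    candidates = ([declared] if declared else []) + list(fallbacks)
    for name in candidates:
        try:
            return raw.decode(name)
        except (UnicodeDecodeError, LookupError):
            continue  # wrong or unknown charset, try the next one
    # Last resort: decode anyway and mark undecodable bytes.
    return raw.decode("utf-8", errors="replace")


def fetch_text(url):
    """Fetch a page with urllib and return it as str."""
    with urlopen(url) as resp:
        # Charset from the Content-Type header, or None if it is absent.
        declared = resp.headers.get_content_charset()
        return decode_body(resp.read(), declared)
```

Decoding explicitly like this means BeautifulSoup receives a `str` and does not have to guess the encoding itself.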

It's an encoding problem.
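To make the failure mode concrete, here is a small sketch of how garbled text typically arises: bytes written in one encoding are read back as another. The sample string and the UTF-8/GBK pairing are assumptions for illustration; the encodings actually involved in the question are not known.

```python
text = "携程旅游"

# str -> bytes, as the page would travel over the network.
raw = text.encode("utf-8")

# Decoding with the wrong codec does not always raise an error; it can
# silently produce unrelated characters (mojibake).
garbled = raw.decode("gbk", errors="replace")

# Decoding with the codec that was used for encoding restores the text.
restored = raw.decode("utf-8")

print(garbled == text)   # False
print(restored == text)  # True
```

The fix is therefore to find out which encoding the bytes really use and decode exactly once with that codec, rather than chaining `encode` and `decode` calls.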

It's better to paste the source code when you ask a question; that makes things easier for everyone.

requests: an all-natural, non-GMO module, an HTTP module truly fit for humans.

Not using requests may lead to wheel-reinvention syndrome, endless debugging, encoding errors, mental breakdown, and so on.

(From the official requests website.)

I recommend using the requests library; it spares you a lot of encoding headaches.