Python Visualization and Data Maps with basemap

From the column R語言數據分析與可視化 (R Data Analysis and Visualization)

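I have recently been surveying the Python visualization packages that can produce data maps, trying out geopandas, folium, and Basemap in turn. Comparing them, the most mature option for static maps is still Basemap, an extension library in the mpl_toolkits package dedicated to visualizing geographic data. Basemap is more mature than geopandas at reading and writing geographic data, coordinate mapping, and coordinate transformation and projection. It can use ordinary map source data (shp) as a base layer for overlay plotting, gives convenient control over appearance and precision, and produces charts comparable in quality to those of R's ggplot2 package (geom_polygon). Its one drawback is that it is a low-level construction tool: every polygon mapping needs a hand-written loop (I have not yet found a good extension built on basemap), so in plotting efficiency and speed it cannot match R's ggplot2, because it lacks a complete high-level grammar.

Over the next 3 to 5 posts I will share application scenarios for the basemap package, covering scatter plots (bubble charts), line charts (path maps and other line types), and the most commonly used filled heat maps. This section covers filled maps and scatter plots; the example uses my own WeChat friend data, fetched through the itchat interface.

    import re
    import itchat
    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    import matplotlib
    from matplotlib.patches import Polygon
    from mpl_toolkits.basemap import Basemap
    from matplotlib.collections import PatchCollection

1. Log in to WeChat web:

    itchat.login()
    # Scan the QR code that pops up with the WeChat mobile app to log in.
    # Getting uuid of QR code.
    # Downloading QR code.
    # Please scan the QR code to log in.
    # Please press confirm on your phone.
    # Loading the contact, this may take a little while.
    # Login successfully as 杜雨

    # Extract the WeChat friend information:
    friends = itchat.get_friends(update=True)
    df_friends = pd.DataFrame(friends)
    df_friends.to_csv("wechat_friends.csv", encoding="utf_8_sig")
    # df_friends = pd.read_csv("D:/Python/File/wechat_friends.csv")
    mydata = df_friends.loc[:, ["NickName", "Province", "Signature"]]

2. Aggregate the regional distribution of friends:

    # Count friends per province and sort in descending order.
    aggResult = mydata.groupby("Province")["NickName"].size().reset_index(name="人數")
    aggResult.sort_values(by="人數", ascending=False, inplace=True)

    # Split domestic provinces from foreign regions:
    def match_str(item):
        result = []
        for i in item:
            try:
                # Keep only names that start with Chinese characters.
                m = re.search("^[\u4e00-\u9fa5]{1,}", i).group()
                result.append(m)
            except (AttributeError, TypeError):
                # No match (None has no .group()) or a non-string value.
                continue
        return result

    Domestic = match_str(aggResult["Province"].tolist())
    Domestic = aggResult.loc[aggResult.Province.isin(Domestic), :].copy()
    Foreign = aggResult.loc[~aggResult.Province.isin(Domestic.Province), :]
    # Min-max scale the friend counts to [0, 1].
    Domestic["scala"] = (Domestic.人數 - Domestic.人數.min()) / (Domestic.人數.max() - Domestic.人數.min())

Clean and correct the province (region) names:

    def correct(name_list):
        name = []
        for i in name_list:
            if i in ["內蒙古", "西藏"]:
                i += "自治區"
            elif i == "寧夏":
                i += "回族自治區"
            elif i == "新疆":
                i += "維吾爾族自治區"
            elif i == "廣西":
                i += "壯族自治區"
            elif i in ["香港", "澳門", "台灣"]:
                i += "特別行政區"
            elif i in ["北京", "天津", "重慶", "上海"]:
                i += "市"
            else:
                i += "省"
            name.append(i)
        return name

    Domestic["Province"] = correct(Domestic["Province"])

3. Merge in the local longitude and latitude data:

    # Data source for the scatter plot:
    point_data = pd.read_csv("D:/R/rstudy/Province/chinaprovincecity.csv", encoding="gbk")
    Domestic = Domestic.merge(point_data.loc[:, ["province", "jd", "wd"]], how="left", left_on="Province", right_on="province")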
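The aggregation, domestic/foreign split, and scaling in step 2 can be tried without logging in to WeChat. The following is a minimal sketch in plain Python, not the pandas code used in this post: the helper names (`count_by_province`, `split_domestic`, `min_max_scale`) and the sample province list are made up for illustration.

```python
import re
from collections import Counter

# Matches names that begin with at least one CJK character (same range as the post).
CHINESE_PREFIX = re.compile(r"^[\u4e00-\u9fa5]+")


def count_by_province(provinces):
    """Count how many friends report each province value."""
    return Counter(provinces)


def split_domestic(counts):
    """Split counts into (domestic, foreign) by whether the name starts with Chinese text."""
    domestic, foreign = {}, {}
    for name, n in counts.items():
        is_chinese = isinstance(name, str) and CHINESE_PREFIX.search(name)
        (domestic if is_chinese else foreign)[name] = n
    return domestic, foreign


def min_max_scale(counts):
    """Rescale counts to [0, 1]; all values map to 0.0 when they are identical."""
    if not counts:
        return {}
    lo, hi = min(counts.values()), max(counts.values())
    span = hi - lo
    return {k: (v - lo) / span if span else 0.0 for k, v in counts.items()}


# Hypothetical sample data standing in for the "Province" column.
sample = ["北京", "北京", "北京", "山東", "山東", "河南", "Texas", ""]
domestic, foreign = split_domestic(count_by_province(sample))
scaled = min_max_scale(domestic)
```

The province with the fewest friends always scales to 0, so in the filled map it is drawn fully transparent; that is a property of min-max scaling, not a plotting error.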

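Instantiate the map object and load the local shp map of China:

    # `ax` is the Matplotlib axes created in step 4
    # (fig = plt.figure(...); ax = fig.add_subplot(111)); create it before running this step.
    basemap = Basemap(llcrnrlon=75, llcrnrlat=10, urcrnrlon=150, urcrnrlat=55, projection="poly", lon_0=116.65, lat_0=40.02, ax=ax)
    basemap.readshapefile(shapefile="D:/R/rstudy/CHN_adm/bou2_4p", name="china")

Many of the administrative division names in the imported shp map are garbled, so the encoding has to be corrected:

    mapData = pd.DataFrame(basemap.china_info)
    # Decode only byte strings; values that are already text are left untouched.
    mapData["NAME"] = mapData["NAME"].map(lambda x: x.decode("gbk") if isinstance(x, bytes) else x)
    # mapData["NAME"] = [i.decode("gbk") if isinstance(i, bytes) else i for i in mapData["NAME"].tolist()]
    mapData = mapData.merge(Domestic, how="left", left_on="NAME", right_on="Province")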
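To see why the names need decoding, here is a small self-contained sketch of the GBK round trip. The helper `decode_name` is hypothetical and only mirrors the lambda used on the `NAME` column; the byte string is built in the example itself, not read from the shapefile.

```python
def decode_name(raw, encoding="gbk"):
    """Return text for a shapefile attribute that may arrive as bytes or as str."""
    if isinstance(raw, bytes):
        # errors="replace" keeps the pipeline running if a record is damaged.
        return raw.decode(encoding, errors="replace")
    return raw


# A province name as it would be stored in a GBK-encoded attribute table.
raw_name = "北京市".encode("gbk")

decoded = decode_name(raw_name)     # bytes are decoded to text
unchanged = decode_name("上海市")    # text passes through untouched
empty = decode_name(b"")            # empty records become an empty string
```

Reading the same bytes as UTF-8 fails, which is the kind of mismatch that shows up as garbled or unreadable names after the shapefile is loaded.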

4. Data visualization:

    font = {"family": "SimHei"}
    matplotlib.rc("font", **font)
    # The figure and axes must exist before the Basemap object is instantiated with ax=ax.
    fig = plt.figure(figsize=(16, 12))
    ax = fig.add_subplot(111)

    # Province fill function (opacity follows each province's scaled friend count):
    def plotProvince(row):
        mainColor = (42/256, 87/256, 141/256, row["scala"])
        patches = []
        # A province can consist of several polygons; collect every shape whose name matches.
        for info, shape in zip(mapData["NAME"].tolist(), basemap.china):
            if info == row["Province"]:
                patches.append(Polygon(xy=np.array(shape), closed=True))
        ax.add_collection(PatchCollection(patches, facecolor=mainColor, edgecolor=mainColor, linewidths=1., zorder=2))

    Domestic.apply(lambda row: plotProvince(row), axis=1)

    # Scatter plot (bubble size based on each province's friend count):
    def create_great_points(df):
        lon = np.array(df["jd"])
        lat = np.array(df["wd"])
        pop = np.array(df["scala"], dtype=float)
        # Project longitude/latitude into map coordinates.
        x, y = basemap(lon, lat)
        for px, py, size in zip(x, y, pop * 50):
            basemap.scatter(px, py, color="#c72e29", marker="o", s=size * 25)

    create_great_points(Domestic)
    plt.axis("off")  # hide the axes
    plt.savefig("D:/Python/Image/杜雨/itwechat.png")  # save the chart locally
    plt.show()  # display the chart
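The fill function above rescans every shape once per province. One possible alternative, sketched below under stated assumptions, is to group the shapes by name in a single pass and look them up per province. `group_shapes_by_name` and `fill_color` are hypothetical helpers, and the names and coordinates are toy values standing in for `mapData["NAME"]` and `basemap.china`.

```python
from collections import defaultdict

# Base fill color used in the post; only the alpha channel varies by province.
BASE_RGB = (42 / 256, 87 / 256, 141 / 256)


def group_shapes_by_name(names, shapes):
    """Map each region name to the list of polygons (vertex lists) that belong to it."""
    grouped = defaultdict(list)
    for name, shape in zip(names, shapes):
        grouped[name].append(shape)
    return dict(grouped)


def fill_color(scale, rgb=BASE_RGB):
    """Build an RGBA tuple whose alpha is the scaled value clamped to [0, 1]."""
    alpha = min(max(float(scale), 0.0), 1.0)
    return (*rgb, alpha)


# Toy stand-ins: region "A" is made of two polygons, region "B" of one.
names = ["A", "B", "A"]
shapes = [
    [(0, 0), (1, 0), (1, 1)],
    [(2, 2), (3, 2), (3, 3)],
    [(5, 5), (6, 5), (6, 6)],
]
grouped = group_shapes_by_name(names, shapes)
color = fill_color(0.5)
```

With the grouping done once, each province's polygons could be fetched with `grouped.get(name, [])` before building the `PatchCollection`, which avoids the nested loop over all shapes.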

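The bou2_4p.shp and chinaprovincecity.csv files used throughout are the data sources from my earlier R ggplot2 series and are published on GitHub:

github.com/ljtyduyu/Dat

The friends dataset was fetched directly by scanning the QR code to log in with the itchat package. No extra configuration is needed, and the whole process is very simple.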

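Final words

Material on building maps with the basemap package is really scarce. It took me several days of sorting things out and a great many Stack Overflow threads to get this workflow working end to end, so I hope you find it useful. If this is not enough for you, the course I am currently recording, 《R語言商務圖表與數據可視化》 (R Business Charts and Data Visualization), has now been updated through chapter nine, with four full chapters on models, principles, and applications of geographic data visualization. Come and take a look: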

edu.hellobi.com/course/
