利用R語言繪製世界航班路線圖

作者簡介

taoyan:偽碼農,R語言愛好者,愛開源。個人博客: ytlogos.github.io/

往期回顧

R語言可視化學習筆記之相關矩陣可視化包ggcorrplot

R語言學習筆記之相關性矩陣分析及其可視化

ggplot2學習筆記系列之利用ggplot2繪製誤差棒及顯著性標記

ggplot2學習筆記系列之主題(theme)設置

用circlize包繪製circos-plot

一、 簡介

本文基於NASA的夜間地圖(nasa.gov/specials/black)的基礎上進行世界航班路線可視化,參考多篇博客以及可視化案例。

1. 包載入

本博客使用的包較多,利用pacman包里的p_load()函數進行載入

library(pacman)

p_load(tidyverse, data.table, geosphere, grid, jpeg, plyr)

2. 數據準備

使用的數據來自於OpenFlights.orgopenflights.org/data.ht)。

3.數據下載

download.file("raw.githubusercontent.com",destfile = "airlines.dat", mode = "wb")

download.file("raw.githubusercontent.com",destfile = "airports.dat", mode = "wb")

download.file("raw.githubusercontent.com",destfile = "routes.dat", mode = "wb")

4.數據導入

airlines <- fread("airlines.dat", sep = ",", skip = 1)

airports <- fread("airports.dat", sep = ",")

routes <- fread("routes.dat", sep = ",")

5. 數據整理

#添加列名colnames(airlines) <- c("airline_id", "name", "alias", "iata", "icao", "callisign", "country", "active")

colnames(airports) <- c("airport_id", "name", "city", "country","iata", "icao", "latitude", "longitude","altitude", "timezone","dst","tz_database_time_zone","type", "source")

colnames(routes) <- c("airline", "airline_id", "source_airport", "source_airport_id","destination_airport","destination_airport_id","codeshare", "stops","equipment")

#類型轉換

routes$airline_id <- as.numeric(routes$airline_id)

# airlines與routes數據融合

flights <- left_join(routes, airlines, by="airline_id")

# flights與airports數據融合

airports_orig <- airports[,c(5,7,8)]

colnames(airports_orig) <- c("source_airport","source_airport_lat", "source_airport_long")

airports_dest <- airports[, c(5, 7, 8)]

colnames(airports_dest) <- c("destination_airport", "destination_airport_lat", "destination_airport_long")

flights <- left_join(flights, airports_orig, by = "source_airport")

flights <- left_join(flights, airports_dest, by = "destination_airport")

#剔除缺失值

flights <- na.omit(flights, cols = c("source_airport_long", "source_airport_lat", "destination_airport_long", "destination_airport_lat"))

#最後數據如下

head(flights[,c(1:5)])

6. 下面就是準備地理信息數據

本文主要是可視化地理信息上的點與點之間的連接,這可以通過geosphere包里的函數gcIntermediate()很輕鬆實現。具體使用方法可以參考這裡(flowingdata.com/2011/05)。

# 按航空公司拆分數據集

flights_split <- split(flights, flights$name)

# Calculate intermediate points between each two locations

flights_all <- lapply(flights_split, function(x) gcIntermediate(x[, c("source_airport_long", "source_airport_lat")], x[, c("destination_airport_long", "destination_airport_lat")], n=100, breakAtDateLine = FALSE, addStartEnd = TRUE, sp = TRUE))

# 轉換為數據框

flights_fortified <- lapply(flights_all, function(x) ldply(x@lines, fortify))

# Unsplit lists

flights_fortified <- do.call("rbind", flights_fortified)

# Add and clean column with airline names

flights_fortified$name <- rownames(flights_fortified)

flights_fortified$name <- gsub("\..*", "", flights_fortified$name)

# Extract first and last observations for plotting source and destination points (i.e., airports)

flights_points <- flights_fortified %>%

group_by(group) %>%

filter(row_number() == 1 | row_number() == n())

二、 可視化

接下來就是進行可視化了,前面講了我們只是在NASA提供的夜間地球圖上面進行數據映射,所以第一我們需要獲取該背景地圖。

1. 圖片獲取並渲染

#下載圖片

download.file("nasa.gov/specials/black",destfile = "BlackMarble_2016_01deg.jpg", mode = "wb")

#載入並渲染圖片

earth <- readJPEG("BlackMarble_2016_01deg.jpg", native = TRUE)

earth <- rasterGrob(earth, interpolate = TRUE)

2. 數據映射

由於航空公司十分多,就挑選幾個有名的航空公司進行可視化。

(1) Lufthansa(德國漢莎航空公司)

ggplot() +

annotation_custom(earth, xmin = -180, xmax = 180, ymin = -90, ymax = 90) +

geom_path(aes(long, lat, group = id, color = name), alpha = 0.0, size = 0.0, data = flights_fortified) +

geom_path(aes(long, lat, group = id, color = name), alpha = 0.2, size = 0.3, color = "#f9ba00", data = flights_fortified[flights_fortified$name == "Lufthansa", ]) +

geom_point(data = flights_points[flights_points$name == "Lufthansa", ], aes(long, lat), alpha = 0.8, size = 0.1, colour = "white") +

theme(panel.background = element_rect(fill = "#05050f", colour = "#05050f"),panel.grid.major = element_blank(),panel.grid.minor = element_blank(),axis.title = element_blank(),axis.text = element_blank(),axis.ticks.length = unit(0, "cm"),legend.position = "none") +annotate("text", x = -150, y = -18, hjust = 0, size = 14,label = paste("Lufthansa"), color = "#f9ba00", family = "Helvetica Black") +

annotate("text", x = -150, y = -26, hjust = 0, size = 8,

label = paste("Flight routes"), color = "white") +

annotate("text", x = -150, y = -30, hjust = 0, size = 7,

label = paste("ytlogos.github.io || NASA.gov || OpenFlights.org"), color = "white", alpha = 0.5) +

coord_equal()

(2) Emirates(阿聯酋航空公司)

ggplot() +

annotation_custom(earth, xmin = -180, xmax = 180, ymin = -90, ymax = 90) +

geom_path(aes(long, lat, group = id, color = name), alpha = 0.0, size = 0.0, data = flights_fortified) +

geom_path(aes(long, lat, group = id, color = name), alpha = 0.2, size = 0.3, color = "#ff0000", data = flights_fortified[flights_fortified$name == "Emirates", ]) +

geom_point(data = flights_points[flights_points$name == "Emirates", ], aes(long, lat), alpha = 0.8, size = 0.1, colour = "white") +

theme(panel.background = element_rect(fill = "#05050f", colour = "#05050f"),

panel.grid.major = element_blank(),

panel.grid.minor = element_blank(),

axis.title = element_blank(),

axis.text = element_blank(),

axis.ticks.length = unit(0, "cm"),

legend.position = "none") +

annotate("text", x = -150, y = -18, hjust = 0, size = 14,

label = paste("Emirates"), color = "#ff0000", family = "Fontin") +

annotate("text", x = -150, y = -26, hjust = 0, size = 8,

label = paste("Flight routes"), color = "white") +

annotate("text", x = -150, y = -30, hjust = 0, size = 7,

label = paste("ytlogos.github.io || NASA.gov || OpenFlights.org"), color = "white", alpha = 0.5) +

coord_equal()

(3) British Airways(英國航空公司)

ggplot() +

annotation_custom(earth, xmin = -180, xmax = 180, ymin = -90, ymax = 90) +

geom_path(aes(long, lat, group = id, color = name), alpha = 0.0, size = 0.0, data = flights_fortified) +

geom_path(aes(long, lat, group = id, color = name), alpha = 0.2, size = 0.3, color = "#075aaa", data = flights_fortified[flights_fortified$name == "British Airways", ]) +

geom_point(data = flights_points[flights_points$name == "British Airways", ], aes(long, lat), alpha = 0.8, size = 0.1, colour = "white") +

theme(panel.background = element_rect(fill = "#05050f", colour = "#05050f"),

panel.grid.major = element_blank(),

panel.grid.minor = element_blank(),

axis.title = element_blank(),

axis.text = element_blank(),

axis.ticks.length = unit(0, "cm"),

legend.position = "none") +

annotate("text", x = -150, y = -18, hjust = 0, size = 14,

label = paste("BRITISH AIRWAYS"), color = "#075aaa", family = "Baker Signet Std") +

annotate("text", x = -150, y = -26, hjust = 0, size = 8,

label = paste("Flight routes"), color = "white") +

annotate("text", x = -150, y = -30, hjust = 0, size = 7,

label = paste("ytlogos.github.io || NASA.gov || OpenFlights.org"), color = "white", alpha = 0.5) +

coord_equal()

(4) Air China(中國國航)

ggplot() +

annotation_custom(earth, xmin = -180, xmax = 180, ymin = -90, ymax = 90) +

geom_path(aes(long, lat, group = id, color = name), alpha = 0.0, size = 0.0, data = flights_fortified) +

geom_path(aes(long, lat, group = id, color = name), alpha = 0.2, size = 0.3, color = "#F70C15", data = flights_fortified[flights_fortified$name == "Air China", ]) +

geom_point(data = flights_points[flights_points$name == "Air China", ], aes(long, lat), alpha = 0.8, size = 0.1, colour = "white") +

theme(panel.background = element_rect(fill = "#05050f", colour = "#05050f"),

panel.grid.major = element_blank(),

panel.grid.minor = element_blank(),

axis.title = element_blank(),

axis.text = element_blank(),

axis.ticks.length = unit(0, "cm"),

legend.position = "none") +

annotate("text", x = -150, y = -18, hjust = 0, size = 14,

label = paste("Air China"), color = "#F70C15", family = "Times New Roman") +

annotate("text", x = -150, y = -26, hjust = 0, size = 8,

label = paste("Flight routes"), color = "white") +

annotate("text", x = -150, y = -30, hjust = 0, size = 7,

label = paste("ytlogos.github.io || NASA.gov || OpenFlights.org"), color = "white", alpha = 0.5) +

coord_equal()

(5) China Southern Airlines(中國南航)

ggplot() +

annotation_custom(earth, xmin = -180, xmax = 180, ymin = -90, ymax = 90) +

geom_path(aes(long, lat, group = id, color = name), alpha = 0.0, size = 0.0, data = flights_fortified) +

geom_path(aes(long, lat, group = id, color = name), alpha = 0.2, size = 0.3, color = "#004D9D", data = flights_fortified[flights_fortified$name == "China Southern Airlines", ]) +

geom_point(data = flights_points[flights_points$name == "China Southern Airlines", ], aes(long, lat), alpha = 0.8, size = 0.1, colour = "white") +

theme(panel.background = element_rect(fill = "#05050f", colour = "#05050f"),

panel.grid.major = element_blank(),

panel.grid.minor = element_blank(),

axis.title = element_blank(),

axis.text = element_blank(),

axis.ticks.length = unit(0, "cm"),

legend.position = "none") +

annotate("text", x = -150, y = -18, hjust = 0, size = 14,

label = paste("China Southern Airlines"), color = "#004D9D", family = "Times New Roman") +

annotate("text", x = -150, y = -26, hjust = 0, size = 8,

label = paste("Flight routes"), color = "white") +

annotate("text", x = -150, y = -30, hjust = 0, size = 7,

label = paste("ytlogos.github.io || NASA.gov || OpenFlights.org"), color = "white", alpha = 0.5) +

coord_equal()

(6) 一次性映射多家航空公司航行路線

#抽取數據集

flights_subset <- c("Lufthansa", "Emirates", "British Airways")

flights_subset <- flights_fortified[flights_fortified$name %in% flights_subset, ]

flights_subset_points <- flights_subset%>%

group_by(group)%>%

filter(row_number()==1|row_number()==n())

#可視化

ggplot() +

annotation_custom(earth, xmin = -180, xmax = 180, ymin = -90, ymax = 90) +

geom_path(aes(long, lat, group = id, color = name), alpha = 0.2, size = 0.3, data = flights_subset) +

geom_point(data = flights_subset_points, aes(long, lat), alpha = 0.8, size = 0.1, colour = "white") +

scale_color_manual(values = c("#f9ba00", "#ff0000", "#075aaa")) +

theme(panel.background = element_rect(fill = "#05050f", colour = "#05050f"),

panel.grid.major = element_blank(),

panel.grid.minor = element_blank(),

axis.title = element_blank(),

axis.text = element_blank(),

axis.ticks.length = unit(0, "cm"),

legend.position = "none") +

annotate("text", x = -150, y = -4, hjust = 0, size = 14,

label = paste("Lufthansa"), color = "#f9ba00", family = "Helvetica Black") +

annotate("text", x = -150, y = -11, hjust = 0, size = 14,

label = paste("Emirates"), color = "#ff0000", family = "Fontin") +

annotate("text", x = -150, y = -18, hjust = 0, size = 14,

label = paste("BRITISH AIRWAYS"), color = "#075aaa", family = "Baker Signet Std") +

annotate("text", x = -150, y = -30, hjust = 0, size = 8,

label = paste("Flight routes"), color = "white") +

annotate("text", x = -150, y = -34, hjust = 0, size = 7,

label = paste("ytlogos.github.io || NASA.gov || OpenFlights.org"), color = "white", alpha = 0.5) +

coord_equal()

三、 SessionInfo

sessionInfo()

R version 3.4.3 (2017-11-30)

Platform: x86_64-w64-mingw32/x64 (64-bit)

Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:

[1] LC_COLLATE=Chinese (Simplified)_China.936 LC_CTYPE=Chinese (Simplified)_China.936

[3] LC_MONETARY=Chinese (Simplified)_China.936 LC_NUMERIC=C

[5] LC_TIME=Chinese (Simplified)_China.936

attached base packages:

[1] grid stats graphics grDevices utils datasets methods base

other attached packages:

[1] plyr_1.8.4 jpeg_0.1-8 geosphere_1.5-7 data.table_1.10.4-3

[5] forcats_0.2.0 stringr_1.2.0 dplyr_0.7.4 purrr_0.2.4

[9] readr_1.1.1 tidyr_0.8.0 tibble_1.4.2 ggplot2_2.2.1.9000

[13] tidyverse_1.2.1 pacman_0.4.6

loaded via a namespace (and not attached):

[1] Rcpp_0.12.15 cellranger_1.1.0 pillar_1.1.0 compiler_3.4.3 bindr_0.1

[6] tools_3.4.3 lubridate_1.7.1 jsonlite_1.5 nlme_3.1-131 gtable_0.2.0

[11] lattice_0.20-35 pkgconfig_2.0.1 rlang_0.1.6 psych_1.7.8 cli_1.0.0

[16] rstudioapi_0.7 yaml_2.1.16 parallel_3.4.3 haven_1.1.1 bindrcpp_0.2

[21] xml2_1.2.0 httr_1.3.1 knitr_1.19 hms_0.4.1 glue_1.2.0

[26] R6_2.2.2 readxl_1.0.0 foreign_0.8-69 sp_1.2-7 modelr_0.1.1

[31] reshape2_1.4.3 magrittr_1.5 scales_0.5.0.9000 rvest_0.3.2 assertthat_0.2.0

[36] mnormt_1.5-5 colorspace_1.3-2 stringi_1.1.6 lazyeval_0.2.1 munsell_0.4.3

[41] broom_0.4.3 crayon_1.3.4

往期精彩內容整理合集

2017年R語言發展報告(國內)

R語言中文社區歷史文章整理(作者篇)

R語言中文社區歷史文章整理(類型篇)


推薦閱讀:

藉助WebGL三維可視化技術檢索3D動態圖像
《OurwayBI使用技巧之神奇的聯動與鑽取》精彩回顧
Matplotlib中關於坐標軸的控制
KB01:DataMapA教程-安裝和更新、卸載

TAG:R編程語言 | 數據可視化 | 數據挖掘 |