Can't you do deep learning in R?

01-28

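R is widely regarded as one of the best languages for statistical analysis. With the help of Keras and TensorFlow, R can now be used for deep learning as well.

When it comes to choosing a language for machine learning, R versus Python has long been a contested topic. With the explosive growth of deep learning, though, more and more people chose Python, because it had a large set of deep learning libraries and frameworks while R did not (until now).

I wanted to work on deep learning in R all the same, so I moved from Python over to R to continue my deep learning work. That used to look almost impossible. Today it is possible.

With the launch of Keras in R, the R versus Python contest is back at center stage. Python had gradually become the most popular language for deep learning models. But with the release of the Keras library for R, with TensorFlow running at the backend (CPU and GPU compatible), R is once again on level terms with Python in deep learning.

Below we will see how to install Keras with TensorFlow in R and build our first neural network model on the classic MNIST dataset in RStudio.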

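1. Installing Keras with TensorFlow at the backend.

2. Different types of models that can be built in R using Keras.

3. Classifying MNIST handwritten digits using an MLP in R.

4. Comparing the MNIST results with equivalent code in Python.

5. End notes.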

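1. Installing Keras with TensorFlow at the backend

Installing Keras in RStudio is straightforward. Follow the steps below and you will be ready to create your first neural network model in R.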

install.packages("devtools")
devtools::install_github("rstudio/keras")

The steps above install the keras library from the GitHub repository. Now it is time to load keras into R and install TensorFlow.

library(keras)

By default, RStudio loads the CPU version of TensorFlow. Use the following command to download the CPU version of TensorFlow.

install_tensorflow()

To install a version of TensorFlow with GPU support for a single-user/desktop system, use the following command.

install_tensorflow(gpu=TRUE)

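For multi-user installation, refer to the installation guide.

Now that keras and TensorFlow are installed in RStudio, let's get started and build our first neural network in R to tackle the MNIST dataset.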

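2. Different types of models that can be built in R using Keras

Below is the list of models that can be built in R using Keras.

1. Multi-layer perceptrons

2. Convolutional neural networks

3. Recurrent neural networks

4. Skip-Gram models

5. Using pre-trained models such as VGG16, ResNet, etc.

6. Fine-tuning pre-trained models.

Let's start by building a very simple MLP model with just a single hidden layer to try to classify handwritten digits.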
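To make "an MLP with a single hidden layer" concrete, here is a minimal sketch of its forward pass in plain Python. It is only an illustration under stated assumptions: the toy sizes (3 inputs, 2 hidden units, 2 classes) and all weight values are hypothetical, and the helper names are not part of Keras.

```python
import math

def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    return [sum(x * w for x, w in zip(inputs, row)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    """Rectified linear unit: negative values become zero."""
    return [max(0.0, v) for v in values]

def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def mlp_forward(inputs, hidden_w, hidden_b, out_w, out_b):
    """Input -> hidden dense layer with ReLU -> output dense layer with softmax."""
    hidden = relu(dense(inputs, hidden_w, hidden_b))
    return softmax(dense(hidden, out_w, out_b))

# Hypothetical toy weights: 3 inputs, 2 hidden units, 2 output classes
hidden_w = [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]]
hidden_b = [0.0, 0.1]
out_w = [[1.0, -1.0], [-1.0, 1.0]]
out_b = [0.0, 0.0]

probs = mlp_forward([1.0, 0.5, -1.0], hidden_w, hidden_b, out_w, out_b)
print(probs)
```

The MNIST model follows the same pattern, only with 784 inputs, 784 hidden units and 10 output classes, plus a dropout step during training.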

3. Classifying MNIST handwritten digits using an MLP in R

# Load the keras library
library(keras)

# Load the MNIST dataset bundled with keras
data <- dataset_mnist()

# Separate the train and test sets
train_x <- data$train$x
train_y <- data$train$y
test_x <- data$test$x
test_y <- data$test$y
rm(data)

# Flatten each 2D image into a 1D vector for the MLP and scale the pixels to [0, 1]
train_x <- array(train_x, dim = c(dim(train_x)[1], prod(dim(train_x)[-1]))) / 255
test_x <- array(test_x, dim = c(dim(test_x)[1], prod(dim(test_x)[-1]))) / 255

# Convert the target variable to one-hot encoded vectors using the keras built-in function
train_y <- to_categorical(train_y, 10)
test_y <- to_categorical(test_y, 10)

# Define a keras sequential model
model <- keras_model_sequential()

# 1 input layer [784 neurons], 1 hidden layer [784 neurons] with dropout rate 0.4,
# and 1 output layer [10 neurons], i.e. one per digit from 0 to 9
model %>%
  layer_dense(units = 784, input_shape = 784) %>%
  layer_dropout(rate = 0.4) %>%
  layer_activation(activation = "relu") %>%
  layer_dense(units = 10) %>%
  layer_activation(activation = "softmax")

# Compile the model with accuracy as the metric and adam as the optimiser
model %>% compile(
  loss = "categorical_crossentropy",
  optimizer = "adam",
  metrics = c("accuracy")
)

# Fit the model on the training dataset
model %>% fit(train_x, train_y, epochs = 100, batch_size = 128)

# Evaluate the model on the held-out test set used for validation
loss_and_metrics <- model %>% evaluate(test_x, test_y, batch_size = 128)
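The two preprocessing steps in the code above (flattening and scaling the images, then one-hot encoding the labels) can be sketched in plain Python as follows. This is an illustration only: the function names are hypothetical, and a tiny 2x3 grid stands in for a 28x28 MNIST digit.

```python
def flatten_and_scale(image, max_value=255):
    """Flatten a 2D grid of pixel intensities into one list scaled to [0, 1]."""
    return [pixel / max_value for row in image for pixel in row]

def one_hot(label, num_classes=10):
    """Return a list of zeros with a single 1 at position `label`."""
    if not 0 <= label < num_classes:
        raise ValueError("label out of range")
    return [1 if i == label else 0 for i in range(num_classes)]

# Hypothetical 2x3 "image" standing in for a 28x28 MNIST digit
tiny_image = [[0, 51, 255], [102, 0, 204]]

features = flatten_and_scale(tiny_image)
target = one_hot(3)

print(features)
print(target)
```

This sketch flattens row by row. For an MLP the pixel order inside the vector does not matter, as long as the training and test data are flattened the same way.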

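The code above reached a training accuracy of 99.14% and a validation accuracy of 96.89%. It ran on an i5 processor with a run time of 13.5 seconds, whereas on a TITANx GPU the validation accuracy was 98.44% with an average run time of 2 seconds.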
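To get a feel for the size of the model, the number of trainable parameters in the 784-784-10 network described above can be worked out by hand. The sketch below assumes standard dense layers with one bias per unit. Dropout and activation layers add no parameters. The helper name is hypothetical.

```python
def dense_params(n_inputs, n_units):
    """Trainable parameters of a dense layer: one weight per input per unit, plus one bias per unit."""
    return n_inputs * n_units + n_units

hidden_layer = dense_params(784, 784)   # 784 inputs feeding 784 hidden units
output_layer = dense_params(784, 10)    # 784 hidden units feeding 10 output units
total = hidden_layer + output_layer

print(hidden_layer, output_layer, total)
```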

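4. MLP using keras: R vs Python

For the sake of comparison, I also implemented the above MNIST problem in Python. I expected there to be no difference between keras in R and in Python, because keras in R creates a conda instance and runs keras inside it. You can try running the equivalent Python code below.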

# Import the required libraries for the MLP model
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout
import numpy as np
import pandas as pd

# Load the MNIST dataset from keras
from keras.datasets import mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Reshape x_train and x_test to match the MLP input dimensions and scale pixels to [0, 1]
x_train = np.reshape(x_train, (x_train.shape[0], -1)) / 255
x_test = np.reshape(x_test, (x_test.shape[0], -1)) / 255

# One-hot encode the target variables for train and test
y_train = np.array(pd.get_dummies(y_train))
y_test = np.array(pd.get_dummies(y_test))

# Define the model: 1 input layer [784 neurons], 1 hidden layer [784 neurons]
# with dropout rate 0.4, and 1 output layer [10 neurons]
model = Sequential()
model.add(Dense(784, input_dim=784, activation="relu"))
model.add(Dropout(rate=0.4))
model.add(Dense(10, activation="softmax"))

# Compile the model using the adam optimiser and accuracy as the metric
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])

# Fit the model and validate on the test set
model.fit(x_train, y_train, epochs=50, batch_size=128, validation_data=(x_test, y_test))

The above model achieved a validation accuracy of 98.42% on the same GPU. So our initial guess was correct.
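Both versions are trained with the categorical cross-entropy loss and compared on accuracy. As a rough sketch of what those two numbers measure, the plain Python below computes them for a handful of made-up predictions. The data, the function names and the clipping constant `eps` are assumptions for illustration, not values taken from the runs above.

```python
import math

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over samples of -sum(t * log(p)), for one-hot targets t and predicted probabilities p."""
    losses = []
    for target, probs in zip(y_true, y_pred):
        # Clip probabilities away from zero so log() stays finite (eps is an assumed constant)
        losses.append(-sum(t * math.log(max(p, eps)) for t, p in zip(target, probs)))
    return sum(losses) / len(losses)

def accuracy(y_true, y_pred):
    """Fraction of samples whose highest-probability class matches the one-hot label."""
    hits = 0
    for target, probs in zip(y_true, y_pred):
        if target.index(max(target)) == probs.index(max(probs)):
            hits += 1
    return hits / len(y_true)

# Hypothetical predictions for 4 samples over 3 classes
y_true = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
y_pred = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.4, 0.3], [0.2, 0.5, 0.3]]

loss = categorical_crossentropy(y_true, y_pred)
acc = accuracy(y_true, y_pred)
print(loss, acc)
```

In this toy data the third sample is misclassified, so three of the four predictions count as correct.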

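5. End notes

If this was your first deep learning model in R, I hope you enjoyed it. With very simple code, you can classify handwritten digits with 98% accuracy. That should be motivation enough to get you started with deep learning.

If you already use the keras deep learning library in Python, you will find the syntax and structure of the keras library in R very similar to Python's. In fact, the keras package in R creates a conda environment and installs everything required to run keras in that environment. But what excites me more is seeing data scientists now build real-life deep learning models in R. As the saying goes, the competition should never stop. I would also like to hear your thoughts on this new development. You can share them in the comments below.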

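This article was recommended by @愛可可-愛生活 of BUPT (Beijing University of Posts and Telecommunications) and translated by the Alibaba Cloud Yunqi Community.

Original title: "Getting started with Deep Learning using Keras and TensorFlow in R"; author: NSS

Translator: 袁虎; reviewer: 阿福

This is an abridged translation. For more detail, see the original article.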

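Also attached:

Author: 雲大學小編
Link: R語言相關圖書？ (Books on the R language?) - Zhihu
Source: Zhihu
Copyright belongs to the author. For commercial reproduction, contact the author for permission. For non-commercial reproduction, credit the source.

Alibaba Cloud University, together with 尚學堂 (Shangxuetang), has launched an introductory and hands-on R course. Click to start learning: 大數據之R語言速成與實戰 (Big Data: R Crash Course and Practice) - Alibaba Cloud University

The course outline is as follows: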

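What R is, R's strengths, resources
Installing R, getting help, workspace management
Using R packages, reusing results, handling large datasets
R dataset concepts, vectors, matrices and arrays
R data frames, factors and lists
Common R commands
R lists in detail
Importing data sources into R
User-defined functions in R
Accessing MySQL databases from R
R's integrated development environment (IDE): RStudio
Plotting in R: graphical parameters, symbols, lines and colors
R graphics: text properties, dimensions, titles and custom axes
R graphics: minor tick marks, reference lines, legends and text annotations
Combining graphs in R and fine control of figure layout
Basic data management in R: creating variables, recoding and renaming variables
Basic data management in R: handling missing values, using date values, data type conversion
Basic data management in R: merging datasets, extracting subsets, random sampling functions
Advanced data management in R: mathematical, statistical and probability functions
Advanced data management in R: character functions, applying functions to matrices and data frames
Advanced data management in R: repetition and looping, conditional execution, transposing
Basic graphs in R: bar plots (stacked, grouped, mean), fine-tuning bar plots
Basic graphs in R: pie charts
Basic graphs in R: histograms
Basic graphs in R: kernel density plots
Basic graphs in R: box plots
R case study, predicting algae counts: problem description and objectives, dataset format
R case study, predicting algae counts: data preprocessing
R case study, predicting algae counts: obtaining prediction models
R case study, predicting algae counts: model simplification and tuning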