Leaf Classification with Transfer Learning

Since R2017a, MATLAB's Neural Network Toolbox has shipped pretrained models for several object classification networks, which can be installed through the Add-On Explorer. MathWorks provides pretrained weights for AlexNet, GoogLeNet, VGG19, VGG16, ResNet50, and others; here I use the VGG19 model.

The data comes from Leafsnap: An Electronic Field Guide. The Leafsnap paper was published at ECCV 2012; at the time, the authors segmented each leaf and then recognized the species from its shape.

As a quick sanity check, I ran the stock example code on a random image from the dataset.

net = vgg19();
I = imread('13291664370044.jpg');
sz = net.Layers(1).InputSize;
I = imresize(I, [sz(1), sz(2)]);
label = classify(net, I);
figure;
imshow(I)
text(10, 20, char(label), 'Color', 'white')

The image was classified as a vase. Since the ImageNet classes the network was trained on contain nothing like this, the misclassification is expected. Besides, leaf identification is really a fine-grained recognition task, and models such as RA-CNN are better suited to it.

The VGG19 architecture is shown below. The original VGG paper actually tested configurations from 11 to 19 layers, with and without LRN and 1x1 convolutions; VGG19 is configuration E in the paper. For transfer learning, we mainly work on the last three layers.
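
To see exactly which layers we are about to replace, you can index the layer array directly; a quick check (assuming the VGG19 support package is installed):

net = vgg19;
% The last three layers: the 1000-way FC layer, the softmax, and the
% ImageNet classification output - exactly the ones we will swap out
net.Layers(end-2:end)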

Extracting the Leafsnap archive yields three items.

The images we need live under dataset/images/field: 184 species, each with anywhere from a dozen or so to over a hundred images. Image sizes vary, but the leaf is always centered in the frame. The whole set contains roughly 7,700 images. That is not a lot of data, so I will try fine-tuning a few layers, and also attaching an SVM to the activations of the last few layers.
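
A quick way to sanity-check the class distribution is countEachLabel; a minimal sketch (the dataset/images/field path assumes the archive was extracted into the working directory):

images = imageDatastore('dataset/images/field', ...
    'IncludeSubfolders', true, 'LabelSource', 'foldernames');
% One row per species: label name and image count
tbl = countEachLabel(images);
disp(tbl);
fprintf('%d species, %d images in total\n', height(tbl), numel(images.Files));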

There is a figure in CS231n that tells you how to plan a transfer-learning strategy, based on how much data you have and how similar it is to the original task. Of course, you can also just try a few options yourself.

47x1 Layer array with layers:

     1   'input'     Image Input             224x224x3 images with 'zerocenter' normalization
     2   'conv1_1'   Convolution             64 3x3x3 convolutions with stride [1 1] and padding [1 1 1 1]
     3   'relu1_1'   ReLU                    ReLU
     4   'conv1_2'   Convolution             64 3x3x64 convolutions with stride [1 1] and padding [1 1 1 1]
     5   'relu1_2'   ReLU                    ReLU
     6   'pool1'     Max Pooling             2x2 max pooling with stride [2 2] and padding [0 0 0 0]
     7   'conv2_1'   Convolution             128 3x3x64 convolutions with stride [1 1] and padding [1 1 1 1]
     8   'relu2_1'   ReLU                    ReLU
     9   'conv2_2'   Convolution             128 3x3x128 convolutions with stride [1 1] and padding [1 1 1 1]
    10   'relu2_2'   ReLU                    ReLU
    11   'pool2'     Max Pooling             2x2 max pooling with stride [2 2] and padding [0 0 0 0]
    12   'conv3_1'   Convolution             256 3x3x128 convolutions with stride [1 1] and padding [1 1 1 1]
    13   'relu3_1'   ReLU                    ReLU
    14   'conv3_2'   Convolution             256 3x3x256 convolutions with stride [1 1] and padding [1 1 1 1]
    15   'relu3_2'   ReLU                    ReLU
    16   'conv3_3'   Convolution             256 3x3x256 convolutions with stride [1 1] and padding [1 1 1 1]
    17   'relu3_3'   ReLU                    ReLU
    18   'conv3_4'   Convolution             256 3x3x256 convolutions with stride [1 1] and padding [1 1 1 1]
    19   'relu3_4'   ReLU                    ReLU
    20   'pool3'     Max Pooling             2x2 max pooling with stride [2 2] and padding [0 0 0 0]
    21   'conv4_1'   Convolution             512 3x3x256 convolutions with stride [1 1] and padding [1 1 1 1]
    22   'relu4_1'   ReLU                    ReLU
    23   'conv4_2'   Convolution             512 3x3x512 convolutions with stride [1 1] and padding [1 1 1 1]
    24   'relu4_2'   ReLU                    ReLU
    25   'conv4_3'   Convolution             512 3x3x512 convolutions with stride [1 1] and padding [1 1 1 1]
    26   'relu4_3'   ReLU                    ReLU
    27   'conv4_4'   Convolution             512 3x3x512 convolutions with stride [1 1] and padding [1 1 1 1]
    28   'relu4_4'   ReLU                    ReLU
    29   'pool4'     Max Pooling             2x2 max pooling with stride [2 2] and padding [0 0 0 0]
    30   'conv5_1'   Convolution             512 3x3x512 convolutions with stride [1 1] and padding [1 1 1 1]
    31   'relu5_1'   ReLU                    ReLU
    32   'conv5_2'   Convolution             512 3x3x512 convolutions with stride [1 1] and padding [1 1 1 1]
    33   'relu5_2'   ReLU                    ReLU
    34   'conv5_3'   Convolution             512 3x3x512 convolutions with stride [1 1] and padding [1 1 1 1]
    35   'relu5_3'   ReLU                    ReLU
    36   'conv5_4'   Convolution             512 3x3x512 convolutions with stride [1 1] and padding [1 1 1 1]
    37   'relu5_4'   ReLU                    ReLU
    38   'pool5'     Max Pooling             2x2 max pooling with stride [2 2] and padding [0 0 0 0]
    39   'fc6'       Fully Connected         4096 fully connected layer
    40   'relu6'     ReLU                    ReLU
    41   'drop6'     Dropout                 50% dropout
    42   'fc7'       Fully Connected         4096 fully connected layer
    43   'relu7'     ReLU                    ReLU
    44   'drop7'     Dropout                 50% dropout
    45   'fc8'       Fully Connected         1000 fully connected layer
    46   'prob'      Softmax                 softmax
    47   'output'    Classification Output   crossentropyex with 'tench', 'goldfish', and 998 other classes

Above is VGG19 as MATLAB sees it, 47 layers in total. Let's first try replacing the last one or two fully connected layers.

% MATLAB R2017b
clear;
net = vgg19;
% Load the data; resize each image on read
images = imageDatastore('dataset/images/field', ...
    'IncludeSubfolders', true, 'LabelSource', 'foldernames');
inputSize = net.Layers(1).InputSize(1:2);
images.ReadFcn = @(loc)imresize(imread(loc), inputSize);
% Split into training and validation sets
[trainingImages, validationImages] = splitEachLabel(images, 0.7, 'randomized');
% Image and class counts
numTrainImages = numel(trainingImages.Labels);
numValidationImages = numel(validationImages.Labels);
numClasses = numel(categories(trainingImages.Labels));
% Transfer all layers except the last three
layersTransfer = net.Layers(1:end-3);
% The new network only replaces the final FC layer and retrains it
layers = [
    layersTransfer
    fullyConnectedLayer(numClasses, 'WeightLearnRateFactor', 20, 'BiasLearnRateFactor', 20)
    softmaxLayer
    classificationLayer];
% Train the network
miniBatchSize = 32;
numIterationsPerEpoch = floor(numel(trainingImages.Labels)/miniBatchSize);
options = trainingOptions('sgdm', ...
    'MiniBatchSize', miniBatchSize, ...
    'MaxEpochs', 50, ...
    'InitialLearnRate', 1e-4, ...
    'Verbose', false, ...
    'Plots', 'training-progress', ...
    'ValidationData', validationImages, ...
    'ValidationFrequency', numIterationsPerEpoch);
netTransfer = trainNetwork(trainingImages, layers, options);

This is the most basic usage; the only preprocessing is resizing images to 224x224. I trained this network on my own machine (an i5-6600K with a GTX 1070; 8 GB of VRAM can't handle anything too heavy), so the batch size is set to 32. With better hardware you can use a larger one.
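
Once training finishes, you can also compute the validation accuracy explicitly (the training-progress plot tracks it as well); a short check using the variables from the script above:

% Classify the validation set with the fine-tuned network
predictedLabels = classify(netTransfer, validationImages);
% Fraction of correctly predicted labels
accuracy = mean(predictedLabels == validationImages.Labels)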

After a dozen or so epochs, the accuracy stabilizes around 90%, which is quite respectable. Of course, we can push things further: reset the last two FC layers and train the whole network. I have to say that while MATLAB is inefficient at this kind of thing, it is very convenient to use.

% MATLAB R2017b recommended
clear;
net = vgg19;
% Load the data; resize each image on read
images = imageDatastore('dataset/images/field', ...
    'IncludeSubfolders', true, 'LabelSource', 'foldernames');
inputSize = net.Layers(1).InputSize(1:2);
images.ReadFcn = @(loc)imresize(imread(loc), inputSize);
% Split into training and validation sets
[trainingImages, validationImages] = splitEachLabel(images, 0.7, 'randomized');
% Image and class counts
numTrainImages = numel(trainingImages.Labels);
numValidationImages = numel(validationImages.Labels);
numClasses = numel(categories(trainingImages.Labels));
% Transfer all layers except the last six
layersTransfer = net.Layers(1:end-6);
% The new network replaces the last two FC layers and retrains them
layers = [
    layersTransfer
    fullyConnectedLayer(4096, 'WeightLearnRateFactor', 20, 'BiasLearnRateFactor', 20)
    reluLayer
    dropoutLayer(0.5)
    fullyConnectedLayer(numClasses, 'WeightLearnRateFactor', 20, 'BiasLearnRateFactor', 20)
    softmaxLayer
    classificationLayer];
% Train the network
miniBatchSize = 32;
numIterationsPerEpoch = floor(numel(trainingImages.Labels)/miniBatchSize);
options = trainingOptions('sgdm', ...
    'MiniBatchSize', miniBatchSize, ...
    'MaxEpochs', 25, ...
    'InitialLearnRate', 1e-4, ...
    'Verbose', false, ...
    'Plots', 'training-progress', ...
    'ValidationData', validationImages, ...
    'ValidationFrequency', numIterationsPerEpoch);
netTransfer = trainNetwork(trainingImages, layers, options);

The overall structure is unchanged; the last two FC layers are simply reset and retrained.

With this many parameters, training is slow, so I won't run it to completion; feel free to train it yourself, but learning on the order of 17M parameters really takes time, and I'm not sure whether the result would overfit. The variant below additionally shrinks the first new FC layer to 2000 units (and shuffles the datastores) to cut the parameter count:

% MATLAB R2017b
clear;
net = vgg19;
% Load the data; resize each image on read
images = imageDatastore('dataset/images/field', ...
    'IncludeSubfolders', true, 'LabelSource', 'foldernames');
inputSize = net.Layers(1).InputSize(1:2);
images.ReadFcn = @(loc)imresize(imread(loc), inputSize);
% Split into training and validation sets, then shuffle
[trainingImages, validationImages] = splitEachLabel(images, 0.7, 'randomized');
trainingImages = shuffle(trainingImages);
validationImages = shuffle(validationImages);
% Image and class counts
numTrainImages = numel(trainingImages.Labels);
numValidationImages = numel(validationImages.Labels);
numClasses = numel(categories(trainingImages.Labels));
% Transfer all layers except the last six
layersTransfer = net.Layers(1:end-6);
% The new network replaces the last two FC layers and retrains them
% (first new FC layer reduced to 2000 units)
layers = [
    layersTransfer
    fullyConnectedLayer(2000)
    reluLayer
    dropoutLayer(0.5)
    fullyConnectedLayer(numClasses, 'WeightLearnRateFactor', 20, 'BiasLearnRateFactor', 20)
    softmaxLayer
    classificationLayer];
% Train the network
miniBatchSize = 32;
numIterationsPerEpoch = floor(numel(trainingImages.Labels)/miniBatchSize);
options = trainingOptions('sgdm', ...
    'MiniBatchSize', miniBatchSize, ...
    'MaxEpochs', 30, ...
    'InitialLearnRate', 1e-4, ...
    'Verbose', false, ...
    'Plots', 'training-progress', ...
    'ValidationData', validationImages, ...
    'ValidationFrequency', numIterationsPerEpoch);
netTransfer = trainNetwork(trainingImages, layers, options);
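
Since full fine-tuning is this slow, another option is to freeze the transferred layers so that only the new FC layers receive gradient updates. A minimal sketch (WeightLearnRateFactor and BiasLearnRateFactor are standard layer properties; I haven't benchmarked this variant on this dataset):

% Freeze all transferred layers by zeroing their learn-rate factors
layersTransfer = net.Layers(1:end-6);
for i = 1:numel(layersTransfer)
    if isprop(layersTransfer(i), 'WeightLearnRateFactor')
        layersTransfer(i).WeightLearnRateFactor = 0;
        layersTransfer(i).BiasLearnRateFactor = 0;
    end
end
% Build the layers array with the new FC layers as before;
% only those will now learn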

Next, I'll try another way of using the pretrained weights: take the fc7 and fc8 activations, i.e. the outputs of the last two fully connected layers, as features and train an SVM classifier on them.

% MATLAB R2017b
clear;
net = vgg19;
% Load the data; resize each image on read
images = imageDatastore('dataset/images/field', ...
    'IncludeSubfolders', true, 'LabelSource', 'foldernames');
inputSize = net.Layers(1).InputSize(1:2);
images.ReadFcn = @(loc)imresize(imread(loc), inputSize);
% Split into training and validation sets
[trainingImages, validationImages] = splitEachLabel(images, 0.7, 'randomized');
% Image and class counts
numTrainImages = numel(trainingImages.Labels);
numValidationImages = numel(validationImages.Labels);
numClasses = numel(categories(trainingImages.Labels));
% Extract features from fc8 and fc7
% ('OutputAs','rows' returns an N-by-C matrix, which is what fitcecoc expects)
layer1 = 'fc8';
layer2 = 'fc7';
trainingFeatures1 = activations(net, trainingImages, layer1, 'OutputAs', 'rows');
validationFeatures1 = activations(net, validationImages, layer1, 'OutputAs', 'rows');
trainingFeatures2 = activations(net, trainingImages, layer2, 'OutputAs', 'rows');
validationFeatures2 = activations(net, validationImages, layer2, 'OutputAs', 'rows');
trainingFeatures = [trainingFeatures1 trainingFeatures2];
validationFeatures = [validationFeatures1 validationFeatures2];
trainingLabels = trainingImages.Labels;
validationLabels = validationImages.Labels;
% Train a multiclass SVM (ECOC with linear binary learners by default)
classifier = fitcecoc(trainingFeatures, trainingLabels);
% Evaluate
predictedLabels = predict(classifier, validationFeatures);
accuracy = mean(predictedLabels == validationLabels)

Since SVM training runs on the CPU, getting the result takes a while. The final accuracy is 78.3%. We could also use the ReLU or softmax outputs (relu7, prob) as features, or use a kernel function, which might work just as well (possibly better?).
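
If you want to try the kernel idea, fitcecoc accepts an SVM template, so switching the binary learners from the default linear SVM to a Gaussian kernel is a small change. A sketch (kernel SVMs on these 5096-dimensional features will be noticeably slower, and I haven't verified the gain):

% Gaussian (RBF) kernel binary learners instead of linear SVMs
t = templateSVM('KernelFunction', 'gaussian', 'Standardize', true);
classifier = fitcecoc(trainingFeatures, trainingLabels, 'Learners', t);
predictedLabels = predict(classifier, validationFeatures);
accuracy = mean(predictedLabels == validationLabels)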

I haven't done any real tuning in these experiments. If you want better results, consider tuning the hyperparameters or augmenting the training set. You can also repeat all of the above with GoogLeNet, ResNet50, and so on.
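
For the augmentation idea specifically, here is a hedged sketch using imageDataAugmenter and augmentedImageDatastore (these require R2018a or newer, and the ranges are arbitrary, untuned choices):

% Random horizontal flips plus small rotations and shifts
augmenter = imageDataAugmenter( ...
    'RandXReflection', true, ...
    'RandRotation', [-15 15], ...
    'RandXTranslation', [-10 10], ...
    'RandYTranslation', [-10 10]);
% Resizes to 224x224 on the fly, so no custom ReadFcn is needed either
augTrain = augmentedImageDatastore([224 224], trainingImages, ...
    'DataAugmentation', augmenter);
% Train on augTrain instead of trainingImages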

