Machine Learning Advanced Notes, Part 4 | A Deep Dive into GoogLeNet

Introduction

TensorFlow is Google's second-generation machine intelligence system, developed on the basis of DistBelief, and it is widely used in deep learning applications such as speech recognition and image recognition. Its name comes from how it works: a Tensor is an N-dimensional array, Flow refers to computation on a dataflow graph, and TensorFlow describes tensors flowing from one end of the graph to the other, that is, complex data structures being passed through an artificial neural network for analysis and processing.

TensorFlow is fully open source and anyone can use it. It runs on devices ranging from a single smartphone up to thousands of servers in a data center.

The 'Machine Learning Advanced Notes' series takes a close look at technical practice with the TensorFlow system, starting from scratch and moving step by step along the road to advanced machine learning.

GoogLeNet won ILSVRC 2014. It follows in the line of the classic LeNet-5 and was built mainly by a team at Google; see the paper Going Deeper with Convolutions. Related work includes LeNet-5, Gabor filters, and Network-in-Network. Network-in-Network improved on the traditional CNN, comfortably beating AlexNet with far fewer parameters; a model built with it ends up at only about 29 MB (see the Network-in-Network caffe model). GoogLeNet borrows Network-in-Network's ideas, which are described in detail below.

Network-in-Network

On the left of the figure in the paper is a CNN's ordinary linear convolution layer. A linear convolution layer is generally fine for extracting linearly separable features, but when the features to be extracted are highly nonlinear, we need many more filters to capture all the latent variations. That creates a problem: with too many filters the network has too many parameters, becomes overly complex, and puts heavy pressure on computation.

The paper makes improvements in two directions:

1. An improved convolution layer, MLPconv: at each local patch it performs a richer computation than a traditional convolution (the right side of the figure), raising each layer's ability to recognize complex features. A rough analogy: in a traditional CNN, each convolution filter can only do one narrow task, so you must pile on a huge number of filters to cover all the feature types you need; each MLPconv layer is far more capable, can handle several kinds of task at once, and therefore needs only a small number of filters (a minimal sketch of the idea follows below).
2. Global average pooling replaces the traditional CNN's final fully connected layers, whose parameters dominate the model and which also hurt generalization; AlexNet had to resort to dropout to improve the generalization of its fully connected layers.
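To make the MLPconv idea concrete, here is a minimal NumPy sketch (my own illustration, not code from the paper) showing why a small cross-channel MLP applied at every spatial position is the same computation as a 1*1 convolution; this is also why NIN's cccp layers below are implemented as 1*1 convolutions:

import numpy as np

# Toy feature map: H x W spatial positions, C_in channels.
H, W, C_in, C_out = 8, 8, 16, 32
x = np.random.randn(H, W, C_in)
w = np.random.randn(C_in, C_out)  # one fully connected layer across channels
b = np.zeros(C_out)

# An "MLP" layer applied independently at every spatial position...
mlp = np.maximum(x.reshape(-1, C_in).dot(w) + b, 0).reshape(H, W, C_out)

# ...is exactly a 1x1 convolution (plus ReLU) over the same feature map.
conv_1x1 = np.maximum(np.einsum('hwc,co->hwo', x, w) + b, 0)

assert np.allclose(mlp, conv_1x1)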

Finally, the authors design a four-layer Network-in-Network plus a global average pooling layer for the ImageNet classification task.

class NiN(Network):
    def setup(self):
        (self.feed('data')
             .conv(11, 11, 96, 4, 4, padding='VALID', name='conv1')
             .conv(1, 1, 96, 1, 1, name='cccp1')
             .conv(1, 1, 96, 1, 1, name='cccp2')
             .max_pool(3, 3, 2, 2, name='pool1')
             .conv(5, 5, 256, 1, 1, name='conv2')
             .conv(1, 1, 256, 1, 1, name='cccp3')
             .conv(1, 1, 256, 1, 1, name='cccp4')
             .max_pool(3, 3, 2, 2, padding='VALID', name='pool2')
             .conv(3, 3, 384, 1, 1, name='conv3')
             .conv(1, 1, 384, 1, 1, name='cccp5')
             .conv(1, 1, 384, 1, 1, name='cccp6')
             .max_pool(3, 3, 2, 2, padding='VALID', name='pool3')
             .conv(3, 3, 1024, 1, 1, name='conv4-1024')
             .conv(1, 1, 1024, 1, 1, name='cccp7-1024')
             .conv(1, 1, 1000, 1, 1, name='cccp8-1024')
             .avg_pool(6, 6, 1, 1, padding='VALID', name='pool4')
             .softmax(name='prob'))

The basic network structure is as above; the code is at GitHub - ethereon/caffe-tensorflow: Caffe models in TensorFlow.

Because of a recent job change I don't have a machine to run this at the moment, so I can't draw the basic network diagram either; I will add both later. One thing worth pointing out: the cccp1 and cccp2 (cross-channel parametric pooling) layers in the middle are equivalent to convolution layers with 1*1 kernels, as sketched earlier. The Caffe implementation of NIN is as follows:

name: "nin_imagenet"n layers {n top: "data"n top: "label"n name: "data"n type: DATAn data_param {n source: "/home/linmin/IMAGENET-LMDB/imagenet-train-lmdb"n backend: LMDBn batch_size: 64n }n transform_param {n crop_size: 224n mirror: truen mean_file: "/home/linmin/IMAGENET-LMDB/imagenet-train-mean"n }n include: { phase: TRAIN }n }n layers {n top: "data"n top: "label"n name: "data"n type: DATAn data_param {n source: "/home/linmin/IMAGENET-LMDB/imagenet-val-lmdb"n backend: LMDBn batch_size: 89n }n transform_param {n crop_size: 224n mirror: falsen mean_file: "/home/linmin/IMAGENET-LMDB/imagenet-train-mean"n }n include: { phase: TEST }n }n layers {n bottom: "data"n top: "conv1"n name: "conv1"n type: CONVOLUTIONn blobs_lr: 1n blobs_lr: 2n weight_decay: 1n weight_decay: 0n convolution_param {n num_output: 96n kernel_size: 11n stride: 4n weight_filler {n type: "gaussian"n mean: 0n std: 0.01n }n bias_filler {n type: "constant"n value: 0n }n }n }n layers {n bottom: "conv1"n top: "conv1"n name: "relu0"n type: RELUn }n layers {n bottom: "conv1"n top: "cccp1"n name: "cccp1"n type: CONVOLUTIONn blobs_lr: 1n blobs_lr: 2n weight_decay: 1n weight_decay: 0n convolution_param {n num_output: 96n kernel_size: 1n stride: 1n weight_filler {n type: "gaussian"n mean: 0n std: 0.05n }n bias_filler {n type: "constant"n value: 0n }n }n }n layers {n bottom: "cccp1"n top: "cccp1"n name: "relu1"n type: RELUn }n layers {n bottom: "cccp1"n top: "cccp2"n name: "cccp2"n type: CONVOLUTIONn blobs_lr: 1n blobs_lr: 2n weight_decay: 1n weight_decay: 0n convolution_param {n num_output: 96n kernel_size: 1n stride: 1n weight_filler {n type: "gaussian"n mean: 0n std: 0.05n }n bias_filler {n type: "constant"n value: 0n }n }n }n layers {n bottom: "cccp2"n top: "cccp2"n name: "relu2"n type: RELUn }n layers {n bottom: "cccp2"n top: "pool0"n name: "pool0"n type: POOLINGn pooling_param {n pool: MAXn kernel_size: 3n stride: 2n }n }n layers {n bottom: "pool0"n top: "conv2"n name: "conv2"n type: CONVOLUTIONn blobs_lr: 1n blobs_lr: 2n weight_decay: 1n weight_decay: 0n convolution_param {n num_output: 256n pad: 2n kernel_size: 5n stride: 1n weight_filler {n type: "gaussian"n mean: 0n std: 0.05n }n bias_filler {n type: "constant"n value: 0n }n }n }n layers {n bottom: "conv2"n top: "conv2"n name: "relu3"n type: RELUn }n layers {n bottom: "conv2"n top: "cccp3"n name: "cccp3"n type: CONVOLUTIONn blobs_lr: 1n blobs_lr: 2n weight_decay: 1n weight_decay: 0n convolution_param {n num_output: 256n kernel_size: 1n stride: 1n weight_filler {n type: "gaussian"n mean: 0n std: 0.05n }n bias_filler {n type: "constant"n value: 0n }n }n }n layers {n bottom: "cccp3"n top: "cccp3"n name: "relu5"n type: RELUn }n layers {n bottom: "cccp3"n top: "cccp4"n name: "cccp4"n type: CONVOLUTIONn blobs_lr: 1n blobs_lr: 2n weight_decay: 1n weight_decay: 0n convolution_param {n num_output: 256n kernel_size: 1n stride: 1n weight_filler {n type: "gaussian"n mean: 0n std: 0.05n }n bias_filler {n type: "constant"n value: 0n }n }n }n layers {n bottom: "cccp4"n top: "cccp4"n name: "relu6"n type: RELUn }n layers {n bottom: "cccp4"n top: "pool2"n name: "pool2"n type: POOLINGn pooling_param {n pool: MAXn kernel_size: 3n stride: 2n }n }n layers {n bottom: "pool2"n top: "conv3"n name: "conv3"n type: CONVOLUTIONn blobs_lr: 1n blobs_lr: 2n weight_decay: 1n weight_decay: 0n convolution_param {n num_output: 384n pad: 1n kernel_size: 3n stride: 1n weight_filler {n type: "gaussian"n mean: 0n std: 0.01n }n bias_filler {n type: "constant"n value: 0n }n }n }n layers {n bottom: "conv3"n top: 
"conv3"n name: "relu7"n type: RELUn }n layers {n bottom: "conv3"n top: "cccp5"n name: "cccp5"n type: CONVOLUTIONn blobs_lr: 1n blobs_lr: 2n weight_decay: 1n weight_decay: 0n convolution_param {n num_output: 384n kernel_size: 1n stride: 1n weight_filler {n type: "gaussian"n mean: 0n std: 0.05n }n bias_filler {n type: "constant"n value: 0n }n }n }n layers {n bottom: "cccp5"n top: "cccp5"n name: "relu8"n type: RELUn }n layers {n bottom: "cccp5"n top: "cccp6"n name: "cccp6"n type: CONVOLUTIONn blobs_lr: 1n blobs_lr: 2n weight_decay: 1n weight_decay: 0n convolution_param {n num_output: 384n kernel_size: 1n stride: 1n weight_filler {n type: "gaussian"n mean: 0n std: 0.05n }n bias_filler {n type: "constant"n value: 0n }n }n }n layers {n bottom: "cccp6"n top: "cccp6"n name: "relu9"n type: RELUn }n layers {n bottom: "cccp6"n top: "pool3"n name: "pool3"n type: POOLINGn pooling_param {n pool: MAXn kernel_size: 3n stride: 2n }n }n layers {n bottom: "pool3"n top: "pool3"n name: "drop"n type: DROPOUTn dropout_param {n dropout_ratio: 0.5n }n }n layers {n bottom: "pool3"n top: "conv4"n name: "conv4-1024"n type: CONVOLUTIONn blobs_lr: 1n blobs_lr: 2n weight_decay: 1n weight_decay: 0n convolution_param {n num_output: 1024n pad: 1n kernel_size: 3n stride: 1n weight_filler {n type: "gaussian"n mean: 0n std: 0.05n }n bias_filler {n type: "constant"n value: 0n }n }n }n layers {n bottom: "conv4"n top: "conv4"n name: "relu10"n type: RELUn }n layers {n bottom: "conv4"n top: "cccp7"n name: "cccp7-1024"n type: CONVOLUTIONn blobs_lr: 1n blobs_lr: 2n weight_decay: 1n weight_decay: 0n convolution_param {n num_output: 1024n kernel_size: 1n stride: 1n weight_filler {n type: "gaussian"n mean: 0n std: 0.05n }n bias_filler {n type: "constant"n value: 0n }n }n }n layers {n bottom: "cccp7"n top: "cccp7"n name: "relu11"n type: RELUn }n layers {n bottom: "cccp7"n top: "cccp8"n name: "cccp8-1024"n type: CONVOLUTIONn blobs_lr: 1n blobs_lr: 2n weight_decay: 1n weight_decay: 0n convolution_param {n num_output: 1000n kernel_size: 1n stride: 1n weight_filler {n type: "gaussian"n mean: 0n std: 0.01n }n bias_filler {n type: "constant"n value: 0n }n }n }n layers {n bottom: "cccp8"n top: "cccp8"n name: "relu12"n type: RELUn }n layers {n bottom: "cccp8"n top: "pool4"n name: "pool4"n type: POOLINGn pooling_param {n pool: AVEn kernel_size: 6n stride: 1n }n }n layers {n name: "accuracy"n type: ACCURACYn bottom: "pool4"n bottom: "label"n top: "accuracy"n include: { phase: TEST }n }n layers {n bottom: "pool4"n bottom: "label"n name: "loss"n type: SOFTMAX_LOSSn include: { phase: TRAIN }n }n

NIN can also be read as deepening the network: by adding depth (increasing the representational power of each NIN block) and replacing the original fully connected layers with an average pooling layer, it drastically cuts the number of filters required and hence the model's parameter count. The paper's experiments show it matches AlexNet's performance, with a final model of only about 29 MB.
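To get a feel for where the savings come from, here is a back-of-the-envelope comparison (my own numbers, based on AlexNet's published layer sizes rather than anything in the NIN paper). AlexNet's classifier head is three fully connected layers on top of a 6x6x256 pool5 output, while NIN's head is a parameter-free average pool over the 1000 cccp8 feature maps:

# AlexNet head: pool5 (6x6x256) -> fc6 (4096) -> fc7 (4096) -> fc8 (1000)
alexnet_fc_weights = 6 * 6 * 256 * 4096 + 4096 * 4096 + 4096 * 1000
print(alexnet_fc_weights)  # 58,621,952 weights in the fully connected layers alone

# NIN head: cccp8 outputs 1000 feature maps, and a 6x6 global average pool
# collapses them straight into 1000 class scores: zero weights.
nin_head_weights = 0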

Once NIN is understood, GoogLeNet no longer feels mysterious.

GoogLeNet

Pain points

  • Larger CNNs mean more model parameters and demand more compute, and an overly complex model will overfit;
  • in a CNN, every increase in the number of layers brings an increase in the compute required;
  • a sparse network is acceptable in principle, but sparse data structures are usually very inefficient to compute with.

Inception module

The Inception module starts from the observation that convolution kernels of several different sizes can hold the information of differently sized clusters in the image. For convenience, the paper uses 1*1, 3*3 and 5*5 kernels in parallel, together with a 3*3 max pooling branch (a minimal sketch follows below).
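Here is a minimal tflearn sketch of the naive module (my own illustration; the branch channel counts are borrowed from inception_3a in the code further down, and the 28x28x192 input shape is an assumption made for the example):

import tflearn
from tflearn.layers.core import input_data
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.merge_ops import merge

# Input feature map, e.g. what inception_3a sees after pool2: 28x28x192.
net = input_data(shape=[None, 28, 28, 192])

# Four parallel branches over the same input ('same' padding keeps 28x28).
branch_1x1 = conv_2d(net, 64, 1, activation='relu')   # 64 maps
branch_3x3 = conv_2d(net, 128, 3, activation='relu')  # 128 maps
branch_5x5 = conv_2d(net, 32, 5, activation='relu')   # 32 maps
branch_pool = max_pool_2d(net, 3, strides=1)          # passes all 192 maps through

# Channel counts add up on concat: 64 + 128 + 32 + 192 = 416 output maps,
# already more than the 192 we started with. This is the growth problem.
naive_inception = merge([branch_1x1, branch_3x3, branch_5x5, branch_pool],
                        mode='concat', axis=3)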

There is a serious computational catch here, however: each Inception module's output filter count is the sum of the filters across all of its branches, as the sketch above shows, so after several layers the channel count becomes enormous, and the naive Inception leans ever harder on compute resources.

As we saw with Network-in-Network, 1*1 convolutions perform effective dimensionality reduction (expressing as much information as possible with fewer channels), so the paper proposes the "Inception module with dimension reduction": cut the number of filters as far as possible without hurting the model's representational power, thereby reducing model complexity. A worked parameter count is sketched below.
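Using the inception_3a numbers from the code below as a worked example (192 input channels, a 5*5 branch producing 32 maps, and a 1*1 reduction down to 16 channels), the savings are easy to count:

# 5x5 branch of inception_3a: 192 channels in -> 32 maps out.
naive = 5 * 5 * 192 * 32                       # direct 5x5 conv: 153,600 weights
reduced = 1 * 1 * 192 * 16 + 5 * 5 * 16 * 32   # 1x1 reduce to 16, then 5x5: 15,872
print(naive / reduced)                         # roughly 10x fewer weights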

Overall Structure of GoogLeNet

The basic code to construct GoogLeNet in TensorFlow:

from kaffe.tensorflow import Network

class GoogleNet(Network):

    def setup(self):
        (self.feed('data')
             .conv(7, 7, 64, 2, 2, name='conv1_7x7_s2')
             .max_pool(3, 3, 2, 2, name='pool1_3x3_s2')
             .lrn(2, 2e-05, 0.75, name='pool1_norm1')
             .conv(1, 1, 64, 1, 1, name='conv2_3x3_reduce')
             .conv(3, 3, 192, 1, 1, name='conv2_3x3')
             .lrn(2, 2e-05, 0.75, name='conv2_norm2')
             .max_pool(3, 3, 2, 2, name='pool2_3x3_s2')
             .conv(1, 1, 64, 1, 1, name='inception_3a_1x1'))

        (self.feed('pool2_3x3_s2')
             .conv(1, 1, 96, 1, 1, name='inception_3a_3x3_reduce')
             .conv(3, 3, 128, 1, 1, name='inception_3a_3x3'))

        (self.feed('pool2_3x3_s2')
             .conv(1, 1, 16, 1, 1, name='inception_3a_5x5_reduce')
             .conv(5, 5, 32, 1, 1, name='inception_3a_5x5'))

        (self.feed('pool2_3x3_s2')
             .max_pool(3, 3, 1, 1, name='inception_3a_pool')
             .conv(1, 1, 32, 1, 1, name='inception_3a_pool_proj'))

        (self.feed('inception_3a_1x1',
                   'inception_3a_3x3',
                   'inception_3a_5x5',
                   'inception_3a_pool_proj')
             .concat(3, name='inception_3a_output')
             .conv(1, 1, 128, 1, 1, name='inception_3b_1x1'))

        (self.feed('inception_3a_output')
             .conv(1, 1, 128, 1, 1, name='inception_3b_3x3_reduce')
             .conv(3, 3, 192, 1, 1, name='inception_3b_3x3'))

        (self.feed('inception_3a_output')
             .conv(1, 1, 32, 1, 1, name='inception_3b_5x5_reduce')
             .conv(5, 5, 96, 1, 1, name='inception_3b_5x5'))

        (self.feed('inception_3a_output')
             .max_pool(3, 3, 1, 1, name='inception_3b_pool')
             .conv(1, 1, 64, 1, 1, name='inception_3b_pool_proj'))

        (self.feed('inception_3b_1x1',
                   'inception_3b_3x3',
                   'inception_3b_5x5',
                   'inception_3b_pool_proj')
             .concat(3, name='inception_3b_output')
             .max_pool(3, 3, 2, 2, name='pool3_3x3_s2')
             .conv(1, 1, 192, 1, 1, name='inception_4a_1x1'))

        (self.feed('pool3_3x3_s2')
             .conv(1, 1, 96, 1, 1, name='inception_4a_3x3_reduce')
             .conv(3, 3, 208, 1, 1, name='inception_4a_3x3'))

        (self.feed('pool3_3x3_s2')
             .conv(1, 1, 16, 1, 1, name='inception_4a_5x5_reduce')
             .conv(5, 5, 48, 1, 1, name='inception_4a_5x5'))

        (self.feed('pool3_3x3_s2')
             .max_pool(3, 3, 1, 1, name='inception_4a_pool')
             .conv(1, 1, 64, 1, 1, name='inception_4a_pool_proj'))

        (self.feed('inception_4a_1x1',
                   'inception_4a_3x3',
                   'inception_4a_5x5',
                   'inception_4a_pool_proj')
             .concat(3, name='inception_4a_output')
             .conv(1, 1, 160, 1, 1, name='inception_4b_1x1'))

        (self.feed('inception_4a_output')
             .conv(1, 1, 112, 1, 1, name='inception_4b_3x3_reduce')
             .conv(3, 3, 224, 1, 1, name='inception_4b_3x3'))

        (self.feed('inception_4a_output')
             .conv(1, 1, 24, 1, 1, name='inception_4b_5x5_reduce')
             .conv(5, 5, 64, 1, 1, name='inception_4b_5x5'))

        (self.feed('inception_4a_output')
             .max_pool(3, 3, 1, 1, name='inception_4b_pool')
             .conv(1, 1, 64, 1, 1, name='inception_4b_pool_proj'))

        (self.feed('inception_4b_1x1',
                   'inception_4b_3x3',
                   'inception_4b_5x5',
                   'inception_4b_pool_proj')
             .concat(3, name='inception_4b_output')
             .conv(1, 1, 128, 1, 1, name='inception_4c_1x1'))

        (self.feed('inception_4b_output')
             .conv(1, 1, 128, 1, 1, name='inception_4c_3x3_reduce')
             .conv(3, 3, 256, 1, 1, name='inception_4c_3x3'))

        (self.feed('inception_4b_output')
             .conv(1, 1, 24, 1, 1, name='inception_4c_5x5_reduce')
             .conv(5, 5, 64, 1, 1, name='inception_4c_5x5'))

        (self.feed('inception_4b_output')
             .max_pool(3, 3, 1, 1, name='inception_4c_pool')
             .conv(1, 1, 64, 1, 1, name='inception_4c_pool_proj'))

        (self.feed('inception_4c_1x1',
                   'inception_4c_3x3',
                   'inception_4c_5x5',
                   'inception_4c_pool_proj')
             .concat(3, name='inception_4c_output')
             .conv(1, 1, 112, 1, 1, name='inception_4d_1x1'))

        (self.feed('inception_4c_output')
             .conv(1, 1, 144, 1, 1, name='inception_4d_3x3_reduce')
             .conv(3, 3, 288, 1, 1, name='inception_4d_3x3'))

        (self.feed('inception_4c_output')
             .conv(1, 1, 32, 1, 1, name='inception_4d_5x5_reduce')
             .conv(5, 5, 64, 1, 1, name='inception_4d_5x5'))

        (self.feed('inception_4c_output')
             .max_pool(3, 3, 1, 1, name='inception_4d_pool')
             .conv(1, 1, 64, 1, 1, name='inception_4d_pool_proj'))

        (self.feed('inception_4d_1x1',
                   'inception_4d_3x3',
                   'inception_4d_5x5',
                   'inception_4d_pool_proj')
             .concat(3, name='inception_4d_output')
             .conv(1, 1, 256, 1, 1, name='inception_4e_1x1'))

        (self.feed('inception_4d_output')
             .conv(1, 1, 160, 1, 1, name='inception_4e_3x3_reduce')
             .conv(3, 3, 320, 1, 1, name='inception_4e_3x3'))

        (self.feed('inception_4d_output')
             .conv(1, 1, 32, 1, 1, name='inception_4e_5x5_reduce')
             .conv(5, 5, 128, 1, 1, name='inception_4e_5x5'))

        (self.feed('inception_4d_output')
             .max_pool(3, 3, 1, 1, name='inception_4e_pool')
             .conv(1, 1, 128, 1, 1, name='inception_4e_pool_proj'))

        (self.feed('inception_4e_1x1',
                   'inception_4e_3x3',
                   'inception_4e_5x5',
                   'inception_4e_pool_proj')
             .concat(3, name='inception_4e_output')
             .max_pool(3, 3, 2, 2, name='pool4_3x3_s2')
             .conv(1, 1, 256, 1, 1, name='inception_5a_1x1'))

        (self.feed('pool4_3x3_s2')
             .conv(1, 1, 160, 1, 1, name='inception_5a_3x3_reduce')
             .conv(3, 3, 320, 1, 1, name='inception_5a_3x3'))

        (self.feed('pool4_3x3_s2')
             .conv(1, 1, 32, 1, 1, name='inception_5a_5x5_reduce')
             .conv(5, 5, 128, 1, 1, name='inception_5a_5x5'))

        (self.feed('pool4_3x3_s2')
             .max_pool(3, 3, 1, 1, name='inception_5a_pool')
             .conv(1, 1, 128, 1, 1, name='inception_5a_pool_proj'))

        (self.feed('inception_5a_1x1',
                   'inception_5a_3x3',
                   'inception_5a_5x5',
                   'inception_5a_pool_proj')
             .concat(3, name='inception_5a_output')
             .conv(1, 1, 384, 1, 1, name='inception_5b_1x1'))

        (self.feed('inception_5a_output')
             .conv(1, 1, 192, 1, 1, name='inception_5b_3x3_reduce')
             .conv(3, 3, 384, 1, 1, name='inception_5b_3x3'))

        (self.feed('inception_5a_output')
             .conv(1, 1, 48, 1, 1, name='inception_5b_5x5_reduce')
             .conv(5, 5, 128, 1, 1, name='inception_5b_5x5'))

        (self.feed('inception_5a_output')
             .max_pool(3, 3, 1, 1, name='inception_5b_pool')
             .conv(1, 1, 128, 1, 1, name='inception_5b_pool_proj'))

        (self.feed('inception_5b_1x1',
                   'inception_5b_3x3',
                   'inception_5b_5x5',
                   'inception_5b_pool_proj')
             .concat(3, name='inception_5b_output')
             .avg_pool(7, 7, 1, 1, padding='VALID', name='pool5_7x7_s1')
             .fc(1000, relu=False, name='loss3_classifier')
             .softmax(name='prob'))

The code is in GitHub - ethereon/caffe-tensorflow: Caffe models in TensorFlow; the author has wrapped the basic operations, so once you understand the network structure, putting GoogLeNet together is easy. Once I have settled in at the new company, I will try writing the GoogLeNet network code on top of tflearn.

GoogLeNet on TensorFlow

For ease of implementation I rewrote GoogLeNet with tflearn. The only places the code differs from the Caffe model are some of the padding settings: the branches must line up at the concat inside each Inception block, and since changing them is fiddly (I don't know how to carry the exact pad values over from the Caffe prototxt), I set padding to 'same' throughout. The code is as follows:

# -*- coding: utf-8 -*-

""" GoogLeNet.
Applying GoogLeNet to Oxford's 17 Category Flower Dataset classification task.
References:
    - Szegedy, Christian, et al. Going deeper with convolutions.
    - 17 Category Flower Dataset. Maria-Elena Nilsback and Andrew Zisserman.
Links:
    - [GoogLeNet Paper](http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Szegedy_Going_Deeper_With_2015_CVPR_paper.pdf)
    - [Flower Dataset (17)](http://www.robots.ox.ac.uk/~vgg/data/flowers/17/)
"""

from __future__ import division, print_function, absolute_import

import tflearn
from tflearn.layers.core import input_data, dropout, fully_connected
from tflearn.layers.conv import conv_2d, max_pool_2d, avg_pool_2d
from tflearn.layers.normalization import local_response_normalization
from tflearn.layers.merge_ops import merge
from tflearn.layers.estimator import regression

import tflearn.datasets.oxflower17 as oxflower17
X, Y = oxflower17.load_data(one_hot=True, resize_pics=(227, 227))


network = input_data(shape=[None, 227, 227, 3])
conv1_7_7 = conv_2d(network, 64, 7, strides=2, activation='relu', name='conv1_7_7_s2')
pool1_3_3 = max_pool_2d(conv1_7_7, 3, strides=2)
pool1_3_3 = local_response_normalization(pool1_3_3)
conv2_3_3_reduce = conv_2d(pool1_3_3, 64, 1, activation='relu', name='conv2_3_3_reduce')
conv2_3_3 = conv_2d(conv2_3_3_reduce, 192, 3, activation='relu', name='conv2_3_3')
conv2_3_3 = local_response_normalization(conv2_3_3)
pool2_3_3 = max_pool_2d(conv2_3_3, kernel_size=3, strides=2, name='pool2_3_3_s2')

inception_3a_1_1 = conv_2d(pool2_3_3, 64, 1, activation='relu', name='inception_3a_1_1')
inception_3a_3_3_reduce = conv_2d(pool2_3_3, 96, 1, activation='relu', name='inception_3a_3_3_reduce')
inception_3a_3_3 = conv_2d(inception_3a_3_3_reduce, 128, filter_size=3, activation='relu', name='inception_3a_3_3')
inception_3a_5_5_reduce = conv_2d(pool2_3_3, 16, filter_size=1, activation='relu', name='inception_3a_5_5_reduce')
inception_3a_5_5 = conv_2d(inception_3a_5_5_reduce, 32, filter_size=5, activation='relu', name='inception_3a_5_5')
inception_3a_pool = max_pool_2d(pool2_3_3, kernel_size=3, strides=1)
inception_3a_pool_1_1 = conv_2d(inception_3a_pool, 32, filter_size=1, activation='relu', name='inception_3a_pool_1_1')

# merge the inception_3a_*
inception_3a_output = merge([inception_3a_1_1, inception_3a_3_3, inception_3a_5_5, inception_3a_pool_1_1], mode='concat', axis=3)

inception_3b_1_1 = conv_2d(inception_3a_output, 128, filter_size=1, activation='relu', name='inception_3b_1_1')
inception_3b_3_3_reduce = conv_2d(inception_3a_output, 128, filter_size=1, activation='relu', name='inception_3b_3_3_reduce')
inception_3b_3_3 = conv_2d(inception_3b_3_3_reduce, 192, filter_size=3, activation='relu', name='inception_3b_3_3')
inception_3b_5_5_reduce = conv_2d(inception_3a_output, 32, filter_size=1, activation='relu', name='inception_3b_5_5_reduce')
inception_3b_5_5 = conv_2d(inception_3b_5_5_reduce, 96, filter_size=5, activation='relu', name='inception_3b_5_5')
inception_3b_pool = max_pool_2d(inception_3a_output, kernel_size=3, strides=1, name='inception_3b_pool')
inception_3b_pool_1_1 = conv_2d(inception_3b_pool, 64, filter_size=1, activation='relu', name='inception_3b_pool_1_1')

# merge the inception_3b_*
inception_3b_output = merge([inception_3b_1_1, inception_3b_3_3, inception_3b_5_5, inception_3b_pool_1_1], mode='concat', axis=3, name='inception_3b_output')

pool3_3_3 = max_pool_2d(inception_3b_output, kernel_size=3, strides=2, name='pool3_3_3')
inception_4a_1_1 = conv_2d(pool3_3_3, 192, filter_size=1, activation='relu', name='inception_4a_1_1')
inception_4a_3_3_reduce = conv_2d(pool3_3_3, 96, filter_size=1, activation='relu', name='inception_4a_3_3_reduce')
inception_4a_3_3 = conv_2d(inception_4a_3_3_reduce, 208, filter_size=3, activation='relu', name='inception_4a_3_3')
inception_4a_5_5_reduce = conv_2d(pool3_3_3, 16, filter_size=1, activation='relu', name='inception_4a_5_5_reduce')
inception_4a_5_5 = conv_2d(inception_4a_5_5_reduce, 48, filter_size=5, activation='relu', name='inception_4a_5_5')
inception_4a_pool = max_pool_2d(pool3_3_3, kernel_size=3, strides=1, name='inception_4a_pool')
inception_4a_pool_1_1 = conv_2d(inception_4a_pool, 64, filter_size=1, activation='relu', name='inception_4a_pool_1_1')

inception_4a_output = merge([inception_4a_1_1, inception_4a_3_3, inception_4a_5_5, inception_4a_pool_1_1], mode='concat', axis=3, name='inception_4a_output')

inception_4b_1_1 = conv_2d(inception_4a_output, 160, filter_size=1, activation='relu', name='inception_4b_1_1')
inception_4b_3_3_reduce = conv_2d(inception_4a_output, 112, filter_size=1, activation='relu', name='inception_4b_3_3_reduce')
inception_4b_3_3 = conv_2d(inception_4b_3_3_reduce, 224, filter_size=3, activation='relu', name='inception_4b_3_3')
inception_4b_5_5_reduce = conv_2d(inception_4a_output, 24, filter_size=1, activation='relu', name='inception_4b_5_5_reduce')
inception_4b_5_5 = conv_2d(inception_4b_5_5_reduce, 64, filter_size=5, activation='relu', name='inception_4b_5_5')
inception_4b_pool = max_pool_2d(inception_4a_output, kernel_size=3, strides=1, name='inception_4b_pool')
inception_4b_pool_1_1 = conv_2d(inception_4b_pool, 64, filter_size=1, activation='relu', name='inception_4b_pool_1_1')

inception_4b_output = merge([inception_4b_1_1, inception_4b_3_3, inception_4b_5_5, inception_4b_pool_1_1], mode='concat', axis=3, name='inception_4b_output')

inception_4c_1_1 = conv_2d(inception_4b_output, 128, filter_size=1, activation='relu', name='inception_4c_1_1')
inception_4c_3_3_reduce = conv_2d(inception_4b_output, 128, filter_size=1, activation='relu', name='inception_4c_3_3_reduce')
inception_4c_3_3 = conv_2d(inception_4c_3_3_reduce, 256, filter_size=3, activation='relu', name='inception_4c_3_3')
inception_4c_5_5_reduce = conv_2d(inception_4b_output, 24, filter_size=1, activation='relu', name='inception_4c_5_5_reduce')
inception_4c_5_5 = conv_2d(inception_4c_5_5_reduce, 64, filter_size=5, activation='relu', name='inception_4c_5_5')
inception_4c_pool = max_pool_2d(inception_4b_output, kernel_size=3, strides=1)
inception_4c_pool_1_1 = conv_2d(inception_4c_pool, 64, filter_size=1, activation='relu', name='inception_4c_pool_1_1')

inception_4c_output = merge([inception_4c_1_1, inception_4c_3_3, inception_4c_5_5, inception_4c_pool_1_1], mode='concat', axis=3, name='inception_4c_output')

inception_4d_1_1 = conv_2d(inception_4c_output, 112, filter_size=1, activation='relu', name='inception_4d_1_1')
inception_4d_3_3_reduce = conv_2d(inception_4c_output, 144, filter_size=1, activation='relu', name='inception_4d_3_3_reduce')
inception_4d_3_3 = conv_2d(inception_4d_3_3_reduce, 288, filter_size=3, activation='relu', name='inception_4d_3_3')
inception_4d_5_5_reduce = conv_2d(inception_4c_output, 32, filter_size=1, activation='relu', name='inception_4d_5_5_reduce')
inception_4d_5_5 = conv_2d(inception_4d_5_5_reduce, 64, filter_size=5, activation='relu', name='inception_4d_5_5')
inception_4d_pool = max_pool_2d(inception_4c_output, kernel_size=3, strides=1, name='inception_4d_pool')
inception_4d_pool_1_1 = conv_2d(inception_4d_pool, 64, filter_size=1, activation='relu', name='inception_4d_pool_1_1')

inception_4d_output = merge([inception_4d_1_1, inception_4d_3_3, inception_4d_5_5, inception_4d_pool_1_1], mode='concat', axis=3, name='inception_4d_output')

inception_4e_1_1 = conv_2d(inception_4d_output, 256, filter_size=1, activation='relu', name='inception_4e_1_1')
inception_4e_3_3_reduce = conv_2d(inception_4d_output, 160, filter_size=1, activation='relu', name='inception_4e_3_3_reduce')
inception_4e_3_3 = conv_2d(inception_4e_3_3_reduce, 320, filter_size=3, activation='relu', name='inception_4e_3_3')
inception_4e_5_5_reduce = conv_2d(inception_4d_output, 32, filter_size=1, activation='relu', name='inception_4e_5_5_reduce')
inception_4e_5_5 = conv_2d(inception_4e_5_5_reduce, 128, filter_size=5, activation='relu', name='inception_4e_5_5')
inception_4e_pool = max_pool_2d(inception_4d_output, kernel_size=3, strides=1, name='inception_4e_pool')
inception_4e_pool_1_1 = conv_2d(inception_4e_pool, 128, filter_size=1, activation='relu', name='inception_4e_pool_1_1')

inception_4e_output = merge([inception_4e_1_1, inception_4e_3_3, inception_4e_5_5, inception_4e_pool_1_1], axis=3, mode='concat')

pool4_3_3 = max_pool_2d(inception_4e_output, kernel_size=3, strides=2, name='pool4_3_3')

inception_5a_1_1 = conv_2d(pool4_3_3, 256, filter_size=1, activation='relu', name='inception_5a_1_1')
inception_5a_3_3_reduce = conv_2d(pool4_3_3, 160, filter_size=1, activation='relu', name='inception_5a_3_3_reduce')
inception_5a_3_3 = conv_2d(inception_5a_3_3_reduce, 320, filter_size=3, activation='relu', name='inception_5a_3_3')
inception_5a_5_5_reduce = conv_2d(pool4_3_3, 32, filter_size=1, activation='relu', name='inception_5a_5_5_reduce')
inception_5a_5_5 = conv_2d(inception_5a_5_5_reduce, 128, filter_size=5, activation='relu', name='inception_5a_5_5')
inception_5a_pool = max_pool_2d(pool4_3_3, kernel_size=3, strides=1, name='inception_5a_pool')
inception_5a_pool_1_1 = conv_2d(inception_5a_pool, 128, filter_size=1, activation='relu', name='inception_5a_pool_1_1')

inception_5a_output = merge([inception_5a_1_1, inception_5a_3_3, inception_5a_5_5, inception_5a_pool_1_1], axis=3, mode='concat')

inception_5b_1_1 = conv_2d(inception_5a_output, 384, filter_size=1, activation='relu', name='inception_5b_1_1')
inception_5b_3_3_reduce = conv_2d(inception_5a_output, 192, filter_size=1, activation='relu', name='inception_5b_3_3_reduce')
inception_5b_3_3 = conv_2d(inception_5b_3_3_reduce, 384, filter_size=3, activation='relu', name='inception_5b_3_3')
inception_5b_5_5_reduce = conv_2d(inception_5a_output, 48, filter_size=1, activation='relu', name='inception_5b_5_5_reduce')
inception_5b_5_5 = conv_2d(inception_5b_5_5_reduce, 128, filter_size=5, activation='relu', name='inception_5b_5_5')
inception_5b_pool = max_pool_2d(inception_5a_output, kernel_size=3, strides=1, name='inception_5b_pool')
inception_5b_pool_1_1 = conv_2d(inception_5b_pool, 128, filter_size=1, activation='relu', name='inception_5b_pool_1_1')
inception_5b_output = merge([inception_5b_1_1, inception_5b_3_3, inception_5b_5_5, inception_5b_pool_1_1], axis=3, mode='concat')

pool5_7_7 = avg_pool_2d(inception_5b_output, kernel_size=7, strides=1)
pool5_7_7 = dropout(pool5_7_7, 0.4)
loss = fully_connected(pool5_7_7, 17, activation='softmax')
network = regression(loss, optimizer='momentum',
                     loss='categorical_crossentropy',
                     learning_rate=0.001)
model = tflearn.DNN(network, checkpoint_path='model_googlenet',
                    max_checkpoints=1, tensorboard_verbose=2)
model.fit(X, Y, n_epoch=1000, validation_set=0.1, shuffle=True,
          show_metric=True, batch_size=64, snapshot_step=200,
          snapshot_epoch=False, run_id='googlenet_oxflowers17')

If you're interested, take a look at this part of the Caffe model prototxt and help check it for problems. I have submitted the code to the official tflearn repository (add GoogLeNet(Inception) in Example); if you have TensorFlow, install tflearn and see whether you can spot anything wrong. Since I don't have a GPU machine it runs slowly, and the TensorBoard plots are less clear-cut than AlexNet's were (mainly because I haven't run as many epochs; the host also ran out of disk space while writing, embarrassingly, so I restarted from a restored checkpoint, and the TensorBoard graphs seem a little off as a result, loading differently each time). Still, the basic logs show the model converging step by step; the plots are attached below all the same.

The network graph can't be downloaded directly from TensorBoard either, so I pieced it together from screenshots taken step by step (bear with it, it's clumsy):

For convenience, here are some of the run logs I saved; the convergence shows up clearly:

Related reading:

Machine Learning Advanced Notes, Part 3 | A Deep Dive into AlexNet

Machine Learning Advanced Notes, Part 2 | A Deep Dive into Neural Style

Machine Learning Advanced Notes, Part 1 | Installing and Getting Started with TensorFlow

This article is provided by the UCloud Kernel and Virtualization R&D Team.

About the author:

Burness (@段石石) is a deep learning R&D engineer at UCloud's Platform R&D Center and a tflearn contributor. He has worked on e-commerce recommendation and precision-marketing algorithms, and focuses on distributed deep learning frameworks and computer vision research. He enjoys tinkering with algorithms and open source projects, occasionally dabbles in data competitions, and is a geek at heart, fascinated by new technologies and skills.

You can find him on GitHub: hacker.duanshishi.com/

The UCloud official account shares exclusive technical insights and industry news from the cloud computing world, along with anything else you might want to know about it.

Questions welcome, and do follow us o(*////▽////*)q~

That's all.


