A Newbie Learns tensorflow, Part 3

04-29

checkpoint

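By now you can get hands-on with the code and understand how to build a network, train it, and evaluate and test it. Models covered so far: linear regression, and softmax applied to multi-class classification.

Next up: implementing a convolutional neural network (commonly used in image processing) with the GPU version of tensorflow.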

outline

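`GPU version of tensorflow

`Convolutional neural network

`GPU version of tensorflow

(Note that the tensorflow Chinese community docs lag behind the English ones.) tensorflow can now be installed directly with pip, and it is fast.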

pip install --upgrade pip

CPU version

pip install tensorflow # Python 2.7; CPU support (no GPU support)

GPU version

pip install tensorflow-gpu # Python 2.7; GPU support

Of course, you can also use the Tsinghua mirror

pip install -i https://pypi.tuna.tsinghua.edu.cn/simple tensorflow

#Test whether the CPU version is installed correctly

>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
2018-04-11 18:40:20.509133: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
>>> sess.run(hello)
'Hello, TensorFlow!'
>>> a = tf.constant(10)
>>> b = tf.constant(20)
>>> sess.run(a+b)
30

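#Before installing the GPU version you need to install cuda and download the cudnn library. If you are not sure whether they are already installed, try installing the GPU version first: if they are missing, import will fail.

In that case tensorflow cannot find the library files it needs and reports an error like this one: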

ImportError: libcublas.so.8.0: cannot open shared object file: No such file or directory
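The version suffix in the missing library's file name tells you which CUDA release this tensorflow build is looking for. A minimal sketch that pulls it out of the error text; the helper name `missing_cuda_lib` is hypothetical and not part of tensorflow:

```python
import re

def missing_cuda_lib(error_message):
    """Return (library, version) parsed from an ImportError message, or None."""
    match = re.search(r"(lib\w+)\.so\.(\d+(?:\.\d+)*)", error_message)
    if match is None:
        return None
    return match.group(1), match.group(2)

msg = ("ImportError: libcublas.so.8.0: cannot open shared object file: "
       "No such file or directory")
print(missing_cuda_lib(msg))
```

For the message above this reports libcublas at version 8.0, i.e. that build wants CUDA 8.0.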

Process

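My environment: tensorflow 1.7, cuda 9.0, cudnn 7.0

1. Install cuda. tensorflow 1.7 needs cuda 9.0 (whatever version the import error names is the one you are missing). Download the matching cuda run file from the NVIDIA website. The file name below is the 9.1 installer; substitute the name of the 9.0 run file you downloaded.

chmod +x cuda_9.1.85_387.26_linux.run
sudo sh cuda_9.1.85_387.26_linux.run

Do not install the graphics driver that the installer offers. Its compatibility is poor and it easily breaks the existing driver, which leaves you stuck in a login loop (yes, a login loop, really impressive, smile; I gave up struggling and reinstalled the system).

The default install directory is /usr/local/cuda-9.0. The installer also offers to create a symbolic link at /usr/local/cuda, which you can decline.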

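2. Download the matching version of cudnn from the cudnn downloads page; the entry looks like "cuDNN v7.1.2 Library for Linux".

After extracting it, copy the files into the corresponding folders of the cuda install directory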

tar xvzf xxxx.tgz
cp Downloads/cuda/include/cudnn.h cuda-9.0/include/
cp Downloads/cuda/lib64/libcudnn* cuda-9.0/lib64/

3. Set the environment variables

Use vim to add the export statements below to ~/.bashrc, save, then run source ~/.bashrc

export PYTHONPATH=$PYTHONPATH:/home/ceo1207/cuda-9.0/lib64

export PATH="$PATH:/home/ceo1207/cuda-9.0/bin"
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/home/ceo1207/cuda-9.0/lib64:/home/ceo1207/cuda-9.0/extras/CUPTI/lib64"
export LIBRARY_PATH=$LIBRARY_PATH:/home/ceo1207/cuda-9.0/lib64
export CUDA_HOME=/home/ceo1207/cuda-9.0

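4. At this point, import tensorflow usually works. But, and this one really annoyed me because it cost me most of a day: I use PyCharm as my Python IDE, and when it is launched from the desktop shortcut it does not inherit bash's variables. The environment variables added to .bashrc therefore never take effect, and the import still cannot find the cuda libraries. Only launching it from bash with pycharmdir/bin/pycharm.sh inherits the environment correctly.

Only then does this error go away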

ImportError: libcublas.so.8.0: cannot open shared object file: No such file or directory
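To see what the running interpreter actually inherited (for example from inside the IDE), you can look for the libraries along LD_LIBRARY_PATH yourself. This is only a rough sketch: the function name `find_in_ld_library_path` is made up for illustration, and the real dynamic loader also searches other locations such as the ldconfig cache.

```python
import os

def find_in_ld_library_path(prefix, env=None):
    """List files whose names start with `prefix` in each LD_LIBRARY_PATH directory."""
    env = os.environ if env is None else env
    hits = []
    for directory in env.get("LD_LIBRARY_PATH", "").split(os.pathsep):
        # skip empty entries and directories that do not exist
        if directory and os.path.isdir(directory):
            for name in sorted(os.listdir(directory)):
                if name.startswith(prefix):
                    hits.append(os.path.join(directory, name))
    return hits

for lib in ("libcublas", "libcudnn"):
    found = find_in_ld_library_path(lib)
    print("%s: %s" % (lib, found if found else "NOT FOUND"))
```

If this prints NOT FOUND inside the IDE but finds the files in a terminal, the IDE process did not inherit the exports from .bashrc.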

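Also, if running tf reports the error below, the cudnn version is wrong: the installed cudnn must match the version tensorflow was compiled against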

Loaded runtime CuDNN library: 7102 (compatibility version 7100) but source was compiled with 7005 (compatibility version 7000)
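The two integers in that message are packed version numbers. Assuming the encoding used by cuDNN 7's header (major*1000 + minor*100 + patch), a small sketch decodes them; `decode_cudnn_version` is an illustrative helper, not a cuDNN or tensorflow function:

```python
def decode_cudnn_version(code):
    """Split cuDNN's integer version, assumed to be major*1000 + minor*100 + patch."""
    major, rest = divmod(code, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch

loaded = decode_cudnn_version(7102)    # library found at runtime
compiled = decode_cudnn_version(7005)  # version tensorflow was built against
print("loaded %d.%d.%d" % loaded)
print("compiled against %d.%d.%d" % compiled)
```

So the runtime library is 7.1.2 while this tensorflow build expects 7.0.5, which is why swapping in a cuDNN 7.0.x download resolves the error.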

some notes

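#How to check whether tensorflow is using GPU acceleration?

Look at the messages printed to the console at run time

successfully opened CUDA library libcublas.so locally (the GPU version is in use)

Alternatively, turn on device placement logging when creating the session: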

tf.Session(config=tf.ConfigProto(log_device_placement=True))

The log then shows whether each individual op runs on the cpu or on a gpu

with device labels such as cpu:0 or gpu:0, 1, 2

2018-04-11 09:50:12.907953: I tensorflow/core/common_runtime/placer.cc:884] mul: (Mul)/job:localhost/replica:0/task:0/device:CPU:0
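When a graph has many ops, it can help to pull the op name and device out of these lines automatically. A sketch that assumes the log format shown above (other tensorflow versions may format the line differently); `parse_placement` is a hypothetical helper:

```python
import re

# "] <op name>: (<op type>)...<anything>.../device:<TYPE>:<index>"
PLACEMENT = re.compile(
    r"\] (?P<op>[^:\s]+): \((?P<type>\w+)\).*?/device:(?P<device>\w+:\d+)")

def parse_placement(line):
    """Return (op name, op type, device) from a placement log line, or None."""
    m = PLACEMENT.search(line)
    if m is None:
        return None
    return m.group("op"), m.group("type"), m.group("device")

line = ("2018-04-11 09:50:12.907953: I tensorflow/core/common_runtime/"
        "placer.cc:884] mul: (Mul)/job:localhost/replica:0/task:0/device:CPU:0")
print(parse_placement(line))
```

Any op whose device starts with GPU is running on the graphics card.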

#Check the tf install path and version number

tf.__path__

tf.__version__

#How to uninstall cuda

cd /usr/local/cuda-8.0/bin

Run the uninstall script in that directory

#How to uninstall tf

pip uninstall tensorflow

pip uninstall tensorflow-gpu

Install a specific version of tf

pip install tensorflow==1.4

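pip options:

-U (upgrade)

--user installs into the user directory, so pip install works without root permission

#How to install a .deb

dpkg -i <deb file name>

#How to install a .whl

pip install xx.whl (if a lower version is already installed, add -U)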

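`Convolutional neural network

With the GPU version finally sorted out, it is time for the long-promised convolutional neural network.

This should be familiar territory by now. With the GPU, the number of training iterations can go up to the order of 100,000. Compared with the earlier networks, the new part is building the convolution and pooling layers.

1. Define the input and the ground truth

2. Define the network structure

3. Define the loss and the optimization method

4. Evaluate and test

5. Before running, remember that Variables need to be initialized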
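For step 3, the listing that follows uses the cross-entropy loss -sum(truth * log(output)) on top of a softmax. As a plain-Python sketch of that formula for a single example (no tensorflow involved; the logits are made-up numbers and the function names are for illustration only):

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]  # subtract the max for stability
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(truth, probs):
    """-sum(truth * log(probs)) for one one-hot labelled example."""
    return -sum(t * math.log(p) for t, p in zip(truth, probs) if t > 0)

probs = softmax([2.0, 1.0, 0.1])              # hypothetical network scores
loss = cross_entropy([1.0, 0.0, 0.0], probs)  # true class is index 0
print("loss = %.4f" % loss)
```

The loss is small when the probability assigned to the true class is close to 1 and grows without bound as that probability approaches 0.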

First version (no dropout, learning rate 0.01):

import tensorflow as tf
import input_data

# use conv layers to recognize hand-written digits
def weightVariable(shape):
    init = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(init)

def biasVariable(shape):
    init = tf.constant(0.1, shape=shape)
    return tf.Variable(init)

input = tf.placeholder(tf.float32, shape=[None, 784])
truth = tf.placeholder(tf.float32, shape=[None, 10])

# set up the network
# conv1 variable
filter1 = weightVariable([5,5,1,32])
# batchsize height width channels
inputImage = tf.reshape(input, [-1, 28, 28, 1])
conv1 = tf.nn.conv2d(inputImage, filter1, strides=[1,1,1,1], padding="SAME")
conv1 = tf.nn.relu(conv1+biasVariable([32]))
pool1 = tf.nn.max_pool(conv1, ksize=[1,2,2,1], strides=[1,2,2,1], padding="SAME")

# conv2 variable
filter2 = weightVariable([5,5,32,64])
conv2 = tf.nn.conv2d(pool1, filter2, strides=[1,1,1,1], padding="SAME")
conv2 = tf.nn.relu(conv2+biasVariable([64]))
pool2 = tf.nn.max_pool(conv2, ksize=[1,2,2,1], strides=[1,2,2,1], padding="SAME")

# fully connected
pool2Flat = tf.reshape(pool2, [-1, 7*7*64])
w1 = weightVariable([7*7*64,1024])
b1 = biasVariable([1024])
fc1 = tf.nn.relu(tf.matmul(pool2Flat,w1)+b1)
w2 = weightVariable([1024,10])
b2 = biasVariable([10])
# output layer (ReLU before softmax, as in the original run)
fc2 = tf.nn.relu(tf.matmul(fc1,w2)+b2)
output = tf.nn.softmax(fc2)

# train
loss = -tf.reduce_sum(truth*tf.log(output))
train = tf.train.GradientDescentOptimizer(0.01).minimize(loss)

# test (accuracy is measured on the current training batch)
result = tf.equal(tf.argmax(truth,1), tf.argmax(output,1))
accuracy = tf.reduce_mean(tf.cast(result,tf.float32))

sess = tf.InteractiveSession()
init = tf.global_variables_initializer()
sess.run(init)

mnist = input_data.read_data_sets("data/", one_hot=True)
for i in range(100000):
    batch = mnist.train.next_batch(50)
    sess.run(train, feed_dict={input:batch[0], truth:batch[1]})
    if i%100 == 0:
        print(sess.run(accuracy, feed_dict={input:batch[0], truth:batch[1]}))
sess.close()

Version with dropout (learning rate 1e-4):

import tensorflow as tf
import input_data

# use conv layers to recognize hand-written digits
def weightVariable(shape):
    init = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(init)

def biasVariable(shape):
    init = tf.constant(0.1, shape=shape)
    return tf.Variable(init)

input = tf.placeholder(tf.float32, shape=[None, 784])
truth = tf.placeholder(tf.float32, shape=[None, 10])

# set up the network
# conv1 variable
filter1 = weightVariable([5,5,1,32])
# batchsize height width channels
inputImage = tf.reshape(input, [-1, 28, 28, 1])
conv1 = tf.nn.conv2d(inputImage, filter1, strides=[1,1,1,1], padding="SAME")
conv1 = tf.nn.relu(conv1+biasVariable([32]))
pool1 = tf.nn.max_pool(conv1, ksize=[1,2,2,1], strides=[1,2,2,1], padding="SAME")

# conv2 variable
filter2 = weightVariable([5,5,32,64])
conv2 = tf.nn.conv2d(pool1, filter2, strides=[1,1,1,1], padding="SAME")
conv2 = tf.nn.relu(conv2+biasVariable([64]))
pool2 = tf.nn.max_pool(conv2, ksize=[1,2,2,1], strides=[1,2,2,1], padding="SAME")

# fully connected
pool2Flat = tf.reshape(pool2, [-1, 7*7*64])
w1 = weightVariable([7*7*64,1024])
b1 = biasVariable([1024])
fc1 = tf.nn.relu(tf.matmul(pool2Flat,w1)+b1)
# dropout: dropPlace is the keep probability, fed at run time
dropPlace = tf.placeholder(tf.float32)
fc1Drop = tf.nn.dropout(fc1, dropPlace)
w2 = weightVariable([1024,10])
b2 = biasVariable([10])
# output layer (ReLU before softmax, as in the original run)
fc2 = tf.nn.relu(tf.matmul(fc1Drop,w2)+b2)
output = tf.nn.softmax(fc2)

# train
loss = -tf.reduce_sum(truth*tf.log(output))
train = tf.train.GradientDescentOptimizer(1e-4).minimize(loss)

# test (accuracy is measured on the current training batch)
result = tf.equal(tf.argmax(truth,1), tf.argmax(output,1))
accuracy = tf.reduce_mean(tf.cast(result,tf.float32))

sess = tf.InteractiveSession()
init = tf.global_variables_initializer()
sess.run(init)

mnist = input_data.read_data_sets("data/", one_hot=True)
for i in range(100000):
    batch = mnist.train.next_batch(50)
    # keep half of the fc1 activations while training
    sess.run(train, feed_dict={input:batch[0], truth:batch[1], dropPlace:0.5})
    if i%100 == 0:
        # keep everything while evaluating
        print(sess.run(accuracy, feed_dict={input:batch[0], truth:batch[1], dropPlace:1.0}))
sess.close()
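The 7*7*64 in the reshape of pool2 comes from following the image size through the layers: with padding="SAME" the output size is ceil(input / stride), so only the two stride-2 pooling layers shrink the 28x28 image. A small sketch of that arithmetic (plain Python; `same_output_size` is an illustrative helper, not a tensorflow function):

```python
import math

def same_output_size(size, stride):
    """Spatial output size of a conv/pool layer that uses padding="SAME"."""
    return int(math.ceil(size / float(stride)))

size = 28  # MNIST images are 28x28
for name, stride in [("conv1", 1), ("pool1", 2), ("conv2", 1), ("pool2", 2)]:
    size = same_output_size(size, stride)
    print("%s -> %dx%d" % (name, size, size))

flat_features = size * size * 64  # 64 channels come out of conv2
print("flattened length: %d" % flat_features)
```

If you change the image size, the strides, or the number of filters, the first dimension of w1 has to change accordingly.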

#note

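· Choosing a suitable step size (learning rate) matters: too large and it easily jumps over the local optimum, too small and convergence is slow and training can get stuck in a local optimum

At first I set it to 0.01 and ran 100,000 iterations, and accuracy stayed at 10-20% the whole time; only after setting it to 1e-4 did it behave normally

· If the network is bad, more iterations will not help

· Before adding Dropout, evaluation accuracy was only about 60-80%; after adding it, accuracy jumped straight to 99%. Dropout helps a lot in preventing overfitting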
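For intuition about what the dropout layer does with its keep probability (0.5 during training, 1.0 during evaluation in the code above), here is a plain-Python sketch of inverted dropout. It imitates the behaviour of keeping each activation with probability keep_prob and scaling the survivors by 1/keep_prob; it is not tensorflow's implementation, and the all-ones input is made up:

```python
import random

def dropout(values, keep_prob, rng=random):
    """Keep each value with probability keep_prob and scale survivors by
    1/keep_prob, so the expected activation stays the same."""
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]

rng = random.Random(0)
activations = [1.0] * 10000
dropped = dropout(activations, 0.5, rng)  # training: about half become 0, the rest 2.0
kept = dropout(activations, 1.0, rng)     # evaluation: nothing is dropped
print("mean while training: %.3f" % (sum(dropped) / len(dropped)))
print("unchanged at keep_prob 1.0: %s" % (kept == activations))
```

Because a different random half of the units is silenced on every step, the network cannot lean on any single unit, which is the usual explanation for the reduced overfitting.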

Appendix: final version