GTX 1070 + CUDA 8.0 + cuDNN 5.1 + TensorFlow + Ubuntu 16.04: Dual-Disk Installation Notes

02-09

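The servers in our lab have no graphics card, so I had to install TensorFlow on my own computer.

As of September 28, 2016, the pip packages and other prebuilt TensorFlow releases did not yet support CUDA 8.0, so building from source was the only option.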

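Installing Ubuntu 16.04 as a dual-boot system

I installed Ubuntu from a USB drive. There are many tutorials online to follow, for example "Installing Windows 10 and Ubuntu 16.04 as a Dual-Boot System". One thing to watch out for: when choosing partitions, besides the regular partitions you create and the home partition, you also need to create an EFI partition and select it as the bootloader location. That is what makes dual boot work. After installation, switch the software sources to TUNA.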
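An EFI partition only helps if the machine was actually booted in UEFI mode. A minimal sketch for checking this from the live USB session or the installed system, assuming Python 3 and the usual Linux convention that /sys/firmware/efi exists only under a UEFI boot. The function name is hypothetical, not part of any tool mentioned above.

```python
import os


def booted_in_uefi(efi_dir="/sys/firmware/efi"):
    """Return True if the running Linux system appears to be booted via UEFI.

    Assumption: the kernel exposes the directory /sys/firmware/efi only
    when the system was started in UEFI mode (not legacy BIOS mode).
    """
    return os.path.isdir(efi_dir)
```

Calling booted_in_uefi() with no arguments checks the real system. The path parameter exists only so the check can be tried against another directory.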

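Installing CUDA 8.0

I mainly followed the Zhihu column article "Deep Learning Development Environment Setup: Ubuntu 16.04 + Nvidia GTX 1080 + CUDA 8.0" (靈魂機器).

Note that when installing the NVIDIA driver, the commands should be: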

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get -qy update
sudo apt-get -qy install nvidia-370
sudo apt-get -qy install mesa-common-dev
sudo apt-get -qy install freeglut3-dev
sudo reboot
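After the reboot it is worth confirming that the driver actually loaded. The sketch below (Python 3) asks nvidia-smi for the driver version and GPU name using its CSV query mode. The helper names are hypothetical, and the sample line in the usage note is illustrative rather than captured output.

```python
import shutil
import subprocess


def parse_gpu_line(line):
    """Split one CSV line from nvidia-smi into (driver_version, gpu_name)."""
    driver, name = [field.strip() for field in line.split(",", 1)]
    return driver, name


def query_gpus():
    """Return a list of (driver_version, gpu_name) tuples.

    Returns an empty list when nvidia-smi is not on PATH. If the tool is
    installed but the driver is not loaded, nvidia-smi exits non-zero and
    subprocess.CalledProcessError is raised.
    """
    if shutil.which("nvidia-smi") is None:
        return []
    output = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=driver_version,name",
         "--format=csv,noheader"],
        universal_newlines=True,
    )
    return [parse_gpu_line(line) for line in output.splitlines() if line.strip()]
```

For a line such as "370.28, GeForce GTX 1070", parse_gpu_line gives the pair ("370.28", "GeForce GTX 1070"). An empty list from query_gpus() suggests the driver packages did not install.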

Installing cuDNN 5.1

Download the matching cuDNN release from the official site, https://developer.nvidia.com/cudnn. I chose version 5.1.

tar -xzvf cudnn-8.0-linux-x64-v5.1.tgz
cd cuda
sudo cp lib64/libcudnn* /usr/local/cuda/lib64/
sudo cp include/cudnn.h /usr/local/cuda/include/

Make sure to pick the archive that matches your own CUDA and cuDNN versions.
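To double-check which cuDNN version ended up under /usr/local/cuda, the version macros in the copied header can be read back. This is a minimal sketch, assuming the header defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL (as the v5.x cudnn.h does). The function name is hypothetical.

```python
import re


def cudnn_version(header_text):
    """Extract (major, minor, patchlevel) from the text of cudnn.h.

    Assumption: the header contains lines of the form
    '#define CUDNN_MAJOR 5'. Raises ValueError if a macro is missing.
    """
    parts = []
    for key in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        pattern = r"^\s*#define\s+" + key + r"\s+(\d+)"
        match = re.search(pattern, header_text, re.MULTILINE)
        if match is None:
            raise ValueError("macro not found: " + key)
        parts.append(int(match.group(1)))
    return tuple(parts)
```

Typical use would be cudnn_version(open("/usr/local/cuda/include/cudnn.h").read()). The resulting three numbers are what the TensorFlow configure script asks for later.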

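Installing TensorFlow

Installing TensorFlow took the most time and had the most pitfalls. Because of the Great Firewall, if your connection to GitHub is slow, I recommend using a VPN or a proxy. Before installing TensorFlow I installed Anaconda and switched to Tsinghua University's conda and PyPI mirrors. For the latest instructions, see: Tsinghua University TUNA mirrors.

Installing Anaconda

Anaconda is a Python distribution for scientific computing. It supports Linux, Mac, and Windows, and bundles many popular Python packages for scientific computing and data analysis.

Anaconda installers can be downloaded from Index of /anaconda/archive/.

To use the TUNA mirror, run the following commands: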

conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
conda config --set show_channel_urls yes

To change the pip index, edit ~/.pip/pip.conf (create the file if it does not exist):

[global]
index-url = https://pypi.tuna.tsinghua.edu.cn/simple
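If you prefer to script this step, the same file can be generated with Python's standard configparser module. This is a sketch under the assumption of Python 3. Both helper names are hypothetical, and write_pip_conf overwrites any existing file at the target path.

```python
import configparser
import io
import os


def render_pip_conf(index_url):
    """Return pip.conf text with a [global] section pointing at index_url."""
    parser = configparser.ConfigParser(interpolation=None)
    parser["global"] = {"index-url": index_url}
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


def write_pip_conf(index_url, path=None):
    """Write the rendered config to path (default: ~/.pip/pip.conf).

    Note: this replaces the file if it already exists.
    """
    if path is None:
        path = os.path.expanduser("~/.pip/pip.conf")
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as handle:
        handle.write(render_pip_conf(index_url))
    return path
```

Calling write_pip_conf("https://pypi.tuna.tsinghua.edu.cn/simple") creates the directory if needed and writes the same two settings shown above.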

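Installing TensorFlow from source

For the TensorFlow installation process, see the Chinese edition of the official TensorFlow documentation.

English version: tensorflow/os_setup.md at master

The documentation changes frequently, so I recommend following the English version. I installed by compiling directly from source.

First, clone the source to your machine (this step was painfully slow):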

git clone --recurse-submodules https://github.com/tensorflow/tensorflow

Install a JDK. I started with OpenJDK and ran into errors later on, so I recommend installing the standard Oracle JDK:

sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer

Install the remaining dependencies:

sudo apt-get install pkg-config zip g++ zlib1g-dev unzip

Then download the latest Bazel installer from Releases · bazelbuild/bazel · GitHub and run it:

chmod +x bazel-version-installer-os.sh
./bazel-version-installer-os.sh --user

Install TensorFlow's dependencies. Since Anaconda is already installed here, I used pip to update them:

pip install --upgrade swig numpy wheel

Configuring the build

Change into the downloaded tensorflow directory and run the configure script:

$ ./configure
Please specify the location of python. [Default is /usr/bin/python]:
Do you wish to build TensorFlow with Google Cloud Platform support? [y/N] N
No Google Cloud Platform support will be enabled for TensorFlow
Do you wish to build TensorFlow with GPU support? [y/N] y
GPU support will be enabled for TensorFlow
Please specify which gcc nvcc should use as the host compiler. [Default is /usr/bin/gcc]:
Please specify the Cuda SDK version you want to use, e.g. 7.0. [Leave empty to use system default]: 8.0
Please specify the location where CUDA 7.5 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: /usr/local/cuda-8.0
Please specify the cuDNN version you want to use. [Leave empty to use system default]: 5.1.5
Please specify the location where cuDNN 5 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda-8.0]: /usr/local/cuda
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size.
[Default is: "3.5,5.2"]: 6.1
Setting up Cuda include
Setting up Cuda lib
Setting up Cuda bin
Setting up Cuda nvvm
Setting up CUPTI include
Setting up CUPTI lib64
Configuration finished

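Pay attention to the first few questions, such as the Python location and the CUDA version. The answers must match your own setup. Then start the build:

# GPU version
$ bazel build -c opt --config=cuda --local_resources 2048,.5,1.0 //tensorflow/tools/pip_package:build_pip_package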

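Note: when I first built with the command from the official documentation, it failed several times, so I added --local_resources 2048,.5,1.0 to keep resource usage from growing too large. I don't fully understand the values: my guess is that the first one is memory, and I left the other two unchanged because I didn't understand them. The build is much slower this way, but at least it completes.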
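For reference, the Bazel releases of that period document --local_resources as three comma-separated values: available RAM in MB, available CPU cores, and available I/O capacity (1.0 being an average workstation). Newer Bazel versions have changed this flag, so check the documentation for your release. The sketch below builds the flag from those three numbers. The helper name and its validation are hypothetical.

```python
def local_resources_flag(ram_mb, cpu_cores, io_capacity=1.0):
    """Build a '--local_resources RAM,CPU,IO' argument for older Bazel.

    ram_mb:      RAM Bazel may assume is available, in megabytes
    cpu_cores:   CPU cores Bazel may assume (fractions are allowed)
    io_capacity: relative I/O capacity, 1.0 = average workstation
    """
    if ram_mb <= 0 or cpu_cores <= 0 or io_capacity <= 0:
        raise ValueError("all resource values must be positive")
    return "--local_resources {},{},{}".format(
        int(ram_mb), float(cpu_cores), float(io_capacity))
```

With the values used above, local_resources_flag(2048, 0.5, 1.0) yields "--local_resources 2048,0.5,1.0". Raising the first two numbers should speed up the build on a machine with more memory, at the risk of hitting the same failures again.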

After the build finishes, create the pip package and install it:

$ bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
# The name of the .whl file will depend on your platform.
$ sudo pip install /tmp/tensorflow_pkg/tensorflow-0.10.0-py2-none-any.whl
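Because the wheel's file name varies by platform and version, it can help to list what the build actually produced instead of typing the name by hand. This is a minimal sketch, assuming the standard wheel naming scheme (name-version-pythontag-abitag-platformtag.whl, with an optional build tag after the version). Both function names are hypothetical.

```python
import glob
import os


def parse_wheel_name(filename):
    """Split a wheel file name into its standard components."""
    base = os.path.basename(filename)
    if not base.endswith(".whl"):
        raise ValueError("not a wheel file: " + base)
    parts = base[:-len(".whl")].split("-")
    if len(parts) == 6:
        # Drop the optional build tag between version and Python tag.
        del parts[2]
    if len(parts) != 5:
        raise ValueError("unexpected wheel name: " + base)
    keys = ("name", "version", "python_tag", "abi_tag", "platform_tag")
    return dict(zip(keys, parts))


def find_wheels(directory):
    """Return the sorted paths of all .whl files in directory."""
    return sorted(glob.glob(os.path.join(directory, "*.whl")))
```

Running find_wheels("/tmp/tensorflow_pkg") gives the exact path to pass to pip install, and parse_wheel_name shows which Python version the wheel was built for.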

Once the installation is complete, you can test it like this:

$ python
...
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
>>> print(sess.run(hello))
Hello, TensorFlow!
>>> a = tf.constant(10)
>>> b = tf.constant(32)
>>> print(sess.run(a + b))
42
>>>

Or:

$ cd tensorflow/models/image/mnist
$ python convolutional.py
Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting data/train-images-idx3-ubyte.gz
Extracting data/train-labels-idx1-ubyte.gz
Extracting data/t10k-images-idx3-ubyte.gz
Extracting data/t10k-labels-idx1-ubyte.gz
Initialized!
Epoch 0.00
Minibatch loss: 12.054, learning rate: 0.010000
Minibatch error: 90.6%
Validation error: 84.6%
Epoch 0.12
Minibatch loss: 3.285, learning rate: 0.010000
Minibatch error: 6.2%
Validation error: 7.0%
...
...

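For some of the errors you may hit, see the official documentation.

References:

Guide to Installing an NVIDIA GTX 1070 and TensorFlow on Ubuntu 16.04

NVIDIA cuDNN Installation Notes

tensorflow/os_setup.md at master · tensorflow/tensorflow · GitHub

tensorflow-zh, Chinese edition of the official documentation: Download and Setup · GitHub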

Build Personal Deep Learning Rig: GTX 1080 + Ubuntu 16.04 + CUDA 8.0RC + CuDnn 7 + Tensorflow/Mxnet/Caffe/Darknet