TensorFlow on a GTX 1080

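Installing CUDA, the NVIDIA driver, cuDNN, and the GPU build of TensorFlow on Ubuntu 16.04.

TensorFlow with GPU support delivers a big jump in compute power, but getting everything it depends on installed is not so easy! It really comes down to three things:

1. NVIDIA driver: the graphics card driver

2. CUDA Toolkit: the CUDA compiler and runtime libraries

3. cuDNN: the CUDA Deep Neural Network library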

Dependencies

$ sudo apt-get update
$ sudo apt-get install freeglut3-dev g++-4.9 gcc-4.9 libglu1-mesa-dev libx11-dev libxi-dev libxmu-dev nvidia-modprobe python-dev python-pip python-virtualenv

Install the NVIDIA driver

$ sudo apt-get purge nvidia-*    # remove any previously installed NVIDIA packages
$ sudo add-apt-repository ppa:graphics-drivers/ppa
$ sudo apt-get update
$ sudo apt-get install nvidia-384

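You can check the current stable NVIDIA driver version on the Proprietary GPU Drivers : "Graphics Drivers" team PPA page. At the time of writing (2017-11-13) it was nvidia-384.

Next, reboot:

$ sudo reboot

After the reboot, check that the NVIDIA driver was installed: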

$ cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 384.98 Thu Oct 26 15:16:01 PDT 2017
GCC version: gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4)
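If you want to script this check instead of reading the output by eye, something like the following could work. It is only a minimal sketch: the function name `parse_driver_version` is made up for illustration, and the sample text is copied from the output above rather than read from a live machine.

```python
import re


def parse_driver_version(text):
    """Return the NVIDIA kernel module version found in `text`, or None."""
    match = re.search(r"Kernel Module\s+(\d+(?:\.\d+)+)", text)
    return match.group(1) if match else None


# Sample copied from the output shown in this post (not read from /proc).
SAMPLE_VERSION_TEXT = (
    "NVRM version: NVIDIA UNIX x86_64 Kernel Module 384.98 "
    "Thu Oct 26 15:16:01 PDT 2017\n"
    "GCC version: gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4)\n"
)

print(parse_driver_version(SAMPLE_VERSION_TEXT))  # 384.98
```

On a real machine you would pass in the contents of /proc/driver/nvidia/version instead of the sample string.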

Show the NVIDIA System Management Interface:

$ sudo nvidia-smi

``` bash
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.98                 Driver Version: 384.98                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    Off  | 00000000:01:00.0  On |                  N/A |
|  0%   47C    P8    12W / 215W |   7992MiB /  8112MiB |      2%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0       994      G   /usr/lib/xorg/Xorg                           193MiB |
|    0      1889      G   compiz                                       151MiB |
|    0      5068      C   /home/frank/anaconda3/bin/python            7643MiB |
+-----------------------------------------------------------------------------+
```

Set GCC 4.9 as the default

``` bash
$ sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.9 10
$ sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.9 20
$ sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-4.9 10
$ sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-4.9 20
```

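Install CUDA

Although CUDA 9.0 has already been released, the prebuilt TensorFlow binaries are still built against CUDA 8.0, so download the runfile from the CUDA Toolkit 8.0 - Feb 2017 | NVIDIA Developer page.

Run the installer like this:

$ sudo sh cuda_8.0.61_375.26_linux.run --override

During installation, answer the prompts as follows: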

Do you accept the previously read EULA? (accept/decline/quit): accept
You are attempting to install on an unsupported configuration. Do you wish to continue? ((y)es/(n)o) [ default is no ]: yes
Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 352.39? ((y)es/(n)o/(q)uit): no
Install the CUDA 8.0 Toolkit? ((y)es/(n)o/(q)uit): yes
Enter Toolkit Location [ default is /usr/local/cuda-8.0 ]:
Do you want to install a symbolic link at /usr/local/cuda? ((y)es/(n)o/(q)uit): yes
Install the CUDA 8.0 Samples? ((y)es/(n)o/(q)uit): no
Installing the CUDA Toolkit in /usr/local/cuda-8.0 ...

===========
= Summary =
===========

Driver: Not Selected
Toolkit: Installed in /usr/local/cuda-8.0
Samples: Not Selected

Please make sure that
 - PATH includes /usr/local/cuda-8.0/bin
 - LD_LIBRARY_PATH includes /usr/local/cuda-8.0/lib64, or, add /usr/local/cuda-8.0/lib64 to /etc/ld.so.conf and run ldconfig as root

To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-8.0/bin
To uninstall the NVIDIA Driver, run nvidia-uninstall

Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-8.0/doc/pdf for detailed information on setting up CUDA.

***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 352.00 is required for CUDA 8.0 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
    sudo <CudaInstaller>.run -silent -driver

Logfile is /tmp/cuda_install_14557.log

Note that one of the prompts above asks whether to install the NVIDIA driver. Since we already installed the latest version earlier, answer no here.
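The installer warning says CUDA 8.0 needs a driver of at least version 352.00, so skipping the bundled driver is only safe if the one you installed earlier is newer. Here is a rough sketch of that comparison. The helper names are hypothetical, and the 352.00 default simply mirrors the number printed in the transcript above.

```python
def version_tuple(version):
    """Turn a dotted version string such as '384.98' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def driver_is_new_enough(installed, required="352.00"):
    """Compare versions numerically, component by component."""
    return version_tuple(installed) >= version_tuple(required)


print(driver_is_new_enough("384.98"))  # True
print(driver_is_new_enough("340.10"))  # False
```

Comparing tuples of integers avoids the trap of comparing version strings character by character.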

Add environment variables

$ echo 'export PATH=/usr/local/cuda/bin:$PATH' >> ~/.bashrc
$ echo 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
$ source ~/.bashrc
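To double-check that the two variables really ended up in your environment, a small script along these lines may help. It is a sketch, not an official tool: `missing_cuda_paths` is a made-up name, and it assumes the toolkit is reachable through the /usr/local/cuda symlink created by the installer.

```python
import os


def missing_cuda_paths(env, cuda_home="/usr/local/cuda"):
    """Return the variable names that do not list the expected CUDA directory."""
    expected = {
        "PATH": cuda_home + "/bin",
        "LD_LIBRARY_PATH": cuda_home + "/lib64",
    }
    missing = []
    for name, directory in expected.items():
        # Split on ':' and ignore empty entries and trailing slashes.
        entries = [e.rstrip("/") for e in env.get(name, "").split(":") if e]
        if directory not in entries:
            missing.append(name)
    return missing


# An empty list means both variables contain the CUDA directories.
print(missing_cuda_paths(dict(os.environ)))
```

Run it in a fresh terminal, so that ~/.bashrc has been read again.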

Check the CUDA compiler

$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Tue_Jan_10_13:22:03_CST_2017
Cuda compilation tools, release 8.0, V8.0.61
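Since the TensorFlow build used here expects CUDA 8.0, it can be handy to pull the release number out of that output automatically. The snippet below is one possible sketch. `parse_nvcc_release` is a hypothetical helper, and the sample text is the output shown above.

```python
import re


def parse_nvcc_release(text):
    """Return (major, minor) from the 'release X.Y' part of `nvcc -V` output."""
    match = re.search(r"release\s+(\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Sample copied from the nvcc output in this post.
SAMPLE_NVCC_TEXT = """nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Tue_Jan_10_13:22:03_CST_2017
Cuda compilation tools, release 8.0, V8.0.61
"""

release = parse_nvcc_release(SAMPLE_NVCC_TEXT)
print(release)            # (8, 0)
print(release == (8, 0))  # True
```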

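Install the CUDA Deep Neural Network library: cuDNN

Download it from the cuDNN Download | NVIDIA Developer page. You may need to register an account and log in.

Choose the release that matches your CUDA version, then pick cuDNN v7.0 Library for Linux, which is just a .tgz archive.

The next step is to copy the cuDNN header and libraries into the CUDA directory:

$ tar xvf cudnn-8.0-linux-x64-v7.tgz
$ sudo cp cuda/include/cudnn.h /usr/local/cuda/include/
$ sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64/
$ sudo chmod a+r /usr/local/cuda/lib64/libcudnn*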
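One way to confirm the copy worked is to read the version macros back out of the installed header. The sketch below assumes that cudnn.h defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL, as the cuDNN 7 header does to my knowledge. The function name and the sample header text, including the patch level, are illustrative only.

```python
import re


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) from cudnn.h contents, or None if absent."""
    parts = []
    for key in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        pattern = r"^\s*#define\s+%s\s+(\d+)" % key
        match = re.search(pattern, header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


# Hypothetical excerpt; a real header contains much more than this.
SAMPLE_HEADER = """
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 0
#define CUDNN_PATCHLEVEL 3
"""

print(parse_cudnn_version(SAMPLE_HEADER))  # (7, 0, 3)
```

On the machine itself you would feed it the text of /usr/local/cuda/include/cudnn.h.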

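Install TensorFlow

Just run pip install --upgrade tfBinaryURL, where tfBinaryURL is the package URL listed on the Installing TensorFlow on Ubuntu | TensorFlow page. For example, I picked the Python 3.6 build with GPU support.

Verify the TensorFlow installation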

In [1]: import tensorflow as tf
In [2]: sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
2017-11-13 18:54:59.081831: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2017-11-13 18:54:59.186280: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:892] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-11-13 18:54:59.186604: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties:
name: GeForce GTX 1080 major: 6 minor: 1 memoryClockRate(GHz): 1.86
pciBusID: 0000:01:00.0
totalMemory: 7.92GiB freeMemory: 7.46GiB
2017-11-13 18:54:59.186617: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0, compute capability: 6.1)
Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0, compute capability: 6.1
2017-11-13 18:54:59.216573: I tensorflow/core/common_runtime/direct_session.cc:299] Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0, compute capability: 6.1

If you see the information above, with the GTX 1080 mapped to /device:GPU:0, the installation succeeded!
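If you save that startup log to a file, the GPU line can be picked out with a few lines of Python. This is just a sketch under the assumption that the log keeps the "Creating TensorFlow device" wording shown above, which may differ in other TensorFlow versions. `find_gpu_devices` is a made-up helper name.

```python
import re

# Matches the "Creating TensorFlow device" line as it appears in the log above.
DEVICE_RE = re.compile(
    r"Creating TensorFlow device \((?P<device>[^)]+)\) -> "
    r"\(device: \d+, name: (?P<name>[^,]+), "
    r"pci bus id: (?P<bus>[^,]+), "
    r"compute capability: (?P<cc>\d+\.\d+)\)"
)


def find_gpu_devices(log_text):
    """Return one dict per GPU device reported in the TensorFlow log."""
    return [m.groupdict() for m in DEVICE_RE.finditer(log_text)]


# Log line copied from the session output in this post.
SAMPLE_LOG = (
    "2017-11-13 18:54:59.186617: I tensorflow/core/common_runtime/gpu/"
    "gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> "
    "(device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0, "
    "compute capability: 6.1)"
)

for dev in find_gpu_devices(SAMPLE_LOG):
    print(dev["device"], dev["name"], dev["cc"])  # /device:GPU:0 GeForce GTX 1080 6.1
```

An empty result would suggest TensorFlow fell back to the CPU, at least under the log format assumed here.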
