Building TensorFlow from Source with CUDA 9.0 and cuDNN 7.0

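In a careless moment I upgraded CUDA to 9.0, then found that it requires cuDNN 7.0, so I upgraded cuDNN to 7.0 as well. After that, importing TensorFlow in Python raised an error. A quick search showed that the prebuilt TensorFlow 1.3 package only supports CUDA 8.0 with cuDNN 6.0. I wanted to downgrade CUDA and cuDNN, but found that cuDNN 6.0 could no longer be downloaded from the Nvidia website. Tough luck.
That is when I finally understood why something as tedious as building from source exists: to give careless people like me a way out.
I could not find an official tutorial for building from source. I found a few guides online; the one I followed is "TensorFlow學習一:原始碼安裝" (TensorFlow Study 1: Installing from Source).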

Clone the TensorFlow source

git clone --recurse-submodules https://github.com/tensorflow/tensorflow

Install Bazel

Install it by following the official tutorial.


My CUDA and cuDNN were already installed, so once Bazel was in place I could go straight to configuring and building TensorFlow.
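Before configuring, it can help to confirm which versions are actually installed, because ./configure asks for both. The sketch below is only an illustration: it parses the release line printed by `nvcc --version` and the `CUDNN_MAJOR`, `CUDNN_MINOR` and `CUDNN_PATCHLEVEL` macros found in `cudnn.h`. The helper names and the sample strings are hypothetical; on a real machine you would feed in the actual command output and the header text (typically under /usr/local/cuda/include, assuming the default install location).

```python
import re


def parse_nvcc_release(nvcc_output):
    """Return the CUDA release (e.g. '9.0') from `nvcc --version` output, or None."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def parse_cudnn_version(header_text):
    """Return 'major.minor.patch' from the text of cudnn.h, or None if a macro is missing."""
    parts = []
    for macro in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(r"#define\s+" + macro + r"\s+(\d+)", header_text)
        if match is None:
            return None
        parts.append(match.group(1))
    return ".".join(parts)


# Hypothetical sample inputs, standing in for real tool output and header text.
SAMPLE_NVCC = "Cuda compilation tools, release 9.0, V9.0.176"
SAMPLE_HEADER = (
    "#define CUDNN_MAJOR 7\n"
    "#define CUDNN_MINOR 0\n"
    "#define CUDNN_PATCHLEVEL 3\n"
)

if __name__ == "__main__":
    print(parse_nvcc_release(SAMPLE_NVCC))
    print(parse_cudnn_version(SAMPLE_HEADER))
```

The two values are the ones to type at the CUDA SDK version and cuDNN version prompts during configuration.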

Configure TensorFlow

Below is the log from my TensorFlow configuration.

~/tensorflow$ ./configure
You have bazel 0.6.1 installed.
Please specify the location of python. [Default is /usr/bin/python]: /usr/bin/python3
Found possible Python library paths:
/home/shengchun/tensorflow/models/
/bin
/usr/local/cuda/bin
/sbin
/home/shengchun/tensorflow/models/slim
/usr/bin
/usr/local/sbin
/usr/local/games
/usr/lib/python3/dist-packages
/usr/games
/home/shengchun/mxnet-ssd/mxnet/python
/usr/sbin
/usr/local/bin
/usr/local/lib/python3.5/dist-packages
Please input the desired Python library path to use. Default is [/home/shengchun/tensorflow/models/]/usr/local/lib/python3.5/dist-packages
Do you wish to build TensorFlow with jemalloc as malloc support? [Y/n]: [Enter]
jemalloc as malloc support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Google Cloud Platform support? [Y/n]: [Enter]
Google Cloud Platform support will be enabled for TensorFlow.
Do you wish to build TensorFlow with Hadoop File System support? [Y/n]: [Enter]
Hadoop File System support will be enabled for TensorFlow.
Do you wish to build TensorFlow with XLA JIT support? [y/N]: [Enter]
No XLA JIT support will be enabled for TensorFlow.
Do you wish to build TensorFlow with GDR support? [y/N]: [Enter]
No GDR support will be enabled for TensorFlow.
Do you wish to build TensorFlow with VERBS support? [y/N]: [Enter]
No VERBS support will be enabled for TensorFlow.
Do you wish to build TensorFlow with OpenCL support? [y/N]: [Enter]
No OpenCL support will be enabled for TensorFlow.
Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.
Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 8.0]: 9.0
Please specify the location where CUDA 9.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: [Enter]
Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 6.0]: 7.0.3
Please specify the location where cuDNN 7.0.3 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:[Enter]
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 5.0] [Enter]
Do you want to use clang as CUDA compiler? [y/N]: [Enter]
nvcc will be used as CUDA compiler.
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:
Do you wish to build TensorFlow with MPI support? [y/N]: [Enter]
No MPI support will be enabled for TensorFlow.
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]: [Enter]
Add "--config=mkl" to your bazel command to build with MKL support.
Please note that MKL on MacOS or windows is still not supported.
If you would like to use a local MKL instead of downloading, please set the environment variable “TF_MKL_ROOT” every time before build.
Configuration finished
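The answers in the log above can apparently also be supplied up front through environment variables that ./configure reads, which saves retyping them on every rebuild. The following is a sketch under that assumption: the variable names are the ones the configure script of this era is believed to use, so check them against the configure script in your own checkout before relying on them. The helper `configure_env` is hypothetical, and the values are the ones entered in the log.

```python
import os


def configure_env(python_bin, python_lib, cuda_version, cudnn_version,
                  cuda_path="/usr/local/cuda", capabilities="5.0"):
    """Build environment overrides mirroring the interactive answers in the log.

    The variable names are assumptions; confirm them in the configure script.
    """
    return {
        "PYTHON_BIN_PATH": python_bin,
        "PYTHON_LIB_PATH": python_lib,
        "TF_NEED_CUDA": "1",
        "TF_CUDA_VERSION": cuda_version,
        "CUDA_TOOLKIT_PATH": cuda_path,
        "TF_CUDNN_VERSION": cudnn_version,
        "CUDNN_INSTALL_PATH": cuda_path,
        "TF_CUDA_COMPUTE_CAPABILITIES": capabilities,
        "CC_OPT_FLAGS": "-march=native",
    }


overrides = configure_env(
    "/usr/bin/python3",
    "/usr/local/lib/python3.5/dist-packages",
    "9.0",
    "7.0.3",
)

# Merge with the current environment; this dict could be passed as env=
# to subprocess.run(["./configure"], ...) from the tensorflow checkout.
env = dict(os.environ, **overrides)

for name in sorted(overrides):
    print("{}={}".format(name, overrides[name]))
```

Any prompt that is not covered by a variable would presumably still be asked interactively, so the remaining questions can be answered with Enter as in the log.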

Build TensorFlow with GPU support enabled:

bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package

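The whole process went fairly smoothly; the build just takes a while.

Generate the .whl file

The bazel build command creates a script named build_pip_package. Running the following command generates a .whl file under /tmp/tensorflow_pkg: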

bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
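Because the build is slow, the build and packaging commands can be chained in a small script so that nobody has to wait at the terminal. This is a minimal sketch and not part of the original walkthrough: `build_steps` and `run_steps` are hypothetical helpers, the commands are the two shown above, and the script is assumed to be started from the root of the tensorflow checkout. It defaults to a dry run that only prints the commands.

```python
import subprocess


def build_steps(out_dir="/tmp/tensorflow_pkg"):
    """Return the build and packaging commands as argument lists."""
    return [
        ["bazel", "build", "--config=opt", "--config=cuda",
         "//tensorflow/tools/pip_package:build_pip_package"],
        ["bazel-bin/tensorflow/tools/pip_package/build_pip_package", out_dir],
    ]


def run_steps(steps, dry_run=True):
    """Print each command and, unless dry_run is set, execute it.

    check_call raises on a non-zero exit status, so a failed build
    stops the script before the packaging step runs.
    """
    for step in steps:
        print(" ".join(step))
        if not dry_run:
            subprocess.check_call(step)


if __name__ == "__main__":
    run_steps(build_steps())
```

Calling `run_steps(build_steps(), dry_run=False)` would actually run both commands in order.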

Install TensorFlow with pip

sudo pip3 install /tmp/tensorflow_pkg/tensorflow-1.3.0-cp35-cp35m-linux_x86_64.whl
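The wheel's file name encodes what it was built for, which is why the name above contains cp35: it matches the Python 3.5 interpreter chosen during configuration. As an illustration, the sketch below splits a wheel file name into its fields following the PEP 427 naming scheme and looks the wheel up with glob instead of typing the name by hand. `parse_wheel_name` and `find_wheels` are hypothetical helpers, and the optional build-tag field is only tolerated, not returned.

```python
import glob
import os


def parse_wheel_name(path):
    """Split a wheel file name into its PEP 427 fields."""
    name = os.path.basename(path)
    if not name.endswith(".whl"):
        raise ValueError("not a wheel: " + name)
    fields = name[:-len(".whl")].split("-")
    # Five fields normally, six when an optional build tag is present.
    if len(fields) not in (5, 6):
        raise ValueError("unexpected wheel name: " + name)
    # The last three fields are always python tag, ABI tag, platform tag.
    return {
        "distribution": fields[0],
        "version": fields[1],
        "python_tag": fields[-3],
        "abi_tag": fields[-2],
        "platform_tag": fields[-1],
    }


def find_wheels(out_dir="/tmp/tensorflow_pkg"):
    """Return the tensorflow wheels found in out_dir, sorted by name."""
    return sorted(glob.glob(os.path.join(out_dir, "tensorflow-*.whl")))


info = parse_wheel_name(
    "/tmp/tensorflow_pkg/tensorflow-1.3.0-cp35-cp35m-linux_x86_64.whl"
)
print(info)
```

If pip refuses the wheel as unsupported on the platform, comparing `python_tag` with the interpreter behind pip3 is a reasonable first check.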

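Verify the installation

Open a new terminal, make sure you are not inside the tensorflow source directory (otherwise the import picks up the source tree instead of the installed package), and run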

python3

Enter the following code

import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))
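Since the point of this build was GPU support, it may also be worth checking that TensorFlow actually sees the GPU. The sketch below assumes the TensorFlow 1.x helper `device_lib.list_local_devices()` from `tensorflow.python.client`; `gpu_names` and `local_devices` are hypothetical helpers, and the import is guarded so the script reports a missing installation instead of crashing.

```python
def gpu_names(devices):
    """Return the names of the devices whose device_type is 'GPU'."""
    return [d.name for d in devices if d.device_type == "GPU"]


def local_devices():
    """List the devices visible to TensorFlow, or None if it cannot be imported."""
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    return device_lib.list_local_devices()


if __name__ == "__main__":
    devices = local_devices()
    if devices is None:
        print("tensorflow is not importable from this interpreter")
    else:
        print(gpu_names(devices) or "no GPU devices found")
```

An empty result on a machine with a working GPU would suggest that the build was configured without CUDA support or that the CUDA libraries are not being found at run time.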

With that, TensorFlow 1.3 was built and installed from source, and the world was back to normal.