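Linux Server Configuration Tutorial

¶Overview

This article walks through installing several algorithm libraries on Linux (Ubuntu is used as the example) and resolving common problems along the way. It covers OpenCV, Caffe, TensorFlow, and PyTorch.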

¶Installing OpenCV from Source

¶1. Update the package index

sudo apt-get update
sudo apt-get upgrade

¶2. Install dependencies

Run the following commands:

sudo apt-get install --assume-yes build-essential cmake git
sudo apt-get install --assume-yes pkg-config unzip ffmpeg qtbase5-dev python-dev python3-dev python-numpy python3-numpy
sudo apt-get install --assume-yes libopencv-dev libgtk-3-dev libdc1394-22 libdc1394-22-dev libjpeg-dev libpng12-dev libtiff5-dev libjasper-dev
sudo apt-get install --assume-yes libavcodec-dev libavformat-dev libswscale-dev libxine2-dev libgstreamer0.10-dev libgstreamer-plugins-base0.10-dev
sudo apt-get install --assume-yes libv4l-dev libtbb-dev libfaac-dev libmp3lame-dev libopencore-amrnb-dev libopencore-amrwb-dev libtheora-dev
sudo apt-get install --assume-yes libvorbis-dev libxvidcore-dev v4l-utils python-vtk
sudo apt-get install --assume-yes liblapacke-dev libopenblas-dev checkinstall
sudo apt-get install --assume-yes libgdal-dev

¶3. Install CUDA and cuDNN

Follow an online guide such as "The GPU support prerequisites".

¶4. Download the source

Change to a directory of your choice and download the OpenCV 3.2 source:

cd ~
wget https://github.com/opencv/opencv/archive/3.2.0.zip

¶5. Build the source

First, unzip the source:

unzip 3.2.0.zip

Enter the OpenCV directory, create a build folder, and start the build:

cd opencv-3.2.0
mkdir build
cd build
cmake -D CMAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=/usr/local -D INSTALL_PYTHON_EXAMPLES=ON -D INSTALL_C_EXAMPLES=OFF -D PYTHON_EXECUTABLE=/usr/bin/python -D WITH_CUDA=ON -D WITH_CUBLAS=ON -D CUDA_NVCC_FLAGS="-D_FORCE_INLINES" -D CUDA_ARCH_PTX="" -D WITH_TBB=ON -D WITH_V4L=ON -D WITH_OPENGL=ON -D BUILD_EXAMPLES=ON ..
make -j$(nproc)

¶6. Install

Once the build finishes, install it:

sudo make install -j$(nproc)
sudo /bin/bash -c 'echo "/usr/local/lib" > /etc/ld.so.conf.d/opencv.conf'
sudo ldconfig
sudo apt-get update

¶7. Common problems

If the ippicv file fails to download during the build, download it manually and copy it into the opencv-3.2.0/3rdparty/ippicv/downloads/linux-808b791a6eac9ed78d32a7666804320e directory.
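The manual copy described above can be wrapped in a small helper. This is a minimal sketch: the function name `place_ippicv` and the archive name in the usage comment are hypothetical, and only the destination directory comes from the text above.

```shell
#!/bin/sh
# Copy a manually downloaded ippicv archive into the download cache directory
# that the OpenCV 3.2.0 build checks (directory name taken from the text above).
# Usage: place_ippicv <archive> <opencv source dir>
place_ippicv() {
    archive=$1
    opencv_src=$2
    dest="$opencv_src/3rdparty/ippicv/downloads/linux-808b791a6eac9ed78d32a7666804320e"
    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi
    # Create the cache directory if the build has not created it yet.
    mkdir -p "$dest" && cp "$archive" "$dest/"
}

# Example (hypothetical archive name and location):
# place_ippicv ~/Downloads/ippicv_linux.tgz ~/opencv-3.2.0
```

After the copy, rerunning cmake in the build directory should pick up the cached file instead of downloading it again.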

¶Installing Caffe from Source

¶1. Download the Caffe source

cd ~
git clone https://github.com/BVLC/caffe.git

¶2. Install dependencies

sudo apt-get install --assume-yes libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libhdf5-serial-dev protobuf-compiler
sudo apt-get install --assume-yes --no-install-recommends libboost-all-dev
sudo apt-get install --assume-yes libgflags-dev libgoogle-glog-dev liblmdb-dev

¶3. Install BLAS

sudo apt-get install --assume-yes libatlas-base-dev

¶4. Build with CMake

If vim is not installed, install it first:

sudo apt-get install --assume-yes vim

Enter the caffe directory, copy the example configuration file, and edit it:

cd caffe
cp Makefile.config.example Makefile.config
vim Makefile.config

Note that Caffe's configuration needs a few changes; otherwise the build will fail.

To use cuDNN, uncomment USE_CUDNN := 1.
To use OpenCV 3, uncomment OPENCV_VERSION := 3.
To use Python, uncomment WITH_PYTHON_LAYER := 1.
On line 94 of Makefile.config, make the following change; otherwise the build fails because hdf5 cannot be found:

--- INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include
+++ INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial/

Here `---` marks a line to delete and `+++` marks a line to add.

The Makefile also needs a change, on line 181:

vim Makefile

--- LIBRARIES += glog gflags protobuf boost_system boost_filesystem m hdf5_hl hdf5
+++ LIBRARIES += glog gflags protobuf boost_system boost_filesystem m hdf5_serial_hl hdf5_serial
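The edits to Makefile.config and Makefile can also be applied without opening an editor. The sketch below is one possible way to script them with sed; the function names are hypothetical, and it assumes the options are commented out in the form `# USE_CUDNN := 1`, so check the result against your own copy of the files.

```shell
#!/bin/sh
# Uncomment the three options and append the hdf5 include path in Makefile.config.
# Assumption: commented-out options look like "# USE_CUDNN := 1".
patch_caffe_config() {
    cfg=$1
    sed \
        -e 's/^# *\(USE_CUDNN := 1\)/\1/' \
        -e 's/^# *\(OPENCV_VERSION := 3\)/\1/' \
        -e 's/^# *\(WITH_PYTHON_LAYER := 1\)/\1/' \
        -e 's|^\(INCLUDE_DIRS := \$(PYTHON_INCLUDE) /usr/local/include\) *$|\1 /usr/include/hdf5/serial/|' \
        "$cfg" > "$cfg.tmp" && mv "$cfg.tmp" "$cfg"
}

# Replace the hdf5 library names on the LIBRARIES line of the Makefile.
patch_caffe_makefile() {
    mk=$1
    sed -e 's/^\(LIBRARIES += .*\) hdf5_hl hdf5 *$/\1 hdf5_serial_hl hdf5_serial/' \
        "$mk" > "$mk.tmp" && mv "$mk.tmp" "$mk"
}

# Example, run from the caffe directory:
# patch_caffe_config Makefile.config
# patch_caffe_makefile Makefile
```

Both functions match on line content rather than line number, so they leave the files unchanged if the expected lines are not present or have already been edited.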

After these changes, build with CMake:

mkdir build
cd build
cmake ..
make all -j$(nproc)
make install -j$(nproc)
make runtest -j$(nproc)

To call Caffe from Python, run:

make pycaffe -j$(nproc)

Finally, add the path to PYTHONPATH. Open .bashrc:

vim ~/.bashrc

On the last line of .bashrc, add the path to caffe/python (adjust it to your actual location):

export PYTHONPATH="$PYTHONPATH:/<your/path>/caffe/python"

Then apply the change:

source ~/.bashrc
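If the setup is rerun, the same export line can end up in .bashrc several times. A minimal sketch of an idempotent append follows; the helper name `append_once` is hypothetical and not part of the original steps.

```shell
#!/bin/sh
# Append a line to a file only if that exact line is not already present,
# so repeated runs do not pile up duplicate exports in ~/.bashrc.
append_once() {
    line=$1
    file=$2
    touch "$file"
    # -x: match whole lines, -F: treat the line as a fixed string, not a regex.
    grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example (replace the placeholder with the real Caffe checkout path):
# append_once 'export PYTHONPATH="$PYTHONPATH:/<your/path>/caffe/python"' ~/.bashrc
```

The line is passed in single quotes so that `$PYTHONPATH` is written to the file literally and expanded only when .bashrc is sourced.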

Now start Python and run import caffe to check that it works:

python
import caffe

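¶5. Common problems

ImportError: dynamic module does not define init function (init_caffe)

The Python version in use differs from the one Caffe was built against. Use the same version, either Python 2 or Python 3, for both.

F0108 20:21:26.907105 15849 pooling_layer.cu:212] Check failed: error == cudaSuccess (8 vs. 0) invalid device function

The compute capability selected in the build configuration does not match that of the current GPU. See the official GPU compute capability table.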
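For the Makefile-based build, Caffe's Makefile.config lists target architectures in CUDA_ARCH as `-gencode` flags. As a rough sketch, the helper below (the name `gencode_flag` is hypothetical) turns a compute capability from the table, such as 6.1, into the matching flag. Whether your CUDA toolkit supports that architecture is a separate question it does not check.

```shell
#!/bin/sh
# Convert a compute capability such as "6.1" into an nvcc -gencode flag
# of the form used in the CUDA_ARCH variable of Caffe's Makefile.config.
gencode_flag() {
    cc=$(printf '%s' "$1" | tr -d '.')   # "6.1" becomes "61"
    printf '%s\n' "-gencode arch=compute_${cc},code=sm_${cc}"
}

# Example:
# gencode_flag 6.1
```

The printed flag would be added to the CUDA_ARCH list before rebuilding.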

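¶Installing TensorFlow in a Virtual Environment

¶1. Download a specific version of the source

Download the source into a directory of your choice: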

cd ~
mkdir virtual-env
cd virtual-env
git clone https://github.com/tensorflow/tensorflow

Check out a specific version, for example r1.0:

cd tensorflow
git checkout r1.0

¶2. Install dependencies

sudo apt-get install python-pip python-dev python-virtualenv # for Python 2.7
sudo apt-get install python3-pip python3-dev python-virtualenv # for Python 3.n

¶3. Create the virtual environment

virtualenv --system-site-packages ~/virtual-env/tensorflow # for Python 2.7
virtualenv --system-site-packages -p python3 ~/virtual-env/tensorflow # for Python 3.n

¶4. Activate the virtual environment

source ~/virtual-env/tensorflow/bin/activate

¶5. Install TensorFlow

Python 2.7, CPU only

pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.0.1-cp27-none-linux_x86_64.whl

Python 2.7, GPU

pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.0.1-cp27-none-linux_x86_64.whl

Python 3.4, CPU only

pip3 install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.0.1-cp34-cp34m-linux_x86_64.whl

Python 3.4, GPU

pip3 install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.0.1-cp34-cp34m-linux_x86_64.whl

Python 3.5, CPU only

pip3 install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.0.1-cp35-cp35m-linux_x86_64.whl

Python 3.5, GPU

pip3 install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.0.1-cp35-cp35m-linux_x86_64.whl
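The six URLs above follow one pattern, which the sketch below reproduces. The function name `tf_wheel_url` is hypothetical, and it only covers the 1.0.1 wheels and Python versions listed above; it does not check that a URL is still reachable.

```shell
#!/bin/sh
# Build the TensorFlow 1.0.1 wheel URL for a Python version and a cpu/gpu variant,
# following the naming pattern of the URLs listed above.
tf_wheel_url() {
    pyver=$1      # 2.7, 3.4 or 3.5
    variant=$2    # cpu or gpu
    case "$pyver" in
        2.7) tag="cp27-none" ;;
        3.4) tag="cp34-cp34m" ;;
        3.5) tag="cp35-cp35m" ;;
        *) echo "unsupported Python version: $pyver" >&2; return 1 ;;
    esac
    case "$variant" in
        cpu) pkg="tensorflow" ;;
        gpu) pkg="tensorflow_gpu" ;;
        *) echo "variant must be cpu or gpu" >&2; return 1 ;;
    esac
    echo "https://storage.googleapis.com/tensorflow/linux/$variant/$pkg-1.0.1-$tag-linux_x86_64.whl"
}

# Example, inside the activated virtual environment:
# pip3 install --upgrade "$(tf_wheel_url 3.5 gpu)"
```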

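¶6. Verify the installation

TensorFlow is now installed, and you can test whether the installation succeeded.
First leave the tensorflow directory; otherwise you will get the error ImportError: cannot import name pywrap_tensorflow.

cd ~
python

Enter the following Python code: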

import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))
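The pywrap_tensorflow error mentioned above typically comes from Python importing the `tensorflow` directory of the source tree instead of the installed package. A small guard like the one below can catch this before Python is started. It is only a sketch: the function name `shadows_tensorflow` is hypothetical, and the check assumes the shadowing directory contains an `__init__.py`, as the source tree does.

```shell
#!/bin/sh
# Return success if the given directory (default: the current one) contains a
# "tensorflow" package that Python would import instead of the installed one.
shadows_tensorflow() {
    dir=${1:-.}
    [ -f "$dir/tensorflow/__init__.py" ]
}

# Example: move to the home directory first if the current one would shadow the package.
# if shadows_tensorflow; then cd ~; fi
# python
```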

¶Installing TensorFlow from Source

¶1. Download a specific version of the source

git clone https://github.com/tensorflow/tensorflow

Check out a specific version, for example r1.0:

cd tensorflow
git checkout r1.0

¶2. Install Bazel

Install JDK 8:

sudo apt-get install --assume-yes openjdk-8-jdk

If curl is not installed, install it first:

sudo apt-get install --assume-yes curl

Add the package source:

echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list
curl https://bazel.build/bazel-release.pub.gpg | sudo apt-key add -

Update the package index and install Bazel:

sudo apt-get update && sudo apt-get install --assume-yes bazel

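¶3. Install Python dependencies

sudo apt-get install --assume-yes python-numpy python-dev python-pip python-wheel

¶4. Install GPU dependencies

sudo apt-get install --assume-yes libcupti-dev

¶5. Configure the build options

Assuming you are still in the tensorflow directory, run:

./configure

This starts the interactive configuration. A typical session looks like this: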

Please specify the location of python. [Default is /usr/bin/python]: /usr/bin/python2.7
Found possible Python library paths:
  /usr/local/lib/python2.7/dist-packages
  /usr/lib/python2.7/dist-packages
Please input the desired Python library path to use.  Default is [/usr/lib/python2.7/dist-packages]

Using python library path: /usr/local/lib/python2.7/dist-packages
Do you wish to build TensorFlow with MKL support? [y/N]
No MKL support will be enabled for TensorFlow
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:
Do you wish to use jemalloc as the malloc implementation? [Y/n]
jemalloc enabled
Do you wish to build TensorFlow with Google Cloud Platform support? [y/N]
No Google Cloud Platform support will be enabled for TensorFlow
Do you wish to build TensorFlow with Hadoop File System support? [y/N]
No Hadoop File System support will be enabled for TensorFlow
Do you wish to build TensorFlow with the XLA just-in-time compiler (experimental)? [y/N]
No XLA support will be enabled for TensorFlow
Do you wish to build TensorFlow with VERBS support? [y/N]
No VERBS support will be enabled for TensorFlow
Do you wish to build TensorFlow with OpenCL support? [y/N]
No OpenCL support will be enabled for TensorFlow
Do you wish to build TensorFlow with CUDA support? [y/N] Y
CUDA support will be enabled for TensorFlow
Do you want to use clang as CUDA compiler? [y/N]
nvcc will be used as CUDA compiler
Please specify the Cuda SDK version you want to use, e.g. 7.0. [Leave empty to default to CUDA 8.0]:
Please specify the location where CUDA 8.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:
Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 6.0]: 
Please specify the location where cuDNN 6 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size.
[Default is: "3.5,5.2"]:
Do you wish to build TensorFlow with MPI support? [y/N] 
MPI support will not be enabled for TensorFlow
Configuration finished

Note that CUDA support defaults to No; enter Y if you need CUDA.
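Some of these prompts can reportedly be answered ahead of time through environment variables, which helps when the configuration has to be repeated. The sketch below mirrors the answers from the session above. The variable names are an assumption based on the TensorFlow 1.x configure script and may differ between versions, so confirm them against the configure script in your checkout before relying on them.

```shell
#!/bin/sh
# Pre-answer some ./configure prompts through environment variables.
# Assumption: these names are the ones read by the configure script of your
# TensorFlow version; any prompt without a matching variable is still asked.
export PYTHON_BIN_PATH=/usr/bin/python2.7
export TF_NEED_CUDA=1                          # CUDA support defaults to No
export TF_CUDA_VERSION=8.0
export CUDA_TOOLKIT_PATH=/usr/local/cuda
export TF_CUDNN_VERSION=6
export CUDNN_INSTALL_PATH=/usr/local/cuda
export TF_CUDA_COMPUTE_CAPABILITIES=3.5,5.2    # match your GPU

# Then, from the tensorflow directory:
# ./configure
```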

¶6. Build the .whl package

CPU version:

bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package

GPU version:

bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package

Run the following command to generate the .whl file in /tmp/tensorflow_pkg:

bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg

¶7. Install the .whl package with pip

Assuming version 1.0.1 was built (if your version differs, change the file name in the command accordingly):

sudo pip install /tmp/tensorflow_pkg/tensorflow-1.0.1-cp27-cp27mu-linux_x86_64.whl
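Because the file name depends on the version and Python ABI, it can be looked up instead of typed by hand. A minimal sketch, assuming the wheel is named `tensorflow-*.whl` as in the command above (the function name `latest_wheel` is hypothetical):

```shell
#!/bin/sh
# Print the most recently modified tensorflow wheel in a directory,
# or nothing if the directory contains no matching file.
latest_wheel() {
    dir=${1:-/tmp/tensorflow_pkg}
    # ls -t sorts by modification time, newest first.
    ls -t "$dir"/tensorflow-*.whl 2>/dev/null | head -n 1
}

# Example:
# sudo pip install "$(latest_wheel /tmp/tensorflow_pkg)"
```

This relies on the wheel file names containing no newlines, which holds for the names the build produces.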

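¶8. Verify the installation

TensorFlow is now installed, and you can test whether the installation succeeded.
First leave the tensorflow directory; otherwise you will get the error ImportError: cannot import name pywrap_tensorflow.

cd ~
python

Enter the following Python code: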

import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))

If it prints Hello, TensorFlow!, the installation succeeded.

¶9. Common problems

Error message:

ImportError: libcudart.so.8.0: cannot open shared object file: No such file or directory

Temporary fix:
The CUDA libraries cannot be found, so specify their directory:

export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64"
sudo ldconfig

Permanent fix:
Open ~/.bashrc:

vim ~/.bashrc

Add this as the last line:

export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64"

Save the file, then apply the change:

source ~/.bashrc

From then on, the export runs automatically at every login, so the path stays in effect.
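One side effect of the plain export is that the directory is appended again every time .bashrc is sourced, and an empty LD_LIBRARY_PATH leaves a leading ":" (an empty entry, which the dynamic loader may treat as the current directory). The helper below is a sketch of one way to avoid both; the name `path_append` is hypothetical.

```shell
#!/bin/sh
# Append a directory to a colon-separated path list unless it is already there.
# Prints the resulting list; an empty input list yields just the directory,
# so no leading ":" is produced.
path_append() {
    list=$1
    dir=$2
    case ":$list:" in
        *":$dir:"*) printf '%s\n' "$list" ;;            # already present
        *) printf '%s\n' "${list:+$list:}$dir" ;;       # append, no stray colon
    esac
}

# Example for ~/.bashrc (the function must be defined earlier in the same file):
# export LD_LIBRARY_PATH="$(path_append "$LD_LIBRARY_PATH" /usr/local/cuda/lib64)"
```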

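¶Installing PyTorch Online

¶1. Get the install command

On the official website, choose the installation method that fits your needs to get the matching install command.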

¶2. Common problems

If import torch fails with:

F0822 15:58:51.328765 31898 cudnn.hpp:113] Check failed: status == CUDNN_STATUS_SUCCESS (3 vs. 0) CUDNN_STATUS_BAD_PARAM

installing the following package may solve the problem:

sudo apt-get install libtcmalloc-minimal4

Then set the path. Open .bashrc:

vim ~/.bashrc

Add this as the last line of .bashrc:

export LD_PRELOAD="$LD_PRELOAD:/usr/lib/libtcmalloc_minimal.so.4"

***/torch/lib/libtorch.so.1: undefined symbol: _ZTIN2at10TensorImplE

This may be caused by a version mismatch with caffe2. Deleting the libcaffe* files under /usr/local/lib resolves it.
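Deleting files from /usr/local/lib is hard to undo, so a more cautious variant is to move them aside first. The sketch below does that; the function name `stash_libcaffe` and the backup location are hypothetical, and moving files out of /usr/local/lib normally requires root.

```shell
#!/bin/sh
# Move libcaffe* files from a library directory into a backup directory
# instead of deleting them, so the change can be reverted.
stash_libcaffe() {
    libdir=${1:-/usr/local/lib}
    backup=$2
    mkdir -p "$backup" || return 1
    for f in "$libdir"/libcaffe*; do
        [ -e "$f" ] || continue   # the glob matched nothing
        mv "$f" "$backup/"
    done
}

# Example, from a root shell (backup path is hypothetical):
# stash_libcaffe /usr/local/lib /root/libcaffe-backup
# ldconfig
```

If import torch works afterwards, the backup directory can be removed.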