弁財天 (Benzaiten)

Goffman: "Don't trust the experts; think and judge for yourself."

Building an nvidia-docker environment on Fedora 25 and evaluating it, lol (update 3)

Currently evaluating NVIDIA's CUDA build environment.

Reference → Installing CUDA 8.0

NVIDIA's cuDNN (CUDA Deep Neural Network) library

# sh cuda_8.0.61_375.26_linux-run

Update .bash_profile and so on…

PATH=.
PATH=$PATH:/usr/local/cuda-8.0/bin
PATH=$PATH:…
LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64
export PATH LD_LIBRARY_PATH
Confirm the CUDA installation with nvidia-smi.
[hoge@hoge1 ~]$ nvidia-smi
Mon Apr  3 11:38:55 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 378.13                 Driver Version: 378.13                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 710      Off  | 0000:01:00.0     N/A |                  N/A |
| 50%   44C    P8    N/A /  N/A |    110MiB /  2001MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0                  Not Supported                                         |
+-----------------------------------------------------------------------------+
[hoge@hoge1 ~]$ 

 4月 06 13:18:05 hoge1.localdomain kernel: NVRM: API mismatch: the client has the version 378.13, but
                                         NVRM: this kernel module has the version 375.39.  Please
                                         NVRM: make sure that this kernel module and all NVIDIA driver
                                         NVRM: components have the same version.
To get CUDA and the driver versions in sync, downgraded from 378.13 to 375.39.
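Before (and after) the downgrade, the two version numbers can be compared directly; a quick sketch, assuming the nvidia kernel module is installed:

```shell
# Userspace driver version as nvidia-smi reports it
nvidia-smi --query-gpu=driver_version --format=csv,noheader

# Version of the nvidia kernel module; the two must match
modinfo nvidia | grep '^version'
```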
[hoge@hoge1 TensorFlow]$ nvidia-docker run --rm nvidia/cuda nvidia-smi
nvidia-docker | 2017/04/06 13:39:04 Error: Cannot connect to the Docker daemon. Is the docker daemon running on this host?
[hoge@hoge1 TensorFlow]$
This is caused by the permissions on /var/run/docker.sock.
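A common fix, sketched under the assumption of a stock docker package (the group name and unit name are the usual defaults, not taken from this machine):

```shell
# Make sure the docker daemon is actually running first
sudo systemctl enable --now docker

# Then allow a non-root user to talk to /var/run/docker.sock
sudo groupadd -f docker
sudo usermod -aG docker "$USER"
# Log out and back in (or run `newgrp docker`) for the new group to take effect
```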

[hoge@hoge1 TensorFlow]$ nvidia-docker run --rm nvidia/cuda nvidia-smi
Using default tag: latest
Trying to pull repository docker.io/nvidia/cuda ... 
sha256:0e620a100ed88c91b5f223002c9c1bd25a41201b6419f0cb8b6d71410dd24378: Pulling from docker.io/nvidia/cuda
d54efb8db41d: Pull complete 
f8b845f45a87: Pull complete 
e8db7bf7c39f: Pull complete 
9654c40e9079: Pull complete 
6d9ef359eaaa: Pull complete 
ef0dca220a40: Pull complete 
692847dfba92: Pull complete 
9c00c8d8d515: Pull complete 
71a5cca8a219: Pull complete 
1d0194759635: Pull complete 
Digest: sha256:0e620a100ed88c91b5f223002c9c1bd25a41201b6419f0cb8b6d71410dd24378
Status: Downloaded newer image for docker.io/nvidia/cuda:latest
Thu Apr  6 04:49:11 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.39                 Driver Version: 375.39                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 710      Off  | 0000:01:00.0     N/A |                  N/A |
| 50%   49C    P8    N/A /  N/A |      0MiB /  2001MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0                  Not Supported                                         |
+-----------------------------------------------------------------------------+
[hoge@hoge1 TensorFlow]$ 
Oh, so it downloads and builds on the fly. Like Java's Maven, lol.

Installing CUDA also gives you sample programs (/usr/local/cuda/samples). Trying to compile them may fail with errors like the following.

cc1plus: error: unrecognized command line option "-fopenmp"
or
/usr/lib/gcc/x86_64-redhat-linux/3.4.6/../../../../lib64/libGLU.so: undefined reference to `__cxa_throw_bad_array_new_length@CXXABI_1.3.8'
/usr/lib/gcc/x86_64-redhat-linux/3.4.6/../../../../lib64/libGLU.so: undefined reference to `operator delete(void*, unsigned long)@CXXABI_1.3.9'
and so on.

Fedora 25's gcc is 6.3.1, which cannot satisfy CUDA's requirement for gcc 4.x or earlier. Even installing the compatibility compilers

compat-gcc-34-c++-3.4.6-41.fc25.x86_64
compat-gcc-34-3.4.6-41.fc25.x86_64
compat-libstdc++-33-3.2.3-68.16.fc25.x86_64

does not solve problems like "-fopenmp": gcc 3.4 simply predates OpenMP support, which only arrived in gcc 4.2.
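For reference, the compat compiler can be forced in as nvcc's host compiler (a sketch: `g++34` is the driver name I believe compat-gcc-34-c++ installs, and `HOST_COMPILER` is the variable the CUDA sample Makefiles use to feed `-ccbin`; OpenMP samples will still fail, since gcc 3.4 has no `-fopenmp`):

```shell
# Build a non-OpenMP sample with the gcc 3.4 compat compiler as nvcc's host compiler
cd /usr/local/cuda/samples/1_Utilities/deviceQuery
make HOST_COMPILER=g++34
```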

Even if you force the compile through, once it links against something like lib64/libGLU.so you hit linker errors such as

/usr/lib/gcc/x86_64-redhat-linux/3.4.6/../../../../lib64/libGLU.so: undefined reference to `__cxa_throw_bad_array_new_length@CXXABI_1.3.8'

and no executable can be produced. This cannot be fixed with LD_LIBRARY_PATH and the like; you would end up having to assemble a complete, consistent library set somewhere.

And that sticky problem, having to assemble a whole library set yourself, is exactly what nvidia-docker solves, lol.

GitHub (not "GifHub") → github.com/NVIDIA/nvidia-docker

For nvidia-docker I installed the CentOS 7 .rpm package.
The sample Dockerfiles for nvidia-docker are downloaded with git clone:
$ git clone https://github.com/NVIDIA/nvidia-docker.git
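The rpm install itself looks roughly like this (the version number and release URL are assumptions; check the project's releases page for the current one):

```shell
# Install the nvidia-docker 1.x rpm and start its plugin service
wget https://github.com/NVIDIA/nvidia-docker/releases/download/v1.0.1/nvidia-docker-1.0.1-1.x86_64.rpm
sudo rpm -i nvidia-docker-1.0.1-1.x86_64.rpm
sudo systemctl enable --now nvidia-docker
```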

[hoge@hoge1 deviceQuery]$ nvidia-docker build -t local:deviceQuery .
Sending build context to Docker daemon 2.048 kB
Step 1 : FROM nvidia/cuda:8.0-devel-centos7
 ---> 28549766978e
Step 2 : RUN yum install -y         cuda-samples-$CUDA_PKG_VERSION &&     rm -rf /var/cache/yum/*
 ---> Running in 520f6a2a4cac
Loaded plugins: fastestmirror, ovl


 One of the configured repositories failed (Unknown),
 and yum doesn't have enough cached data to continue. At this point the only
 safe thing yum can do is fail. There are a few ways to work "fix" this:

     1. Contact the upstream for the repository and get them to fix the problem.

     2. Reconfigure the baseurl/etc. for the repository, to point to a working
        upstream. This is most often useful if you are using a newer
        distribution release than is supported by the repository (and the
        packages for the previous distribution release still work).

     3. Run the command with the repository temporarily disabled
            yum --disablerepo= ...

     4. Disable the repository permanently, so yum won't use it by default. Yum
        will then just ignore the repository until you permanently enable it
        again or use --enablerepo for temporary usage:

            yum-config-manager --disable 
        or
            subscription-manager repos --disable=

     5. Configure the failing repository to be skipped, if it is unavailable.
        Note that yum will try to contact the repo. when it runs most commands,
        so will have to try and fail each time (and thus. yum will be be much
        slower). If it is a very temporary problem though, this is often a nice
        compromise:

            yum-config-manager --save --setopt=.skip_if_unavailable=true

Cannot find a valid baseurl for repo: base/7/x86_64
Could not retrieve mirrorlist http://mirrorlist.centos.org/?release=7&arch=x86_64&repo=os&infra=container error was
14: curl#6 - "Could not resolve host: mirrorlist.centos.org; Name or service not known"
The command '/bin/sh -c yum install -y         cuda-samples-$CUDA_PKG_VERSION &&     rm -rf /var/cache/yum/*' returned a non-zero code: 1
[hoge@hoge1 deviceQuery]$ 
The failure is at DNS resolution. Since dnsmasq is running on the host, fix /etc/resolv.conf from
nameserver 0.0.0.0
to
nameserver 127.0.0.1

nvidia-docker uses the docker0 (172.17.0.1/16) NAT network, so the network configuration has to be changed in several places before the NAT clients can reach the outside world. This is the painful part, lol.
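On a Fedora host running firewalld and dnsmasq, the pieces typically look something like this (the zone choice and the dnsmasq option are assumptions about a typical setup, not this machine's exact config):

```shell
# Let traffic from the docker0 bridge through the host firewall
sudo firewall-cmd --permanent --zone=trusted --add-interface=docker0
sudo firewall-cmd --reload

# Make dnsmasq answer DNS queries arriving on the bridge as well
echo 'interface=docker0' | sudo tee -a /etc/dnsmasq.conf
sudo systemctl restart dnsmasq
```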

[hoge@hoge1 UnifiedMemoryStreams]$ make 
cc1plus: error: unrecognized command line option "-fopenmp"
-----------------------------------------------------------------------------------------------
WARNING - OpenMP is unable to compile
-----------------------------------------------------------------------------------------------
This CUDA Sample cannot be built if the OpenMP compiler is not set up correctly.
This will be a dry-run of the Makefile.
For more information on how to set up your environment to build and run this 
sample, please refer the CUDA Samples documentation and release notes
-----------------------------------------------------------------------------------------------
[@] /usr/local/cuda-8.0/bin/nvcc -ccbin g++ -I../../common/inc -m64 -Xcompiler -fopenmp -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_60,code=compute_60 -o UnifiedMemoryStreams.o -c UnifiedMemoryStreams.cu
[@] /usr/local/cuda-8.0/bin/nvcc -ccbin g++ -m64 -Xcompiler -fopenmp -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_60,code=compute_60 -o UnifiedMemoryStreams UnifiedMemoryStreams.o -lcublas
[@] mkdir -p ../../bin/x86_64/linux/release
[@] cp UnifiedMemoryStreams ../../bin/x86_64/linux/release
[hoge@hoge1 UnifiedMemoryStreams]$ pwd
/usr/local/cuda/samples/0_Simple/UnifiedMemoryStreams
[hoge@hoge1 UnifiedMemoryStreams]$ 
The compiler-option problems also go away in the CentOS 7 nvidia-docker environment NVIDIA provides; there everything builds cleanly.
Copy nvidia-docker/samples/centos-7/deviceQuery to UnifiedMemoryStreams:
$ cp -r deviceQuery UnifiedMemoryStreams
$ cd UnifiedMemoryStreams
Then edit the Dockerfile. ("Dockerfile"? In Japanese "dokkā" sounds like "dokata", a construction laborer, lol.)
[hoge@hoge1 UnifiedMemoryStreams]$ ls
Dockerfile
[hoge@hoge1 UnifiedMemoryStreams]$ cat Dockerfile 
FROM nvidia/cuda:8.0-devel-centos7

RUN yum install -y \
        cuda-samples-$CUDA_PKG_VERSION && \
    rm -rf /var/cache/yum/*

WORKDIR /usr/local/cuda/samples/0_Simple/UnifiedMemoryStreams
RUN make

CMD ./UnifiedMemoryStreams
[hoge@hoge1 UnifiedMemoryStreams]$ 
Build it in the nvidia-docker environment.
[hoge@hoge1 UnifiedMemoryStreams]$ vi Dockerfile 
[hoge@hoge1 UnifiedMemoryStreams]$ nvidia-docker build -t local:UnifiedMemoryStreams .
Sending build context to Docker daemon 2.048 kB
Step 1 : FROM nvidia/cuda:8.0-devel-centos7
 ---> 28549766978e
Step 2 : RUN yum install -y         cuda-samples-$CUDA_PKG_VERSION &&     rm -rf /var/cache/yum/*
 ---> Using cache
 ---> 7cfb41382a42
Step 3 : WORKDIR /usr/local/cuda/samples/0_Simple/UnifiedMemoryStreams
 ---> Running in cebd43a1ae97
 ---> cee7e0a2eb0c
Removing intermediate container cebd43a1ae97
Step 4 : RUN make
 ---> Running in 27d5d9b15f92
/usr/local/cuda-8.0/bin/nvcc -ccbin g++ -I../../common/inc  -m64    -Xcompiler -fopenmp -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_60,code=compute_60 -o UnifiedMemoryStreams.o -c UnifiedMemoryStreams.cu
/usr/local/cuda-8.0/bin/nvcc -ccbin g++   -m64    -Xcompiler -fopenmp   -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_60,code=compute_60 -o UnifiedMemoryStreams UnifiedMemoryStreams.o  -lcublas
mkdir -p ../../bin/x86_64/linux/release
cp UnifiedMemoryStreams ../../bin/x86_64/linux/release
 ---> 6b9ed12cadc1
Removing intermediate container 27d5d9b15f92
Step 5 : CMD ./UnifiedMemoryStreams
 ---> Running in ac3d4d6d1194
 ---> 9a6a09de50c4
Removing intermediate container ac3d4d6d1194
Successfully built 9a6a09de50c4
[hoge@hoge1 UnifiedMemoryStreams]$
Run it in the nvidia-docker environment.
[hoge@hoge1 UnifiedMemoryStreams]$ nvidia-docker run -t local:UnifiedMemoryStreams
GPU Device 0: "GeForce GT 710" with compute capability 3.5

Executing tasks on host / device
Task [0], thread [0] executing on device (728)
Task [1], thread [1] executing on device (750)
Task [2], thread [3] executing on device (364)
Task [3], thread [2] executing on device (242)
Task [4], thread [0] executing on device (384)
Task [5], thread [1] executing on device (413)
Task [6], thread [1] executing on host (64)
Task [7], thread [1] executing on device (444)
Task [8], thread [1] executing on device (460)
Task [9], thread [1] executing on device (651)
Task [10], thread [1] executing on device (398)
Task [11], thread [1] executing on host (64)
Task [12], thread [1] executing on device (155)
Task [13], thread [2] executing on device (462)
Task [14], thread [0] executing on host (64)
Task [15], thread [0] executing on device (838)
Task [16], thread [0] executing on device (992)
Task [17], thread [0] executing on host (70)
Task [18], thread [0] executing on device (365)
Task [19], thread [0] executing on device (175)
Task [20], thread [0] executing on device (139)
Task [21], thread [0] executing on device (895)
Task [22], thread [0] executing on device (839)
Task [23], thread [0] executing on device (275)
Task [24], thread [0] executing on device (637)
Task [25], thread [0] executing on device (385)
Task [26], thread [0] executing on host (64)
Task [27], thread [0] executing on device (244)
Task [28], thread [0] executing on device (526)
Task [29], thread [0] executing on device (203)
Task [30], thread [0] executing on device (519)
Task [31], thread [0] executing on device (186)
Task [32], thread [0] executing on device (726)
Task [33], thread [0] executing on device (670)
Task [34], thread [0] executing on device (230)
Task [35], thread [0] executing on device (280)
Task [36], thread [0] executing on device (407)
Task [37], thread [0] executing on device (878)
Task [38], thread [1] executing on host (95)
Task [39], thread [1] executing on device (261)
All Done!
[hoge@hoge1 UnifiedMemoryStreams]$ 

Getting a command prompt in the docker environment without ssh

$ cat simple/Dockerfile 
FROM nvidia/cuda:8.0-devel-centos7

WORKDIR /somwehere/nvidia-docker/samples/centos-7/simple

#CMD ["cat", "/etc/resolv.conf"]
$ 
Building and then running things only through the CMD line is tedious. Comment out CMD and you can poke around inside the docker environment from a shell.
$ nvidia-docker build -t local:simple simple
Sending build context to Docker daemon 2.048 kB
Step 1 : FROM nvidia/cuda:8.0-devel-centos7
 ---> f72653fd14d3
Step 2 : WORKDIR /somwehere//nvidia-docker/samples/centos-7/simple
 ---> Using cache
 ---> c90d4f99f1e5
Successfully built c90d4f99f1e5
$ nvidia-docker run -ti --rm -e DISPLAY=$DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix -t local:simple
[root@79c70351595f simple]# ← the CentOS 7 command prompt inside docker, lol
[root@79c70351595f simple]# hostname
79c70351595f
[root@79c70351595f simple]# exit ← leave with exit or Ctrl+D.
$ ← back at the Fedora 25 prompt, lol
Since the DISPLAY environment variable is passed into docker, you can even yum install xterm and run X11 programs like xterm in there, lol.
And this virtual environment is lightweight.
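A sketch of that X11 trick end to end (the xhost step is an assumption about local X access control; tighten it as needed):

```shell
# Allow local (Unix-socket) X clients, which includes containers sharing the socket
xhost +local:

# Share the display and the X socket, then install and run xterm inside the container
nvidia-docker run -ti --rm -e DISPLAY=$DISPLAY \
    -v /tmp/.X11-unix:/tmp/.X11-unix local:simple \
    sh -c 'yum install -y xterm && xterm'
```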


https://cloud.githubusercontent.com/assets/3028125/12213714/5b208976-b632-11e5-8406-38d379ec46aa.png
nvidia-docker lets an NVIDIA machine with the latest graphics board build and run older CUDA programs with compatibility preserved.

DeepLearning BOX

http://www.univpc.com/img/upload/images/nnn.jpg
Whoa, what is this. It takes four GTX cards? Unbelievable.
With a box like this you could do just about anything: mine bitcoin, crack passwords with brute-force rainbow attacks, run large-scale genome analyses, you name it, lol. The police forensics labs would love it, lol.

NVIDIA publishes a repository for Fedora, so verified again on Fedora 26 using:
fedora-nvidia.repo
gcc49
a hacked-up nvcc

#!/bin/sh
# Wrapper installed in place of the real nvcc (renamed here to /usr/bin/nvcc.0):
# strip whatever -ccbin the sample Makefiles pass in and force the gcc 4.9.3 build.
echo "$@" >&2
args="$(echo "$@" | sed -e 's;-ccbin [^ ]* ;;') -ccbin /opt/gcc-4.9.3/bin/c++"
/usr/bin/nvcc.0 $args   # $args left unquoted on purpose so it word-splits back into arguments

And in the shell environment:

export PATH=/opt/gcc-4.9.3/bin:$PATH
Hand-edit the Makefile:
# internal flags
NVCCFLAGS   := -m${TARGET_SIZE}
CCFLAGS     :=
LDFLAGS     := -L /opt/gcc-4.9.3/lib64/gcc/x86_64-fedoraunited-linux-gnu/lib64

And In-Q-Tel (the CIA's venture arm) is an investor in Docker, lol.
