Development Environment Setup #3: PyTorch with Docker

Install the NVIDIA Container Toolkit


Reference: docs.nvidia.com

$ curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

# (Optional) enable the experimental package repository
$ sudo sed -i -e '/experimental/ s/^#//g' /etc/apt/sources.list.d/nvidia-container-toolkit.list

$ sudo apt-get update

$ sudo apt-get install -y nvidia-container-toolkit
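Depending on the toolkit version, Docker may also need to be configured to use the NVIDIA runtime before the restart below. A sketch of that step using the `nvidia-ctk` helper shipped with the toolkit (verify against the install guide for your version):

```shell
# Register the NVIDIA runtime in /etc/docker/daemon.json
$ sudo nvidia-ctk runtime configure --runtime=docker
```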

# Restart the Docker daemon
$ sudo systemctl restart docker

# Test: run nvidia-smi inside a CUDA base container
$ docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi


Using the PyTorch NGC container
https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch

$ docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:22.04-py3
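For real training workloads, NVIDIA recommends lifting the container's shared-memory limit, since PyTorch DataLoader workers communicate through shared memory. A hedged variant of the run command (the mount path is only an example):

```shell
# --ipc=host lifts the default 64 MB shared-memory limit;
# -v mounts the current directory into the container (example path)
$ docker run --gpus all -it --rm --ipc=host \
    -v "$(pwd)":/workspace \
    nvcr.io/nvidia/pytorch:22.04-py3
```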


Test inside the Docker container

$ python
Python 3.8.13 | packaged by conda-forge | (default, Mar 25 2022, 06:04:10)
[GCC 10.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> from __future__ import print_function
>>> import torch
>>> x = torch.rand(5, 3)
>>> print(x)
tensor([[0.9880, 0.3159, 0.7574],
        [0.1205, 0.4638, 0.8332],
        [0.1799, 0.0452, 0.4928],
        [0.1377, 0.9705, 0.1059],
        [0.9214, 0.8536, 0.7608]])
>>> print('CUDA:', torch.cuda.is_available())
CUDA: True
>>> print(torch.cuda.device_count())
1
>>> print(torch.cuda.get_device_name())
NVIDIA GeForce GTX 1650
>>> print(torch.cuda.get_device_capability())
(7, 5)
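The interactive checks above can be collected into a small device-agnostic script, so the same code runs on the GPU when CUDA is available and falls back to the CPU otherwise (a minimal sketch):

```python
import torch

# Select the GPU if CUDA is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create two random tensors directly on the selected device and add them
x = torch.rand(5, 3, device=device)
y = torch.rand(5, 3, device=device)
z = x + y

print("device:", device)
print("result shape:", tuple(z.shape))
if device.type == "cuda":
    print("GPU:", torch.cuda.get_device_name())
```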