Refer to the documentation of nvidia-container-runtime
- All the stable releases of docker-ce installed from https://docs.docker.com/install/ - Edge releases are not supported.
- The legacy official package docker-engine.
- The package provided by Canonical: docker.io.
- The package provided by Red Hat: docker.
The exact list depends on which upstream Docker versions are available for your Linux distribution.
You can look at the packages list on the gh-pages branch of the repository.
You must pin the versions of both nvidia-docker2 and nvidia-container-runtime when installing, for instance:
sudo apt-get install -y nvidia-docker2=2.0.3+docker18.09.1-1 nvidia-container-runtime=2.0.0+docker18.08.1-1
Use apt-cache madison nvidia-docker2 nvidia-container-runtime or yum search --showduplicates nvidia-docker2 nvidia-container-runtime to list the available versions.
The minimum supported version is Docker 1.12, which added support for custom container runtimes.
The recommended way is to use your package manager and install the cuda-drivers package (or equivalent).
When no packages are available, you should use an official "runfile".
Alternatively, and as a technology preview, the NVIDIA driver can be deployed through a container.
Refer to the documentation for more information.
Yes, but packages nvidia-docker2 and nvidia-docker conflict. You need to install nvidia-container-runtime instead of nvidia-docker2 and register the new runtime manually.
Make sure the runtime was registered to dockerd. You also need to reload the configuration of the Docker daemon.
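As a sketch, assuming the default install path used by the nvidia-container-runtime packages, registering the runtime and reloading the daemon can look like this (note that the tee command below overwrites any existing /etc/docker/daemon.json; merge by hand if you already have settings there):

```shell
# Register the nvidia runtime with dockerd.
sudo tee /etc/docker/daemon.json <<'EOF'
{
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
EOF

# Reload the Docker daemon configuration.
sudo pkill -SIGHUP dockerd

# Verify that the nvidia runtime is now listed.
docker info | grep -i runtime
```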
Your version of nvidia-container-runtime probably doesn't match your version of Docker. You need to pin the version of nvidia-container-runtime when installing the package.
Why do I get the error Depends: docker [...] but it is not installable or nothing provides docker [...]?
This issue can usually occur in one of the following circumstances:
- Docker is not installed on your machine and/or the official Docker package repository hasn't been set up (see also prerequisites).
- Docker is installed or is about to be upgraded and its version is not supported by NVIDIA Docker (see also supported Docker packages).
- Docker is installed and its version supported, but it isn't the latest version available on the Docker package repository. In this case, package pinning is required (see also not the latest Docker version and older version of Docker).
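A minimal sketch of pinning matching versions of Docker and the NVIDIA packages on a Debian-based system (the version strings below are examples only; list the versions available in your repositories first and substitute accordingly):

```shell
# List the versions available in your configured repositories.
apt-cache madison docker-ce nvidia-docker2

# Install a matching pair explicitly (example version strings).
sudo apt-get install -y docker-ce=5:18.09.1~3-0~ubuntu-bionic \
    nvidia-docker2=2.0.3+docker18.09.1-1
```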
I'm getting The following signatures were invalid: EXPKEYSIG while trying to install the packages, what do I do?
Make sure you fetched the latest GPG key from the repositories. Refer to the repository instructions for your distribution.
Why do I get the error file /etc/docker/daemon.json from install of nvidia-docker2 conflicts with file from package docker?
You are not using the official docker-ce package; you have Red Hat's fork of Docker. You don't need to install the nvidia-docker2 package; instead, follow these instructions.
Yes - beta support of the NVIDIA Container Runtime is now available on Jetson platforms (AGX, TX2 and Nano). See this link for more information on getting started.
No, we do not support macOS (regardless of the version), however you can use the native macOS Docker client to deploy your containers remotely (refer to the dockerd documentation).
No, we do not support Microsoft Windows (regardless of the version), however you can use the native Microsoft Windows Docker client to deploy your containers remotely (refer to the dockerd documentation).
No, we do not support native Microsoft container technologies.
Yes, from the CUDA perspective there is no difference as long as your dGPU is powered-on and you are following the official driver instructions.
For your host distribution, the list of supported platforms is available here.
For your container images, both the Docker Hub and NGC registry images are officially supported.
Yes, little-endian only. Check the support matrix for each project:
- https://nvidia.github.io/nvidia-docker
- https://nvidia.github.io/nvidia-container-runtime
- https://nvidia.github.io/libnvidia-container
Notably, if you use docker-ce with CentOS/RHEL on ppc64le, you need to register the nvidia runtime manually. If you are using Red Hat's docker distribution, you can follow the instructions in the README.
We have a tutorial for AWS and a tutorial for Azure.
They haven’t been updated for 2.0 yet but we are working on it and we plan to release a similar tutorial for GCP soon.
Alternatively, you can leverage NGC to deploy optimized container images on AWS and Azure.
No, usually the impact is on the order of less than 1% and hardly noticeable.
However, be aware of the following (non-exhaustive) list:
- GPU topology and CPU affinity
You can query it using nvidia-smi topo and use Docker CPU sets to pin CPU cores.
- Compiling your code for your device architecture
Your container might be compiled for the wrong architecture and could fall back to JIT compilation of PTX code (refer to the official documentation for more information).
Note that you can express these constraints in your container image.
- Container I/O overhead
By default, Docker containers rely on an overlay filesystem and bridged/NATed networking.
Depending on your workload this can be a bottleneck; we recommend using Docker volumes and experimenting with different Docker networks.
- Linux kernel accounting and security overhead
In rare cases, you may notice that some kernel subsystems induce overhead.
This will likely depend on your kernel version and can include things like cgroups, LSMs, seccomp filters, and netfilter.
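For the topology and affinity point above, a sketch of querying the topology and pinning the container to nearby CPU cores (the core list 0-7 is an example; pick the cores that nvidia-smi reports as local to your GPU):

```shell
# Show the GPU/CPU topology matrix reported by the driver.
nvidia-smi topo -m

# Pin the container to CPU cores local to the selected GPU
# (the core range below is illustrative).
docker run --runtime=nvidia --cpuset-cpus=0-7 \
    -e NVIDIA_VISIBLE_DEVICES=0 nvidia/cuda:10.0-base nvidia-smi
```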
Yes, EGL is supported for headless rendering, but this is a beta feature. There is no plan to support GLX in the near future.
Images are available at nvidia/opengl. If you need CUDA+OpenGL, use nvidia/cudagl.
If you are an NGC subscriber and require GLX for your workflow, please file a feature request for support consideration.
Your CUDA container image is incompatible with your driver version.
Upgrade your driver or choose an image tag which is supported by your driver (see also CUDA requirements).
No, MPS is not supported at the moment. However we plan on supporting this feature in the future, and this issue will be updated accordingly.
No, running an X server inside the container is not supported at the moment and there is no plan to support it in the near future (see also OpenGL support).
GPU isolation is achieved through a container environment variable called NVIDIA_VISIBLE_DEVICES.
Devices can be referenced by index (following the PCI bus order) or by UUID (refer to the documentation).
e.g.:
# If you have 4 GPUs, to isolate GPUs 3 and 4 (/dev/nvidia2 and /dev/nvidia3)
$ docker run --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=2,3 nvidia/cuda:10.0-base nvidia-smi
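Selecting GPUs by UUID rather than by index is more robust across reboots and topology changes. A sketch (the UUID below is a placeholder; substitute one reported by your system):

```shell
# List the GPU UUIDs on the host.
nvidia-smi -L

# Select a GPU by UUID instead of by index (placeholder UUID).
docker run --runtime=nvidia \
    -e NVIDIA_VISIBLE_DEVICES=GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx \
    nvidia/cuda:10.0-base nvidia-smi
```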
nvidia-smi and NVML are not compatible with PID namespaces.
We recommend monitoring your processes on the host or inside a container using --pid=host.
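A minimal sketch of the in-container monitoring approach, sharing the host PID namespace so nvidia-smi can resolve process information:

```shell
# Run a monitoring container in the host PID namespace.
docker run --runtime=nvidia --pid=host nvidia/cuda:10.0-base nvidia-smi
```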
Yes. This is no different than sharing a GPU between multiple processes outside of containers.
Scheduling and compute preemption vary from one GPU architecture to another (e.g. CTA-level, instruction-level).
No. Your only option is to set the GPU clocks at a lower frequency before starting the container.
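A sketch of lowering the clocks from the host before starting the container (the clock pair below is an example; pick values from the supported list reported for your GPU, and note this requires root):

```shell
# Query the clock frequencies your GPU supports.
nvidia-smi -q -d SUPPORTED_CLOCKS

# Set application clocks as "memory,graphics" in MHz (example values).
sudo nvidia-smi -ac 2505,562
```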
This is not currently supported but you can enforce it:
- At the container orchestration layer (Kubernetes, Swarm, Mesos, Slurm…) since this is tied to resource allocation.
- At the driver level by setting the compute mode of the GPU.
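At the driver level, a sketch of setting the compute mode so only one process at a time can use a given GPU (requires root):

```shell
# Restrict GPU 0 to a single CUDA process at a time.
sudo nvidia-smi -i 0 -c EXCLUSIVE_PROCESS

# Revert to the default (shared) compute mode.
sudo nvidia-smi -i 0 -c DEFAULT
```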
You probably need to enable persistence mode to keep the kernel modules loaded and the GPUs initialized.
The recommended way is to start the nvidia-persistenced daemon on your host.
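A sketch, assuming a systemd-based host (the unit name may vary by distribution and driver packaging):

```shell
# Preferred: run the persistence daemon on the host.
sudo systemctl enable --now nvidia-persistenced

# Legacy alternative: enable persistence mode directly via nvidia-smi.
sudo nvidia-smi -pm 1
```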
If you are running a Docker client inside a container: simply mount the Docker socket and proceed as usual.
If you are running a Docker daemon inside a container: this case is untested.
Your application was probably not compiled for the compute architecture of your GPU and thus the driver must JIT all the CUDA kernels from PTX. In addition to a slow start, the JIT compiler might generate less efficient code than directly targeting your compute architecture (see also performance impact).
No. You would have to handle this manually with Docker volumes.
Your application was not compiled for the compute architecture of your GPU, and no PTX was generated during build time. Thus, JIT compiling is impossible (see also slow to initialize).
Some device management operations require extra privileges (e.g. setting clock frequencies).
After learning about the security implications of doing so, you can add extra capabilities to your container using --cap-add on the command-line (--cap-add=SYS_ADMIN will allow most operations).
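For instance, a sketch of enabling persistence mode from inside a container granted SYS_ADMIN (be sure you understand the security implications before doing this):

```shell
# Grant broad device-management privileges to the container.
docker run --runtime=nvidia --cap-add=SYS_ADMIN \
    nvidia/cuda:10.0-base nvidia-smi -pm 1
```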
Yes, but as stated above you might need extra privileges: extra capabilities such as CAP_SYS_PTRACE, or tweaking the seccomp profile used by Docker to allow certain syscalls.
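A sketch of a container suitable for ptrace-based debuggers (e.g. gdb or cuda-gdb); disabling the seccomp profile broadens the attack surface, so only do this for debugging sessions:

```shell
# Allow ptrace and unfiltered syscalls inside the container.
docker run --runtime=nvidia --cap-add=SYS_PTRACE \
    --security-opt seccomp=unconfined \
    -it nvidia/cuda:10.0-devel bash
```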
Yes, we now provide images on DockerHub.
No, Vulkan is not supported at the moment. However we plan on supporting this feature in the future.
Library dependencies vary from one application to another. In order to make things easier for developers, we provide a set of official images to base your images on.
Yes, container images are available on Docker Hub and on the NGC registry.
Yes, as long as you configure your Docker daemon to use the nvidia runtime as the default, you will be able to have build-time GPU support. However, be aware that this can render your images non-portable (see also invalid device function).
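A sketch of making nvidia the default runtime (this overwrites /etc/docker/daemon.json; merge by hand if you already have other settings there, and note the path assumes the default package install location):

```shell
# Make the nvidia runtime the default so `docker build` steps see the GPU.
sudo tee /etc/docker/daemon.json <<'EOF'
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
EOF
sudo systemctl restart docker
```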
Yes, for most cases. The main difference is that we don't mount all driver libraries by default in 2.0. You might need to set the NVIDIA_DRIVER_CAPABILITIES environment variable in your Dockerfile or when starting the container. Check the documentation of nvidia-container-runtime.
Use the library stubs provided in /usr/local/cuda/lib64/stubs/. Our official images already take care of setting LIBRARY_PATH.
However, do not set LD_LIBRARY_PATH to this folder; the stubs must not be used at runtime.
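As an illustration, linking a hypothetical app.c against the NVML stub at build time (the real library is injected by the runtime when the container starts):

```shell
# Link against the NVML stub; app.c is a placeholder source file.
gcc -o app app.c -I/usr/local/cuda/include \
    -L/usr/local/cuda/lib64/stubs -lnvidia-ml
# Do NOT add /usr/local/cuda/lib64/stubs to LD_LIBRARY_PATH at runtime.
```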
The devel image tags are large since the CUDA toolkit ships with many libraries, a compiler and various command-line tools.
As a general rule of thumb, you shouldn't ship your application with its build-time dependencies. We recommend using multi-stage builds for this purpose. Your final container image should use our runtime or base images.
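A minimal multi-stage sketch: compile in the devel image, ship only the binary in the base image (app.cu and the myapp tag are placeholders):

```shell
# Write a two-stage Dockerfile: build in devel, run in base.
cat > Dockerfile <<'EOF'
FROM nvidia/cuda:10.0-devel AS build
COPY app.cu /src/app.cu
RUN nvcc -o /src/app /src/app.cu

FROM nvidia/cuda:10.0-base
COPY --from=build /src/app /usr/local/bin/app
CMD ["app"]
EOF
docker build -t myapp .
```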
As of CUDA 9.0 we now ship a base image tag which bundles the strict minimum of dependencies.
Starting from CUDA 10.0, the CUDA images require using nvidia-docker v2 and won't trigger the GPU enablement path from nvidia-docker v1.
Not currently, support for Swarmkit is still being worked on in the upstream Moby project. You can track our progress here.
Yes, use Compose format 2.3 and add runtime: nvidia to your GPU service. Docker Compose must be version 1.19.0 or higher. You can find an example here.
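A sketch of a Compose file using format 2.3 with the nvidia runtime (the service name and command are illustrative):

```shell
cat > docker-compose.yml <<'EOF'
version: "2.3"
services:
  gpu-test:
    image: nvidia/cuda:10.0-base
    runtime: nvidia
    command: nvidia-smi
EOF
docker-compose up
```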
Since Kubernetes 1.8, the recommended way is to use our official device plugin.