Deploy your containerized AI applications with nvidia-docker

More and more products and services are taking advantage of the modeling and prediction capabilities of AI. This article presents the nvidia-docker tool for integrating AI (Artificial Intelligence) software bricks into a microservice architecture. The main advantage explored here is the use of the host system’s GPU (Graphics Processing Unit) resources to accelerate multiple containerized AI applications.

To understand the usefulness of nvidia-docker, we will start by describing what kind of AI can benefit from GPU acceleration. Secondly, we will present how to set up the nvidia-docker tool. Finally, we will describe what tools are available to use GPU acceleration in your applications and how to use them.

Why use GPUs in AI applications?

In the field of artificial intelligence, two main subfields are used: machine learning and deep learning. The latter is part of a larger family of machine learning methods based on artificial neural networks.

In the context of deep learning, where operations are essentially matrix multiplications, GPUs are more efficient than CPUs (Central Processing Units). This is why the use of GPUs has grown in recent years. Indeed, GPUs are considered the heart of deep learning because of their massively parallel architecture.
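To make the link between matrix multiplication and parallelism concrete, here is a minimal sketch in plain Python. The function name `matmul` and the sample values are illustrative assumptions, not part of any library. It computes one dense layer (inputs times weights) and shows that each output cell is independent of the others, which is the property a massively parallel processor can exploit.

```python
# Illustrative only: a tiny dense layer written as a plain matrix multiplication.
def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m), both given as lists of rows."""
    n, k, m = len(a), len(b), len(b[0])
    assert all(len(row) == k for row in a), "inner dimensions must match"
    # Each output cell (i, j) depends only on row i of a and column j of b,
    # so all n * m cells could be computed at the same time on parallel hardware.
    return [
        [sum(a[i][p] * b[p][j] for p in range(k)) for j in range(m)]
        for i in range(n)
    ]

# Hypothetical batch of 2 inputs with 3 features each.
inputs = [[1.0, 2.0, 3.0],
          [4.0, 5.0, 6.0]]
# Hypothetical weights of a layer mapping 3 features to 2 outputs.
weights = [[1.0, 0.0],
           [0.0, 1.0],
           [1.0, 1.0]]

outputs = matmul(inputs, weights)
print(outputs)  # [[4.0, 5.0], [10.0, 11.0]]
```

On a CPU the cells are computed one after another by the nested loops; on a GPU, a library would typically assign them to many threads at once.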

However, GPUs cannot execute just any program. Indeed, they use a specific language (CUDA for NVIDIA) to take advantage of their architecture. So, how do you use and communicate with GPUs from your applications?

The NVIDIA CUDA technology

NVIDIA CUDA (Compute Unified Device Architecture) is a parallel computing architecture combined with an API for programming GPUs. CUDA translates application code into an instruction set that GPUs can execute.

A CUDA SDK and libraries such as cuBLAS (Basic Linear Algebra Subroutines) and cuDNN (Deep Neural Network) have been developed to communicate easily and efficiently with a GPU. CUDA is available in C, C++ and Fortran. There are wrappers for other languages including Java, Python and R. For example, deep learning libraries like TensorFlow and Keras are based on these technologies.

Why use nvidia-docker?

Nvidia-docker addresses the needs of developers who want to add AI functionality to their applications, containerize them and deploy them on servers powered by NVIDIA GPUs.

The objective is to set up an architecture that allows the development and deployment of deep learning models in services available via an API. Thus, the utilization rate of GPU resources is optimized by making them available to multiple application instances.

In addition, we benefit from the advantages of containerized environments:

  • Isolation of instances of each AI model.
  • Colocation of several models with their specific dependencies.
  • Colocation of the same model under several versions.
  • Consistent deployment of models.
  • Model performance monitoring.

Natively, using a GPU in a container requires installing CUDA in the container and giving privileges to access the device. With this in mind, the nvidia-docker tool has been developed, allowing NVIDIA GPU devices to be exposed in containers in an isolated and secure manner.

At the time of writing this article, the latest version of nvidia-docker is v2. This version differs greatly from v1 in the following ways:

  • Version 1: Nvidia-docker is implemented as an overlay to Docker. That is, to create the container you had to use nvidia-docker (Ex: nvidia-docker run ...), which performed the actions (among others, the creation of volumes) that make the GPU devices visible in the container.
  • Version 2: The deployment is simplified by replacing Docker volumes with Docker runtimes. Indeed, to launch a container, it is now necessary to use the NVIDIA runtime via Docker (Ex: docker run --runtime nvidia ...)
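As an illustration of the v2 invocation, the sketch below assembles the `docker run --runtime nvidia ...` command line quoted above from Python. The helper name `nvidia_docker_v2_command`, the image name `my-ai-service:latest` and the `serve.py` entry point are hypothetical placeholders chosen for the example; only the `--runtime nvidia` form comes from the text.

```python
import shlex

def nvidia_docker_v2_command(image, container_args=()):
    """Build the `docker run` argument list that selects the NVIDIA runtime."""
    # `--runtime nvidia` is the v2 form: the GPU is exposed by the runtime,
    # not by a wrapper command as in v1 (`nvidia-docker run ...`).
    return ["docker", "run", "--runtime", "nvidia", image, *container_args]

# "my-ai-service:latest" and "serve.py" are placeholders, not real artifacts.
command = nvidia_docker_v2_command("my-ai-service:latest", ["python", "serve.py"])
print(shlex.join(command))
```

On a host where Docker and the NVIDIA runtime are installed, the resulting list could be passed to `subprocess.run(command)`; here it is only printed, so the sketch runs anywhere.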

Note that due to their different architectures, the two versions are not compatible. An application written for v1 must be rewritten for v2.

Setting up nvidia-docker

The required elements to use nvidia-docker are:

  • A container runtime.
  • An available GPU.
  • The NVIDIA Container Toolkit (main part of nvidia-docker).



A container runtime is required to run the NVIDIA Container Toolkit. Docker is the recommended runtime, but Podman and containerd are also supported.

The official documentation gives the installation procedure for Docker.


Drivers are required to use a GPU device. In the case of NVIDIA GPUs, the drivers corresponding to a given OS can be obtained from the NVIDIA driver download page, by filling in the information on the GPU model.

The installation of the drivers is done via the downloaded executable. For Linux, use the following commands, replacing the placeholder with the name of the downloaded file:

chmod +x <downloaded-driver-file>
./<downloaded-driver-file>

Reboot the host machine at the end of the installation so that the installed drivers are taken into account.
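After the reboot, one way to check that the driver sees the GPU is to call `nvidia-smi`, the utility shipped with the NVIDIA driver. The following is a minimal sketch, assuming `nvidia-smi` is on the PATH once the driver is installed; the helper name `list_host_gpus` is made up for this example.

```python
import shutil
import subprocess

def list_host_gpus():
    """Return the lines printed by `nvidia-smi -L`, or [] if no driver/GPU is found."""
    if shutil.which("nvidia-smi") is None:
        return []  # driver utilities not installed or not on PATH
    result = subprocess.run(
        ["nvidia-smi", "-L"], capture_output=True, text=True, check=False
    )
    if result.returncode != 0:
        return []  # utility present but unable to talk to a GPU
    # `nvidia-smi -L` prints one line per detected GPU.
    return [line for line in result.stdout.splitlines() if line.strip()]

gpus = list_host_gpus()
print(f"{len(gpus)} GPU(s) visible on this host")
for gpu in gpus:
    print(gpu)
```

An empty list suggests the driver installation should be reviewed before moving on to the container toolkit.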

Installing nvidia-docker

Nvidia-docker is available on the GitHub project page. To install it, follow the installation manual according to your server and architecture specifics.

We now have an infrastructure that provides isolated environments with access to GPU resources. To use GPU acceleration in applications, several tools have been developed by NVIDIA (non-exhaustive list):

  • CUDA Toolkit: a set of tools for developing software/programs that can perform computations using the CPU, RAM, and GPU. It can be used on x86, Arm and POWER platforms.
  • NVIDIA cuDNN: a library of primitives to accelerate deep learning networks and optimize GPU performance for major frameworks such as TensorFlow and Keras.
  • NVIDIA cuBLAS: a library of GPU-accelerated linear algebra subroutines.

By using these tools in application code, AI and linear algebra tasks are accelerated. With the GPUs now visible, the application is able to send the data and operations to be processed on the GPU.

The CUDA Toolkit is the lowest-level option. It offers the most control (memory and instructions) to build custom applications. Libraries provide an abstraction of CUDA functionality. They allow you to focus on application development rather than the CUDA implementation.
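From inside a container, a framework built on these libraries can report whether it sees the GPU. The sketch below assumes TensorFlow 2.x may be installed in the container image and falls back to an empty list when it is not; the helper name `visible_gpus` is an assumption made for this example.

```python
def visible_gpus():
    """Return the names of the GPUs TensorFlow can see, or [] if none."""
    try:
        import tensorflow as tf  # optional dependency, assumed to be TF 2.x
    except ImportError:
        return []  # TensorFlow is not installed in this environment
    # TensorFlow relies on CUDA and cuDNN under the hood to find the devices.
    return [device.name for device in tf.config.list_physical_devices("GPU")]

devices = visible_gpus()
if devices:
    print("GPU acceleration available on:", ", ".join(devices))
else:
    print("No GPU visible; computations would run on the CPU")
```

Running this in a container started with the NVIDIA runtime and again in one started without it is a simple way to see the difference the runtime makes.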

Once all these elements are in place, the architecture using the nvidia-docker service is ready to use.

Here is a diagram to summarize everything we have seen:



We have set up an architecture allowing the use of GPU resources from our applications in isolated environments. To summarize, the architecture is composed of the following bricks:

  • Operating system: Linux, Windows …
  • Docker: isolation of the environment using Linux containers
  • NVIDIA driver: installation of the driver for the hardware in question
  • NVIDIA container runtime: orchestration of the previous three
  • Applications on Docker container:
    • CUDA
    • cuDNN
    • cuBLAS
    • Tensorflow/Keras

NVIDIA continues to develop tools and libraries around AI technologies, with the goal of establishing itself as a leader. Other technologies may complement nvidia-docker or may be more suitable than nvidia-docker depending on the use case.