Optimized primitives for collective multi-GPU communication.
Introduction
NCCL (pronounced "Nickel") is a stand-alone library of standard
collective communication routines for GPUs, implementing all-reduce,
all-gather, reduce, broadcast, and reduce-scatter. It has been optimized
to achieve high bandwidth on platforms using PCIe, NVLink, and NVSwitch,
as well as over networks using InfiniBand Verbs or TCP/IP sockets. NCCL
supports an arbitrary number of GPUs installed in a single node or
across multiple nodes, and can be used in either single- or
multi-process (e.g., MPI) applications.
For more information on NCCL usage, please refer to the NCCL documentation.
What's inside
At present, the library implements the following collective operations:
all-reduce
all-gather
reduce-scatter
reduce
broadcast
These operations are implemented using ring algorithms and have been
optimized for throughput and latency. For best performance, small
operations can be either batched into larger operations or aggregated
through the API.
Requirements
NCCL requires at least CUDA 7.0 and Kepler or newer GPUs. For PCIe-based
platforms, best performance is achieved when all GPUs are located
on a common PCIe root complex, but multi-socket configurations are also
supported.
Build
Note: the official and tested builds of NCCL can be downloaded from: https://developer.nvidia.com/nccl. You can skip the following build steps if you choose to use the official builds.
To build the library:
$ cd nccl
$ make -j src.build
If CUDA is not installed in the default /usr/local/cuda path, you can define the CUDA path with:
$ make src.build CUDA_HOME=<path to cuda install>
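When the CUDA location varies between machines, it can help to resolve it before calling make. The following is a minimal sketch, not part of NCCL: `find_cuda_home` is a hypothetical helper, and `/usr/local/cuda-10.0` is only an example of a versioned install path. It prints the make command rather than running it.

```shell
# Hypothetical helper (not part of NCCL): print the first directory among the
# arguments that contains an executable bin/nvcc; return non-zero if none does.
find_cuda_home() {
  for dir in "$@"; do
    if [ -x "$dir/bin/nvcc" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
  done
  return 1
}

# Try the default path first, then a versioned install (example path only).
if cuda_home=$(find_cuda_home /usr/local/cuda /usr/local/cuda-10.0); then
  echo "make -j src.build CUDA_HOME=$cuda_home"
else
  echo "nvcc not found; pass CUDA_HOME=<path to cuda install> explicitly" >&2
fi
```

The candidate list is an assumption about where CUDA might live; adjust it to match the target system.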
NCCL will be compiled and installed in build/ unless BUILDDIR is set.
By default, NCCL is compiled for all supported architectures. To
accelerate the compilation and reduce the binary size, consider
redefining NVCC_GENCODE (defined in makefiles/common.mk) to only include the architecture of the target platform:
$ make -j src.build NVCC_GENCODE="-gencode=arch=compute_70,code=sm_70"
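To target more than one architecture, the same flag pattern can be repeated inside NVCC_GENCODE. The sketch below assumes compute capabilities are written without the dot (70 for 7.0); `nccl_gencode` is a hypothetical helper name, not something NCCL provides. It prints the resulting make command instead of executing it.

```shell
# Hypothetical helper (not part of NCCL): build an NVCC_GENCODE value from
# one or more compute capabilities given without the dot, e.g. 70 for 7.0.
nccl_gencode() {
  flags=""
  for cc in "$@"; do
    # Append one -gencode flag per capability, separated by single spaces.
    flags="${flags:+$flags }-gencode=arch=compute_${cc},code=sm_${cc}"
  done
  printf '%s\n' "$flags"
}

# Example: print the make invocation for compute capabilities 6.0 and 7.0.
echo "make -j src.build NVCC_GENCODE=\"$(nccl_gencode 60 70)\""
```

Each listed capability must be supported by the installed CUDA toolkit; otherwise nvcc will reject the flag.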
Install
To install NCCL on the system, create a package then install it as root.
Debian/Ubuntu:
$ # Install tools to create debian packages
$ sudo apt install build-essential devscripts debhelper fakeroot
$ # Build NCCL deb package
$ make pkg.debian.build
$ ls build/pkg/deb/
RedHat/CentOS:
$ # Install tools to create rpm packages
$ sudo yum install rpm-build rpmdevtools
$ # Build NCCL rpm package
$ make pkg.redhat.build
$ ls build/pkg/rpm/