Technology
cuDNN
NVIDIA's GPU-accelerated library provides optimized primitives for deep learning operations like convolution, pooling, and normalization.
cuDNN (CUDA Deep Neural Network library) provides the high-performance primitives required by frameworks like PyTorch, TensorFlow, and JAX. It supplies highly tuned implementations of routines such as 2D/3D convolutions, batch normalization, and activation layers, selecting optimized kernels automatically. By utilizing NVIDIA Tensor Cores, the library maximizes throughput on GPU architectures from Ampere through Blackwell. This integration lets researchers focus on model design while cuDNN handles low-level hardware tuning and memory layout (NCHW vs. NHWC) for peak efficiency.
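The NCHW vs. NHWC distinction mentioned above is ultimately a question of strides: which dimension varies fastest in memory. As a minimal illustration (plain Python, not the cuDNN API), the sketch below computes the flat memory offset of a tensor element under each layout, the same bookkeeping that cuDNN tensor descriptors encode:

```python
# Illustrative sketch (not the cuDNN API): how NCHW vs. NHWC layouts map
# a 4-D tensor element (n, c, h, w) to a flat memory offset via strides.

def strides_nchw(n, c, h, w):
    # Row-major strides when dims are stored in (N, C, H, W) order.
    return (c * h * w, h * w, w, 1)

def strides_nhwc(n, c, h, w):
    # Row-major strides when dims are stored in (N, H, W, C) order:
    # channels are innermost, a layout Tensor Cores often prefer.
    return (h * w * c, 1, w * c, c)

def offset(idx, strides):
    # Flat offset of element idx = (n, c, h, w) under the given strides.
    return sum(i * s for i, s in zip(idx, strides))

# A 1x3x4x4 tensor: where does element (n=0, c=1, h=2, w=3) live?
dims = (1, 3, 4, 4)
print(offset((0, 1, 2, 3), strides_nchw(*dims)))  # -> 27
print(offset((0, 1, 2, 3), strides_nhwc(*dims)))  # -> 34
```

The same logical element lands at different offsets under each layout, which is why frameworks must track (or convert) the layout cuDNN expects for a given kernel.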