Mamba-2
Mamba-2 evolves the State Space Model architecture with State Space Duality (SSD), training 2-8x faster than its predecessor while matching or outperforming Transformers on long-sequence tasks.
Mamba-2 introduces State Space Duality (SSD), a theoretical framework that connects structured state space models to the attention mechanism. By reformulating the core computation as multiplication by a block-decomposed semiseparable matrix, the architecture can exploit NVIDIA Tensor Cores, reaching roughly 50% higher throughput than standard Mamba. This iteration retains linear O(N) scaling for contexts of 1M+ tokens while matching or exceeding Llama-3 on language-modeling benchmarks, making it a drop-in efficiency upgrade for researchers pushing the limits of long-form synthesis and high-speed inference.
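The duality can be illustrated on a minimal scalar SSM. The sketch below (an illustrative toy, not Mamba-2's actual kernel) computes the same output two ways: as a sequential O(N) recurrence, and as multiplication by a lower-triangular semiseparable matrix, which is the attention-like "dual" view that SSD exploits for Tensor Core friendly blockwise computation.

```python
import numpy as np

def ssm_scan(a, b, c, x):
    # Recurrent view: h_t = a_t * h_{t-1} + b_t * x_t,  y_t = c_t * h_t
    # One pass over the sequence: linear time, constant state.
    h, ys = 0.0, []
    for t in range(len(x)):
        h = a[t] * h + b[t] * x[t]
        ys.append(c[t] * h)
    return np.array(ys)

def ssm_matrix(a, b, c, x):
    # Dual view: y = M @ x, with M lower-triangular semiseparable:
    # M[i, j] = c_i * (a_{j+1} * ... * a_i) * b_j  for j <= i.
    # Materializing M is quadratic; SSD instead computes it in blocks.
    n = len(x)
    M = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1):
            M[i, j] = c[i] * np.prod(a[j + 1 : i + 1]) * b[j]
    return M @ x

rng = np.random.default_rng(0)
n = 16
a = rng.uniform(0.5, 1.0, n)
b, c, x = (rng.standard_normal(n) for _ in range(3))
# Both views produce identical outputs.
assert np.allclose(ssm_scan(a, b, c, x), ssm_matrix(a, b, c, x))
```

The scan form is what makes inference cheap; the matrix form is what maps onto dense matrix-multiply hardware during training.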