Mamba-2 Projects


Mamba-2

Mamba-2 evolves the state space model (SSM) architecture with SSD, achieving 2-8× faster training than its predecessor while outperforming Transformers on long-sequence tasks.

Mamba-2 introduces State Space Duality (SSD), a theoretical framework that bridges structured state space models and the attention mechanism. By reformulating the core SSM computation as multiplication by a semiseparable matrix, computed block by block, the architecture can exploit NVIDIA Tensor Cores to reach roughly 50% higher throughput than standard Mamba. This iteration maintains linear O(N) scaling to contexts of 1M+ tokens while matching or exceeding Llama-3 performance benchmarks in language modeling. It is a drop-in efficiency upgrade for researchers pushing the limits of long-form synthesis and high-speed inference.
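The duality above can be illustrated in a few lines. The sketch below (a toy NumPy illustration, not the actual `state-spaces/mamba` implementation) shows the scalar case: a linear SSM recurrence produces exactly the same outputs as multiplying the input sequence by a lower-triangular 1-semiseparable matrix, which is the structure SSD computes blockwise on Tensor Cores.

```python
import numpy as np

def ssm_recurrence(a, x):
    """Sequential scan: h_t = a_t * h_{t-1} + x_t (scalar state, toy example)."""
    h, ys = 0.0, []
    for a_t, x_t in zip(a, x):
        h = a_t * h + x_t
        ys.append(h)
    return np.array(ys)

def ssm_semiseparable(a, x):
    """Same outputs via a lower-triangular semiseparable matrix L,
    where L[t, s] = a_{s+1} * a_{s+2} * ... * a_t (empty product = 1)."""
    T = len(a)
    L = np.zeros((T, T))
    for t in range(T):
        for s in range(t + 1):
            L[t, s] = np.prod(a[s + 1 : t + 1])
    return L @ x

rng = np.random.default_rng(0)
a = rng.uniform(0.5, 1.0, size=8)  # per-step decay factors
x = rng.normal(size=8)             # input sequence
assert np.allclose(ssm_recurrence(a, x), ssm_semiseparable(a, x))
```

The naive matrix form here is O(N²); SSD's contribution is evaluating this same semiseparable structure in blocks so that most of the work becomes dense matrix multiplication, which is what yields the Tensor Core speedups while preserving the O(N) recurrent view for inference.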

https://github.com/state-spaces/mamba