Technology
Mamba-SSM
Mamba is a linear-time sequence model that achieves Transformer-level performance through selective state space modeling.
Mamba-SSM replaces the quadratic attention bottleneck of standard Transformers with a hardware-aware selective state space model (S6). By making the SSM parameters input-dependent, it can selectively propagate or forget information along the sequence, filtering out irrelevant inputs while maintaining a compressed recurrent state. The architecture delivers up to 5x higher inference throughput than a Transformer of comparable size and scales linearly with sequence length (O(N)), handling million-token contexts on standard A100 GPUs without the quadratic memory growth of traditional attention.
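To make the recurrence concrete, here is a minimal reference sketch of a selective scan in PyTorch. It is an illustrative, unfused loop rather than the library's optimized CUDA kernel, and the names and shapes used (selective_scan, d_model, d_state, delta) are assumptions chosen for exposition.

```python
# Illustrative selective-scan sketch (assumed names/shapes; not the fused kernel).
import torch

def selective_scan(x, A, B, C, delta):
    """Sequential reference scan for a selective SSM (S6-style recurrence).

    x:     (batch, seq_len, d_model)   input sequence
    A:     (d_model, d_state)          state transition (input-independent)
    B:     (batch, seq_len, d_state)   input-dependent input projection
    C:     (batch, seq_len, d_state)   input-dependent output projection
    delta: (batch, seq_len, d_model)   input-dependent step size
    """
    batch, seq_len, d_model = x.shape
    d_state = A.shape[1]
    h = torch.zeros(batch, d_model, d_state, device=x.device)
    ys = []
    for t in range(seq_len):
        # Zero-order-hold discretization: the discretized parameters vary
        # per token, which is what lets the model "select" what to remember.
        A_bar = torch.exp(delta[:, t, :, None] * A)       # (batch, d_model, d_state)
        B_bar = delta[:, t, :, None] * B[:, t, None, :]   # (batch, d_model, d_state)
        h = A_bar * h + B_bar * x[:, t, :, None]          # recurrent state update
        ys.append((h * C[:, t, None, :]).sum(-1))         # project state to output
    return torch.stack(ys, dim=1)                         # (batch, seq_len, d_model)
```

Because the state h has a fixed size regardless of sequence length, each step costs O(d_model * d_state) time and memory, which is where the linear scaling in N comes from.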