Technology
Whisper Small — On-device speech-to-text transcription
Whisper Small, with 244 million parameters, strikes a practical balance between speed and accuracy for local speech-to-text tasks.
This Transformer-based encoder-decoder model handles multilingual transcription and translation directly on local hardware. Trained on 680,000 hours of audio, it supports 99 languages and runs in roughly 2 GB of VRAM. It is about 4x faster than the Large model, making it an ideal choice for low-latency applications on edge devices. Use it to maintain data privacy and cut latency without sacrificing the robustness needed for noisy real-world audio.
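As an illustration of local use, here is a minimal Python sketch using the open-source openai-whisper package; the file name meeting.wav is a hypothetical stand-in for your own audio:

import whisper

# Load the 244M-parameter Small checkpoint (needs roughly 2 GB of VRAM on GPU;
# whisper.load_model picks CUDA when available, otherwise CPU).
model = whisper.load_model("small")

# Transcribe in the audio's own language; nothing leaves the machine.
result = model.transcribe("meeting.wav")
print(result["text"])

# Or translate non-English speech directly into English text.
translated = model.transcribe("meeting.wav", task="translate")
print(translated["text"])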