Whisper Small
OpenAI's 244M parameter transformer model engineered for high-speed, multilingual speech-to-text transcription.
Whisper Small occupies a practical middle ground: 244M parameters of multilingual speech-to-text capability. It processes audio in fixed 30-second chunks through an encoder-decoder transformer architecture and supports 99 languages out of the box. While the Large-v3 model (1.55B parameters) offers higher accuracy, the Small variant maintains a competitive 10.6 Word Error Rate (WER) on LibriSpeech clean data. That balance makes it a strong choice for developers who need low-latency transcription on standard hardware (CPUs or mid-range GPUs).
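As a minimal sketch of how this looks in practice, the snippet below loads the Small checkpoint via the open-source `openai-whisper` package and transcribes a file. The audio path `audio.wav` is a hypothetical placeholder, and the example assumes `pip install openai-whisper` has been run; the helper also shows how the fixed 30-second window translates into raw samples at Whisper's 16 kHz input rate.

```python
# Sketch: transcribing audio with Whisper Small via the openai-whisper package.
# Assumes `pip install openai-whisper`; "audio.wav" is a hypothetical path.

SAMPLE_RATE = 16_000   # Whisper resamples all input to 16 kHz mono
CHUNK_SECONDS = 30     # the model consumes fixed 30-second windows

def chunk_samples(seconds: int = CHUNK_SECONDS, rate: int = SAMPLE_RATE) -> int:
    """Number of raw audio samples in one model input window."""
    return seconds * rate

def transcribe(path: str) -> str:
    # Imported lazily so the helper above works without the package installed.
    import whisper
    model = whisper.load_model("small")   # 244M-parameter checkpoint
    result = model.transcribe(path)       # language is auto-detected by default
    return result["text"]

print(chunk_samples())  # 480000 samples per 30-second window
```

Longer recordings are handled by the library itself: `transcribe` internally slides the 30-second window across the input, so callers pass whole files rather than pre-chunked audio.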