Whisper Turbo
Whisper Turbo is the high-speed, pruned variant of OpenAI's Whisper ASR model, engineered for up to 8x faster transcription inference with minimal accuracy impact.
This is Whisper Large-v3-Turbo: a pruned and fine-tuned, high-efficiency model for Automatic Speech Recognition (ASR). OpenAI pruned the original Whisper Large-v3 architecture, reducing the decoder from 32 layers to just 4. This optimization delivers up to 8x faster inference, making the model well suited to production environments that demand rapid, high-volume transcription. It maintains strong multilingual ASR performance, trading a small amount of accuracy for a significant gain in operational speed and on-device deployment capability (e.g., the weights require only about 1.6 GB in float16 precision).
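The 1.6 GB float16 figure can be sanity-checked with simple arithmetic: weight storage is roughly the parameter count multiplied by the bytes per parameter. The sketch below is illustrative only. The parameter count (about 809 million) is an assumption taken from the publicly listed model size rather than from this page, and the names `TURBO_PARAMS`, `BYTES_PER_PARAM`, and `weight_size_gb` are hypothetical helpers.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# ASSUMPTION: ~809 million parameters (publicly listed size, not stated on this page).
TURBO_PARAMS = 809_000_000

# Storage cost of one parameter in common precisions.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_size_gb(num_params: int, dtype: str) -> float:
    """Return approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: {weight_size_gb(TURBO_PARAMS, dtype):.2f} GB")
```

Under that assumption, float16 works out to roughly 1.6 GB, which matches the figure quoted above. The estimate covers the weights only; activations, the decoding cache, and framework overhead add to the real footprint at inference time.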