Technology
Google Chirp 3
Google Chirp 3 is a high-definition generative speech model designed for lifelike, emotionally resonant text-to-speech and multilingual transcription.
Chirp 3 is the latest evolution of Google's Universal Speech Model (USM) architecture, delivering human-like intonation and emotional nuance across 31 languages and 248 distinct voices. The model works in both directions: it transcribes complex, multilingual audio with high accuracy, drawing on self-supervised learning from millions of hours of data, and it synthesizes speech with natural disfluencies and realistic cadence. Developers can access Chirp 3 through the Vertex AI platform for low-latency streaming, speaker diarization, and instant voice cloning from as little as 10 seconds of reference audio.
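As a rough illustration of the text-to-speech direction, the sketch below only builds a JSON request body and sends nothing. It assumes the request shape of Google's Cloud Text-to-Speech REST API (`text:synthesize`) rather than a Vertex AI call, and the endpoint URL, the voice name `en-US-Chirp3-HD-Charon`, and the helper `build_synthesize_request` are assumptions for illustration, none of which come from the description above. Check the current API reference before relying on any of these names.

```python
import json

# Assumed endpoint of the Cloud Text-to-Speech REST API (not named in the text above).
SYNTHESIZE_URL = "https://texttospeech.googleapis.com/v1/text:synthesize"


def build_synthesize_request(text, language_code="en-US",
                             voice_name="en-US-Chirp3-HD-Charon"):
    """Build a JSON-serializable body for a speech synthesis request.

    The default voice name is an assumed example of a Chirp 3 HD voice;
    this helper is hypothetical and performs no network call.
    """
    if not text.strip():
        raise ValueError("text must not be empty")
    return {
        "input": {"text": text},
        "voice": {"languageCode": language_code, "name": voice_name},
        # LINEAR16 requests uncompressed 16-bit PCM audio in the response.
        "audioConfig": {"audioEncoding": "LINEAR16"},
    }


if __name__ == "__main__":
    body = build_synthesize_request("Hello from Chirp 3.")
    # An authenticated client would POST this body to SYNTHESIZE_URL.
    print(json.dumps(body, indent=2))
```

Keeping the payload construction separate from the HTTP call makes it easy to inspect or unit-test the request without credentials.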