YAMNet
A deep neural network that classifies 521 audio events using the AudioSet ontology and a MobileNetV1 architecture.
Google Research built YAMNet for efficient, real-time sound recognition. It maps raw audio waveforms to 521 classes (ranging from power tools to bird calls) drawn from the AudioSet ontology. The architecture uses MobileNetV1 depthwise separable convolutions to keep the model lightweight, at roughly 3.7 million parameters, so it can be deployed via TensorFlow Lite on mobile and edge devices. The model scores overlapping 0.96-second windows of audio with a 0.48-second hop, producing a prediction every 0.48 seconds, which is fine-grained enough to track rapidly changing acoustic scenes.
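The windowing arithmetic above determines how many predictions a clip yields. The sketch below illustrates it with assumed parameters (16 kHz mono input, 0.96 s windows, 0.48 s hop, which match YAMNet's published framing); the function name `num_predictions` is illustrative, not part of any YAMNet API.

```python
# Sketch of YAMNet-style framing, assuming 16 kHz mono input,
# 0.96 s analysis windows, and a 0.48 s hop (one prediction per hop).
SAMPLE_RATE = 16000   # Hz
WINDOW_S = 0.96       # analysis window length in seconds
HOP_S = 0.48          # hop between successive windows in seconds

def num_predictions(duration_s: float) -> int:
    """Return how many scoring frames a clip of the given length yields."""
    samples = int(duration_s * SAMPLE_RATE)
    win = int(WINDOW_S * SAMPLE_RATE)   # 15360 samples per window
    hop = int(HOP_S * SAMPLE_RATE)      # 7680 samples per hop
    if samples < win:
        return 0  # clip shorter than one window: no complete frame
    return 1 + (samples - win) // hop

# A 10-second clip yields 19 overlapping frames, hence 19 predictions.
print(num_predictions(10.0))
```

Because consecutive windows overlap by half their length, each instant of audio (after the first and last half-window) is scored twice, which smooths the per-class score track without doubling the model's compute per second of audio.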