language-identification - marc.schuetze.io

What it does

A real-time spoken language identification system developed at ZKM. It listens to audio input, classifies which language is being spoken, and sends the results over OSC so that interactive installations can react to them.

Key features

Real-time classification — continuous audio stream analysis with low latency
Multi-language support — trained on a wide range of languages
OSC output — sends classification results via Open Sound Control for easy integration with media frameworks
Configurable — adjustable confidence thresholds, buffer sizes, and output rates
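
As an illustration of how a confidence threshold and an output rate limit can work together, here is a minimal sketch. The class name OutputGate, its parameters, and the default values are assumptions made for this example and are not taken from the project's code.

```python
import time


class OutputGate:
    """Decide whether a classification result should be forwarded.

    Hypothetical helper: the name, parameters, and defaults are
    illustrative assumptions, not the project's actual configuration.
    """

    def __init__(self, min_confidence=0.8, min_interval_s=0.5, clock=time.monotonic):
        self.min_confidence = min_confidence  # drop results below this score
        self.min_interval_s = min_interval_s  # minimum spacing between outputs
        self._clock = clock                   # injectable so the gate is testable
        self._last_sent = None                # time of the last forwarded result

    def should_send(self, confidence):
        # Reject low-confidence results first; they do not reset the timer.
        if confidence < self.min_confidence:
            return False
        now = self._clock()
        # Enforce the output rate: skip results that arrive too soon.
        if self._last_sent is not None and now - self._last_sent < self.min_interval_s:
            return False
        self._last_sent = now
        return True
```

A caller would check gate.should_send(confidence) before emitting each result. Raising min_confidence trades responsiveness for fewer false triggers, and raising min_interval_s lowers the output rate.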

Tech stack

Written in Python, using PyTorch for the neural network model, sounddevice for audio capture, and python-osc for OSC output. Training scripts and pre-trained model weights are included.
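
The way these pieces could fit together is sketched below: sounddevice delivers audio blocks, a rolling window collects them, a classifier is called on the window, and python-osc sends the result. This is an assumption-laden outline and not the project's implementation. The sample rate, window and hop lengths, the OSC address /language, the port, and the classify callable (a stand-in for the PyTorch model that returns a label and a confidence) are all hypothetical.

```python
import collections
import queue

SAMPLE_RATE = 16000    # assumed sample rate in Hz
WINDOW_SECONDS = 2.0   # assumed analysis window length
HOP_SECONDS = 0.5      # assumed spacing between classifications


class AudioWindow:
    """Rolling buffer that keeps only the most recent samples."""

    def __init__(self, max_samples):
        self._buf = collections.deque(maxlen=max_samples)

    def extend(self, samples):
        self._buf.extend(samples)  # oldest samples fall off the front

    def is_full(self):
        return len(self._buf) == self._buf.maxlen

    def snapshot(self):
        return list(self._buf)


def run(classify, host="127.0.0.1", port=9000, address="/language"):
    """Capture audio, classify it, and send results over OSC.

    `classify` is a hypothetical callable taking a list of float samples
    and returning (label, confidence); it stands in for the model.
    """
    # Imported here so the buffer logic above works without audio hardware.
    import sounddevice as sd
    from pythonosc.udp_client import SimpleUDPClient

    client = SimpleUDPClient(host, port)
    window = AudioWindow(int(SAMPLE_RATE * WINDOW_SECONDS))
    hop = int(SAMPLE_RATE * HOP_SECONDS)
    blocks = queue.Queue()

    def on_audio(indata, frames, time_info, status):
        # Runs on the audio thread: copy channel 0 and hand it off quickly.
        blocks.put(indata[:, 0].copy())

    pending = 0  # samples received since the last classification
    with sd.InputStream(samplerate=SAMPLE_RATE, channels=1,
                        dtype="float32", callback=on_audio):
        while True:
            block = blocks.get()
            window.extend(block.tolist())
            pending += len(block)
            if window.is_full() and pending >= hop:
                pending = 0
                label, confidence = classify(window.snapshot())
                client.send_message(address, [label, float(confidence)])
```

Passing audio from the callback to the main loop through a queue keeps the callback short, which matters because slow callbacks cause dropped audio. Calling run(my_classifier) starts the loop, and any OSC receiver listening on the chosen port would then get one message per hop.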

Museum context

Originally developed for interactive exhibitions at ZKM, where the language a visitor speaks had to be detected in order to trigger content in that language.