ZKM Python 42

language-identification

Spoken language identification using deep learning — detect which language is being spoken in real time.

deep-learning · audio · nlp · osc · museum

What it does

A real-time spoken language identification system developed at ZKM. It listens to audio input, classifies which language is being spoken, and sends the results over OSC for integration with interactive installations.

Key features

  • Real-time classification — continuous audio stream analysis with low latency
  • Multi-language support — trained on a wide range of languages
  • OSC output — sends classification results via Open Sound Control for easy integration with media frameworks
  • Configurable — adjustable confidence thresholds, buffer sizes, and output rates
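As an illustration of how a confidence threshold and an output rate limit might interact with OSC output, here is a minimal sketch. It is not the project's actual code: the class `OutputGate`, the function `make_osc_sender`, the default values, the port, and the OSC address `/language` are all hypothetical. Only `SimpleUDPClient` and `send_message` come from python-osc.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class OutputGate:
    """Decides whether a classification result should be emitted (hypothetical helper)."""

    min_confidence: float = 0.7   # assumed default, not taken from the project
    min_interval_s: float = 0.5   # assumed default: at most two messages per second
    _last_sent: float = field(default=float("-inf"), init=False)

    def should_send(self, confidence: float, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        if confidence < self.min_confidence:
            return False              # too uncertain to report
        if now - self._last_sent < self.min_interval_s:
            return False              # rate limit: the previous message is too recent
        self._last_sent = now
        return True


def make_osc_sender(host: str = "127.0.0.1", port: int = 9000,
                    address: str = "/language") -> Callable[[str, float], None]:
    """Return a function that sends (label, confidence) as one OSC message."""
    # Imported lazily so the gate above can be used without python-osc installed.
    from pythonosc.udp_client import SimpleUDPClient

    client = SimpleUDPClient(host, port)

    def send(label: str, confidence: float) -> None:
        client.send_message(address, [label, float(confidence)])

    return send
```

A caller would check `gate.should_send(confidence)` for each new prediction and call the sender only when it returns `True`, so that a receiving installation sees only confident results at a bounded rate.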

Tech stack

Python, with PyTorch for the neural network model, sounddevice for audio capture, and python-osc for OSC output. The repository includes training scripts and pre-trained model weights.
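The following sketch shows one way these libraries could fit together: sounddevice captures microphone blocks, a PyTorch model scores them, and the best label is passed to a callback. It rests on several assumptions: the sample rate, the block length, the names `top_language` and `run`, and the model interface (a waveform tensor of shape `(1, n_samples)` in, logits of shape `(1, n_languages)` out) are illustrative and may not match the project's model.

```python
import queue
from typing import Callable, Sequence, Tuple

SAMPLE_RATE = 16_000   # assumed; use the rate the model was trained on
BLOCK_SECONDS = 1.0    # assumed analysis window length


def top_language(probs: Sequence[float], labels: Sequence[str]) -> Tuple[str, float]:
    """Return the most probable label together with its probability."""
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    best = max(range(len(probs)), key=lambda i: probs[i])
    return labels[best], float(probs[best])


def run(model, labels: Sequence[str], on_result: Callable[[str, float], None]) -> None:
    """Capture microphone audio and classify it block by block until interrupted."""
    # Imported here so top_language stays usable without the audio/ML stack.
    import sounddevice as sd
    import torch

    blocks: "queue.Queue" = queue.Queue()

    def callback(indata, frames, time_info, status):
        # Keep the audio callback cheap: copy the mono channel and hand it off.
        blocks.put(indata[:, 0].copy())

    model.eval()
    with sd.InputStream(samplerate=SAMPLE_RATE, channels=1, dtype="float32",
                        blocksize=int(SAMPLE_RATE * BLOCK_SECONDS),
                        callback=callback):
        while True:
            waveform = torch.from_numpy(blocks.get()).unsqueeze(0)  # (1, n_samples)
            with torch.no_grad():
                probs = torch.softmax(model(waveform), dim=-1)[0]
            on_result(*top_language(probs.tolist(), labels))
```

Inference runs in the main loop rather than inside the audio callback, which keeps the callback short and reduces the risk of dropped input. `on_result` can be any callable taking `(label, confidence)`, for instance one that forwards the result over OSC.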

Museum context

Originally developed for interactive exhibitions at ZKM where visitor language needed to be detected to trigger language-appropriate content.