| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| README.md | 2026-06-04 | 1.7 kB | |
| v3.6.0_ Audio Anomaly Detection Modality source code.tar.gz | 2026-06-04 | 9.4 MB | |
| v3.6.0_ Audio Anomaly Detection Modality source code.zip | 2026-06-04 | 9.7 MB | |
| Totals: 3 Items | | 19.0 MB | 0 |
v3.6.0: Audio Anomaly Detection Modality
Audio joins tabular, time-series, graph, text, and image as a first-class PyOD modality on the agentic and multimodal line. The additions are encoder-agnostic and additive, with no change to existing detectors.
New
- `AudioFeatureEncoder`: each clip becomes a 74-dim handcrafted acoustic vector (20 MFCC, 12 chroma, 5 spectral descriptors, each as mean and standard deviation over frames, via librosa). Registered as the `audio-mfcc` encoder.
- `EmbeddingOD.for_audio(quality=...)`: `fast`=IForest, `balanced`=KNN, `best`=LUNAR over the audio encoder, so any classical detector runs on audio (embed then detect).
- `AudioAE`: DCASE-style log-mel reconstruction autoencoder that reuses the PyOD `AutoEncoder`, scored by per-clip mean reconstruction error. Requires `torch`.
- ADEngine: audio file-path profiling and routing (`for_audio` as the default, `AudioAE` as the deep alternative); knowledge-base entries for `AudioAE` and audio support on `EmbeddingOD` and `MultiModalOD`.
- `pip install pyod[audio]`: new optional extra (librosa, soundfile).
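The 74-dim figure and the "embed then detect" pattern can be illustrated with a minimal standard-library sketch. It is not PyOD's implementation: `pool_frames`, `knn_scores`, and `fake_clip` are hypothetical helpers, the frame-level values are random stand-ins for what librosa would extract, the population standard deviation is an assumption, and the k-th-nearest-neighbour distance is only a simplified stand-in for a KNN-style detector.

```python
import math
import random
from statistics import fmean, pstdev

N_DESCRIPTORS = 20 + 12 + 5  # MFCC + chroma + spectral descriptors = 37


def pool_frames(frames):
    """Collapse a clip's per-frame descriptors into one fixed-length vector.

    `frames` is a list of frames, each a list of N_DESCRIPTORS floats.
    Returns the per-descriptor means followed by the per-descriptor
    standard deviations, so the length is 2 * N_DESCRIPTORS = 74.
    """
    columns = list(zip(*frames))  # one tuple per descriptor
    means = [fmean(col) for col in columns]
    stds = [pstdev(col) for col in columns]  # population std (assumption)
    return means + stds


def knn_scores(vectors, k=3):
    """Score each vector by the distance to its k-th nearest neighbour."""
    scores = []
    for i, v in enumerate(vectors):
        dists = sorted(math.dist(v, w) for j, w in enumerate(vectors) if j != i)
        scores.append(dists[k - 1])
    return scores


def fake_clip(rng, n_frames, offset=0.0):
    """Hypothetical stand-in for frame-level features extracted from audio."""
    return [[rng.gauss(offset, 1.0) for _ in range(N_DESCRIPTORS)]
            for _ in range(n_frames)]


rng = random.Random(0)
clips = [fake_clip(rng, n_frames=50) for _ in range(10)]
clips.append(fake_clip(rng, n_frames=50, offset=5.0))  # one shifted clip

embeddings = [pool_frames(c) for c in clips]  # embed
scores = knn_scores(embeddings, k=3)          # then detect

print(len(embeddings[0]), scores.index(max(scores)))
```

Because every clip is reduced to a fixed-length vector regardless of its duration, any detector that accepts a feature matrix can score the result, which is the point of routing audio through an encoder first.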
Counts
Buildable detector count rises from 60 to 61: 61 total (43 tabular, 7 time-series, 8 graph, 2 text, 2 image, 1 multimodal, 3 audio).
Install
pip install --upgrade pyod # core
pip install "pyod[audio]" # audio encoder (librosa, soundfile)
pip install "pyod[torch,audio]" # AudioAE (deep)
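After installing, a quick way to see which audio route an environment can support is to probe for the optional packages. This is a small sketch using only the standard library; `available_audio_paths` is a hypothetical helper, not part of PyOD, and it only checks that the packages are importable, not their versions.

```python
from importlib.util import find_spec


def available_audio_paths():
    """Report which audio routes the installed packages could support."""
    has = {name: find_spec(name) is not None
           for name in ("pyod", "librosa", "soundfile", "torch")}
    return {
        "installed": has,
        # Handcrafted-feature path needs the [audio] extra.
        "audio_encoder": has["pyod"] and has["librosa"] and has["soundfile"],
        # Deep path needs torch on top of the audio extra.
        "audio_ae": all(has.values()),
    }


report = available_audio_paths()
print(report)
```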
The release builds on published methods: the DCASE 2020 Task 2 log-mel autoencoder baseline, and MFCC, chroma, and spectral features computed with librosa. No breaking API changes.
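The per-clip scoring rule described for `AudioAE` (mean reconstruction error) can be sketched as follows. This is an illustration under stated assumptions, not the library's code: squared error is assumed as the error measure, `clip_score` and `toy_reconstruct` are hypothetical names, and the "autoencoder" is a toy that can only emit one fixed template frame.

```python
from statistics import fmean


def clip_score(frames, reconstructed):
    """Mean over frames of the per-frame mean squared reconstruction error."""
    per_frame = [
        fmean((x - y) ** 2 for x, y in zip(frame, recon))
        for frame, recon in zip(frames, reconstructed)
    ]
    return fmean(per_frame)


def toy_reconstruct(frames, template):
    """Hypothetical stand-in for a trained autoencoder: it can only
    reproduce the `template` pattern it 'learned' from normal clips."""
    return [list(template) for _ in frames]


template = [0.0, 1.0, 0.0, -1.0]  # toy 4-bin "log-mel" frame
normal_clip = [[0.1, 0.9, 0.0, -1.0], [0.0, 1.0, 0.1, -0.9]]
odd_clip = [[2.0, -1.0, 3.0, 1.0], [1.5, -0.5, 2.5, 0.5]]

normal_score = clip_score(normal_clip, toy_reconstruct(normal_clip, template))
odd_score = clip_score(odd_clip, toy_reconstruct(odd_clip, template))

print(normal_score < odd_score)
```

A clip that resembles the training data reconstructs well and scores low; a clip the model cannot reproduce scores high, which is what makes the reconstruction error usable as an anomaly score.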
Full changelog: https://github.com/yzhao062/pyod/compare/v3.5.4...v3.6.0