| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| README.md | 2026-04-28 | 4.9 kB | |
| v0.4.3 source code.tar.gz | 2026-04-28 | 6.1 MB | |
| v0.4.3 source code.zip | 2026-04-28 | 6.5 MB | |
| Totals: 3 Items | | 12.6 MB | 0 |
What's Changed
- Add LongCat-AudioDiT 1B TTS model by @Blaizzy in https://github.com/Blaizzy/mlx-audio/pull/627
- feat: add WebM audio format support by @regcs in https://github.com/Blaizzy/mlx-audio/pull/635
- Add MkDocs docs site and docs guardrails by @shreyaskarnik in https://github.com/Blaizzy/mlx-audio/pull/626
- Update branch for GitHub Actions workflow by @Blaizzy in https://github.com/Blaizzy/mlx-audio/pull/639
- feat: add MeloTTS-English MLX port by @shreyaskarnik in https://github.com/Blaizzy/mlx-audio/pull/629
- feat: add OmniVoice zero-shot multilingual TTS (646+ languages) by @beshkenadze in https://github.com/Blaizzy/mlx-audio/pull/630
- Register client disconnects while streaming TTS audio by @orbitalquark in https://github.com/Blaizzy/mlx-audio/pull/634
- fix(kokoro): support quantized checkpoint layout and guard NaN durations by @beshkenadze in https://github.com/Blaizzy/mlx-audio/pull/624
- Remove docs check for user-facing changes by @Blaizzy in https://github.com/Blaizzy/mlx-audio/pull/658
- fix(stt): correct granite_speech Conv1d weight sanitization and add parakeet model_type by @ryancee in https://github.com/Blaizzy/mlx-audio/pull/657
- fix(cohere): restore quantized inference for 8-bit and 4-bit checkpoints by @beshkenadze in https://github.com/Blaizzy/mlx-audio/pull/650
- feat(irodori-tts): add v2 model support with VoiceDesign and chunked DACVAE decode by @yoshphys in https://github.com/Blaizzy/mlx-audio/pull/660
- feat: add Higgs Audio v2 — 3B Llama-backed TTS with voice cloning by @Kairos-a in https://github.com/Blaizzy/mlx-audio/pull/656
- Remove librosa dependency by @lucasnewman in https://github.com/Blaizzy/mlx-audio/pull/662
- Replace all soundfile calls with core equivalents by @lucasnewman in https://github.com/Blaizzy/mlx-audio/pull/663
- Move misaki to an optional install to reduce dependency graph by @lucasnewman in https://github.com/Blaizzy/mlx-audio/pull/664
- Improve performance of Parakeet TDT on longform content by @lucasnewman in https://github.com/Blaizzy/mlx-audio/pull/665
- Fix Voxtral Realtime streaming and speed up the 4-bit path by ~3x by @iris-sfg in https://github.com/Blaizzy/mlx-audio/pull/661
- feat(higgs_audio): add ReferenceContext for reusable encoded-reference state by @Kairos-a in https://github.com/Blaizzy/mlx-audio/pull/666
- Fix Voxtral TTS tokenizer dependency contract by @lyonsno in https://github.com/Blaizzy/mlx-audio/pull/633
- Remove pyloudnorm dependency by @lucasnewman in https://github.com/Blaizzy/mlx-audio/pull/667
- Support concurrent requests to the server by @lucasnewman in https://github.com/Blaizzy/mlx-audio/pull/668
- Add a standard model loading path for STS models by @lucasnewman in https://github.com/Blaizzy/mlx-audio/pull/670
- Remove pydub dependency by @lucasnewman in https://github.com/Blaizzy/mlx-audio/pull/671
- Clean up bare scipy usage by @lucasnewman in https://github.com/Blaizzy/mlx-audio/pull/672
- Remove explicit tiktoken dependency by @lucasnewman in https://github.com/Blaizzy/mlx-audio/pull/673
- docs: add Svara TTS (multilingual Indic) entry by @shreyaskarnik in https://github.com/Blaizzy/mlx-audio/pull/678
- Fix Voxtral STT crash on eos_token_ids initialization by @contrapuntal in https://github.com/Blaizzy/mlx-audio/pull/677
- feat: add Mel-Band-RoFormer architecture for vocal source separation by @xocialize in https://github.com/Blaizzy/mlx-audio/pull/654
- Improved dep handling for mlx-lm by @lucasnewman in https://github.com/Blaizzy/mlx-audio/pull/683
- Add MOSS-TTS-Nano by @lucasnewman in https://github.com/Blaizzy/mlx-audio/pull/676
- docs: add shields.io badges and table of contents to README by @Gingiris in https://github.com/Blaizzy/mlx-audio/pull/680
- Adjust Trendshift badge to README by @Blaizzy in https://github.com/Blaizzy/mlx-audio/pull/684
- Add batching support for Fish Speech S2 Pro by @lucasnewman in https://github.com/Blaizzy/mlx-audio/pull/675
- Add continuous batching support for Qwen3 TTS to the server by @lucasnewman in https://github.com/Blaizzy/mlx-audio/pull/674
New Contributors
- @regcs made their first contribution in https://github.com/Blaizzy/mlx-audio/pull/635
- @ryancee made their first contribution in https://github.com/Blaizzy/mlx-audio/pull/657
- @Kairos-a made their first contribution in https://github.com/Blaizzy/mlx-audio/pull/656
- @iris-sfg made their first contribution in https://github.com/Blaizzy/mlx-audio/pull/661
- @lyonsno made their first contribution in https://github.com/Blaizzy/mlx-audio/pull/633
- @contrapuntal made their first contribution in https://github.com/Blaizzy/mlx-audio/pull/677
- @xocialize made their first contribution in https://github.com/Blaizzy/mlx-audio/pull/654
- @Gingiris made their first contribution in https://github.com/Blaizzy/mlx-audio/pull/680
Full Changelog: https://github.com/Blaizzy/mlx-audio/compare/v0.4.2...v0.4.3