Fast State-of-the-Art Static Embeddings
Faster Whisper transcription with CTranslate2
Qwen3-TTS is an open-source series of TTS models
Automatic Speech Recognition with Word-level Timestamps
A game-theoretic approach to explain the output of ML models
A text-to-speech, speech-to-text and speech-to-speech library
The most powerful local music generation model
LightRAG: Simple and Fast Retrieval-Augmented Generation
Industrial-strength Natural Language Processing (NLP)
Ultralytics YOLO
Local long-term memory engine for AI apps with persistent storage
Achieving 3+ generation speedup on reasoning tasks
YOLOv5 is the world's most loved vision AI
HunyuanVideo: A Systematic Framework For Large Video Generation Model
Extensible, parallel implementations of t-SNE
A GUI tool for extracting hard-coded subtitles (hardsubs) from videos
TFDS is a collection of datasets ready to use with TensorFlow,
100–200× Acceleration for Video Diffusion Models
Fast and Universal 3D reconstruction model for versatile tasks
A Web UI for easy subtitling using the Whisper model
Deep learning optimization library: makes distributed training easy
An Open Source text-to-speech system built by inverting Whisper
Open-source browser using agents
A guidance language for controlling large language models
Let's make video diffusion practical