Fast State-of-the-Art Static Embeddings
Qwen3-TTS is an open-source series of TTS models
HunyuanVideo: A Systematic Framework For Large Video Generation Model
A text-to-speech, speech-to-text and speech-to-speech library
Local long-term memory engine for AI apps with persistent storage
YOLOv5 is the world's most loved vision AI
The most powerful local music generation model
"LightRAG: Simple and Fast Retrieval-Augmented Generation"
100–200× Acceleration for Video Diffusion Models
Simulation framework for accelerating research
A GUI tool for extracting hard-coded subtitles (hardsub) from videos
A game-theoretic approach to explain the output of ML models
A high-quality rapid TTS voice cloning model
LightLLM is a Python-based LLM (Large Language Model) inference
Fast and Universal 3D reconstruction model for versatile tasks
Let's make video diffusion practical
Sparsity-aware deep learning inference runtime for CPUs
Deep learning optimization library: makes distributed training easy
A guidance language for controlling large language models
Library for OCR-related tasks powered by Deep Learning
Easily compute CLIP embeddings and build a CLIP retrieval system
This repo contains the code for a 1D tokenizer and generator
An Open Source text-to-speech system built by inverting Whisper
Uncover insights, surface problems, monitor, and fine-tune your LLM
Industrial-level controllable zero-shot text-to-speech system