95% token savings. 155x faster queries. 16 languages
End-to-end speech processing toolkit
Audiocraft is a library for audio processing and generation
Open source NLP guide with models, methods, and real use cases
Stanford NLP Python library for many human languages
General-purpose image editing model that delivers high-fidelity
Structured data extraction and instruction calling with ML, LLM
Using AI models to automatically provide commentary and edit videos
Framework for building real-time multimodal voice AI agent apps
Running large language models on a single GPU
Framework for building real-time voice and multimodal AI agents
Open Source Speech Language Model
Advanced AI Explainability for computer vision
Controllable and fast Text-to-Speech for over 7000 languages
LLM-based agent for general-purpose software engineering tasks
HivisionIDPhotos: a lightweight and efficient AI ID photo tool
Edit videos with Claude Code
HunyuanVideo: A Systematic Framework For Large Video Generation Model
OCR expert VLM powered by Hunyuan's native multimodal architecture
Audio foundation model excelling in audio understanding
Large language model for a live-stream sales host
Open source AI VTuber platform with voice chat and Live2D avatars
A personalized LLM-powered agent framework
Official implementation of DreamCraft3D
Generative AI reference workflows