Large-language-model & vision-language-model based on Linear Attention
The most accurate natural language detection library for Python
Fast stable diffusion on CPU and AI PC
A full spaCy pipeline and models for scientific/biomedical documents
Open speech-to-speech models and pipelines by Hugging Face
Parse files for optimal RAG
Open source healthcare AI
Long-form streaming TTS system for multi-speaker dialogue generation
Marrying Grounding DINO with Segment Anything & Stable Diffusion
Audiocraft is a library for audio processing and generation
A Model Context Protocol (MCP) server
TextWorld is a sandbox learning environment for the training
LLM
Interface for OuteTTS models
MARS5 speech model (TTS) from CAMB.AI
Automatically translates the text of a video based on a subtitle file
Bidirectional token-classification model for identifiable info
Search all of YouTube from the command line
HY-Motion model for 3D character animation generation
Scalable data preprocessing and curation toolkit for LLMs
Multi-lingual large voice generation model, providing inference
High-Resolution 3D Assets Generation with Large Scale Diffusion Models
Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD
Lightweight package to simplify LLM API calls
Open-Sora: Democratizing Efficient Video Production for All