Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model
Controllable and fast Text-to-Speech for over 7000 languages
Towards Human-Level Text-to-Speech through Style Diffusion
AI discovers 520000 stable inorganic crystal structures for research
DeepMind model for tracking arbitrary points across videos & robotics
Sharp Monocular Metric Depth in Less Than a Second
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
Inference framework for 1-bit LLMs
Library for training machine learning models with privacy for training data
Implementation of Video Diffusion Models
Tool for visualizing and tracking your machine learning experiments
Train machine learning models within Docker containers
Build AI-powered semantic search applications
AnyTool: Universal Tool-Use Layer for AI Agents
Motion-controllable Video Generation via Latent Trajectory Guidance
The knowledge and task management backbone for AI coding assistants
"Big Model" trains a 26M-parameter visual multimodal model (VLM)
Ling is a MoE LLM provided and open-sourced by InclusionAI
Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible
LLM-based autonomous agent that does comprehensive online research
Superfast AI decision making and processing of multi-modal data
Uncover insights, surface problems, monitor, and fine-tune your LLM
Bailing is a voice dialogue robot similar to GPT-4o
A minimal yet professional single-agent demo project
Real-time voice interactive digital human