A fast TTS architecture with conditional flow matching
Code for running inference with the SAM 3D Body Model (3DB)
Sharp Monocular Metric Depth in Less Than a Second
A lightweight, powerful framework for multi-agent workflows
Make websites accessible for AI agents
State-of-the-art diffusion models for image and audio generation
Making Enterprise Data Intelligent and Responsive for AI
AI Toolkit for Healthcare Imaging
A robust, efficient, low-latency speech-to-text library
A full spaCy pipeline and models for scientific/biomedical documents
Single-cell analysis in Python
Generate high-definition story short videos with one click using AI
ComfyUI integration for Microsoft's VibeVoice text-to-speech model
Converts text to speech in real time
Clone a voice in 5 seconds to generate arbitrary speech in real-time
Agent framework and applications built upon Qwen>=3.0
Full stack AI software engineer
Optimizing inference proxy for LLMs
TextWorld is a sandbox learning environment for the training
Solve end-to-end problems using the Llama model family
Easiest and laziest way to build multi-agent LLM applications
Automatically translates the text of a video based on a subtitle file
Benchmarking Multimodal Agents for Open-Ended Tasks
Parse files for optimal RAG
An Async Bot/API wrapper for Twitch made in Python