The most reliable AI agent framework that supports MCP
Qwen3-TTS is an open-source series of TTS models
Lightweight package to simplify LLM API calls
Stable Virtual Camera: Generative View Synthesis with Diffusion Models
Code for Mesh R-CNN, ICCV 2019
Inference script for Oasis 500M
State-of-the-art (SoTA) text-to-video pre-trained model
AIMET is a library that provides advanced quantization and compression
Tool for exploring and debugging transformer model behaviors
Designed for text embedding and ranking tasks
RGBD video generation model conditioned on camera input
950-line, minimal, extensible LLM inference engine built from scratch
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
GUI Exploration Lab. One of the best GUI agent solutions
Multimodal Diffusion with Representation Alignment
Minimal Claude Code alternative. Single Python file, zero dependencies
Helping you get the most out of AWS, wherever you use MCP
A high-quality, rapid TTS voice-cloning model
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
OCR expert VLM powered by Hunyuan's native multimodal architecture
gpt-oss-120b and gpt-oss-20b are two open-weight language models
Foundational model for human-like, expressive TTS
Reverse-engineered Python API for Google Gemini web app
Ring is a reasoning MoE LLM provided and open-sourced by InclusionAI
A Universal Customization Method for Single and Multi Conditioning