Open-source, high-performance AI model with advanced reasoning
Powerful Mixture-of-Experts (MoE) AI language model optimized for efficiency and performance
State-of-the-art TTS model under 25MB
A GPT-4o-Level MLLM for Vision, Speech, and Multimodal Live Streaming
Awesome multilingual OCR toolkits based on PaddlePaddle
Open-source multi-speaker long-form text-to-speech model
Visual Causal Flow
AlphaFold 3 inference pipeline
Industrial-level controllable zero-shot text-to-speech system
From Images to High-Fidelity 3D Assets
Python SDK for Claude Agent
RGBD video generation model conditioned on camera input
Open Source Speech Language Model
Contexts Optical Compression
Official code base for LeWorldModel: Stable End-to-End Joint-Embedding
Video understanding codebase from FAIR for reproducing video models
Qwen3-ASR is an open-source series of ASR models
Long-form streaming TTS system for multi-speaker dialogue generation
General-purpose image editing model that delivers high-fidelity
FAIR Sequence Modeling Toolkit 2
Pushing the Limits of Mathematical Reasoning in Open Language Models
Audio foundation model excelling in audio understanding
Controllable & emotion-expressive zero-shot TTS
Reproduction of Poetiq's record-breaking submission to the ARC-AGI-1
Official implementation of Watermark Anything with Localized Messages