Open-source, high-performance AI model with advanced reasoning
Powerful Mixture-of-Experts (MoE) language model optimized for efficiency and performance
State-of-the-art TTS model under 25MB
Awesome multilingual OCR toolkits based on PaddlePaddle
A GPT-4o-Level MLLM for Vision, Speech, and Multimodal Live Streaming
Video Object and Interaction Deletion
Visual Causal Flow
RGBD video generation model conditioned on camera input
Pokee Deep Research Model Open Source Repo
Open Source Speech Language Model
Open-source multi-speaker long-form text-to-speech model
AlphaFold 3 inference pipeline
Industrial-level controllable zero-shot text-to-speech system
FAIR Sequence Modeling Toolkit 2
Contexts Optical Compression
From Images to High-Fidelity 3D Assets
Python SDK for Claude Agent
Qwen3-ASR is an open-source series of ASR models
A trainable PyTorch reproduction of AlphaFold 3
Video understanding codebase from FAIR for reproducing video models
ChatGPT interface with better UI
Long-form streaming TTS system for multi-speaker dialogue generation
Controllable & emotion-expressive zero-shot TTS
Audio foundation model excelling in audio understanding
Large Multimodal Models for Video Understanding and Editing