Open Source Speech Language Model
Long-form streaming TTS system for multi-speaker dialogue generation
Qwen3-ASR is an open-source series of ASR models
OpenTinker is an RL-as-a-Service infrastructure for foundation models
Block Diffusion for Ultra-Fast Speculative Decoding
Multimodal embedding and reranking models built on Qwen3-VL
Z80-μLM is a 2-bit quantized language model
Implementation of "MobileCLIP" (CVPR 2024)
Official implementation of Watermark Anything with Localized Messages
Video understanding codebase from FAIR for reproducing video models
Tool for exploring and debugging transformer model behaviors
Multimodal-Driven Architecture for Customized Video Generation
Personalize Any Characters with a Scalable Diffusion Transformer
Project Lyra: Open Generative 3D World Models
Achieving 3x+ generation speedup on reasoning tasks
Ultra-Efficient LLMs on End Devices
General-purpose image editing model that delivers high-fidelity
Ling-V2 is a MoE LLM provided and open-sourced by InclusionAI
Easy Docker setup for Stable Diffusion with user-friendly UI
Inference script for Oasis 500M
Fast and Universal 3D reconstruction model for versatile tasks
4M: Massively Multimodal Masked Modeling
Foundation Models for Time Series
FAIR Sequence Modeling Toolkit 2
ICLR 2024 Spotlight: curation/training code, metadata, distribution