A 0.1B Omni model trained from scratch
OpenTinker is an RL-as-a-Service infrastructure for foundation models
Block Diffusion for Ultra-Fast Speculative Decoding
Multimodal embedding and reranking models built on Qwen3-VL
Stable Virtual Camera: Generative View Synthesis with Diffusion Models
Pokee Deep Research Model Open Source Repo
Open-weight, large-scale hybrid-attention reasoning model
Foundation Models for Time Series
Visual Causal Flow
Long-form streaming TTS system for multi-speaker dialogue generation
Generate Any 3D Scene in Seconds
This repository contains the official implementation of FastVLM
Memory-efficient and performant finetuning of Mistral's models
Ring is a reasoning MoE LLM provided and open-sourced by InclusionAI
Research code artifacts for Code World Model (CWM)
Tongyi Deep Research, the Leading Open-source Deep Research Agent
Fast-stable-diffusion + DreamBooth
Hunyuan Translation Model Version 1.5
New family of code large language models (LLMs)
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
High-resolution models for human tasks
Video understanding codebase from FAIR for reproducing video models
CLIP: predict the most relevant text snippet given an image
Repo of Qwen2-Audio chat & pretrained large audio language model
Achieving 3x+ generation speedup on reasoning tasks