Inference code for scalable emulation of protein equilibrium ensembles
Programmatic access to the AlphaGenome model
A SOTA open-source image editing model
A state-of-the-art open visual language model
Chinese and English multimodal conversational language model
Tongyi Deep Research, the Leading Open-source Deep Research Agent
MOSS‑TTS Family: open‑source speech and sound generation models
Video Object and Interaction Deletion
Open Source Speech Language Model
Long-form streaming TTS system for multi-speaker dialogue generation
Open-source industrial-grade ASR models
A Pragmatic VLA Foundation Model
Hunyuan Translation Model Version 1.5
Block Diffusion for Ultra-Fast Speculative Decoding
Multimodal embedding and reranking models built on Qwen3-VL
Implementation of "MobileCLIP" (CVPR 2024)
VMZ: Model Zoo for Video Modeling
High-resolution models for human tasks
Video understanding codebase from FAIR for reproducing video models
CLIP: predict the most relevant text snippet given an image
Ling is a MoE LLM provided and open-sourced by InclusionAI
A Unified Framework for Text-to-3D and Image-to-3D Generation
Personalize Any Characters with a Scalable Diffusion Transformer
Stable Virtual Camera: Generative View Synthesis with Diffusion Models
High-Fidelity and Controllable Generation of Textured 3D Assets