GLM-4 series: Open Multilingual Multimodal Chat LMs
Python inference and LoRA trainer package for the LTX-2 audio–video
ChatGLM-6B: An Open Bilingual Dialogue Language Model
Recovering the Visual Space from Any Views
Official code base for LeWorldModel: Stable End-to-End Joint-Embedding
Global weather forecasting model using graph neural networks and JAX
Inference code for scalable emulation of protein equilibrium ensembles
Programmatic access to the AlphaGenome model
Large Multimodal Models for Video Understanding and Editing
Ling is a MoE LLM provided and open-sourced by InclusionAI
Accurate × Fast × Comprehensive
Visual Causal Flow
CLIP: Predict the most relevant text snippet given an image
Repo for SeedVR2 & SeedVR
MOSS‑TTS Family: open‑source speech and sound generation models
Open-source image generative foundation model
4M: Massively Multimodal Masked Modeling
An experimental version of the DeepSeek model
Designed for text embedding and ranking tasks
High-Fidelity and Controllable Generation of Textured 3D Assets
Implementation of the Surya Foundation Model for Heliophysics
A Powerful Native Multimodal Model for Image Generation
Repo of Qwen2-Audio chat & pretrained large audio language model
FAIR Sequence Modeling Toolkit 2
Code for Mesh R-CNN, ICCV 2019