CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
CodeGeeX2: A More Powerful Multilingual Code Generation Model
Tongyi Deep Research, the Leading Open-source Deep Research Agent
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
Video understanding codebase from FAIR for reproducing state-of-the-art video models
Controllable & emotion-expressive zero-shot TTS
Foundation Models for Time Series
GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning
Capable of understanding text, audio, vision, and video
Real-time behaviour synthesis with MuJoCo, using Predictive Control
PyTorch code and models for the DINOv2 self-supervised learning method
Memory-efficient and performant finetuning of Mistral's models
Chat & pretrained large vision language model
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
Tiny vision language model
The official PyTorch implementation of Google's Gemma models
GLM-4 series: Open Multilingual Multimodal Chat LMs
Qwen3-omni is a natively end-to-end, omni-modal LLM
A state-of-the-art open visual language model
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
4M: Massively Multimodal Masked Modeling
Official DeiT repository
GLM-4-Voice | End-to-End Chinese-English Conversational Model
NVIDIA Isaac GR00T N1.5 is the world's first open foundation model for generalized humanoid robot reasoning and skills
Open-source repository for the Pokee Deep Research model