From Images to High-Fidelity 3D Assets
Python inference and LoRA trainer package for the LTX-2 audio–video
Awesome multilingual OCR toolkits based on PaddlePaddle
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
CodeGeeX2: A More Powerful Multilingual Code Generation Model
Netease Youdao's open-source embedding and reranker models
Fast Stable Diffusion on CPU and AI PC
A Family of Open-Source Music Foundation Models
Long-form streaming TTS system for multi-speaker dialogue generation
A Multi-Modal World Model for Reconstructing, Generating, Simulation
GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning
The Clay Foundation Model - An open source AI model and interface
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
A Pragmatic VLA Foundation Model
Multimodal embedding and reranking models built on Qwen3-VL
High-resolution models for human tasks
Open-Source Financial Large Language Models
NVIDIA Isaac GR00T N1.5 is the world's first open foundation model
A Systematic Framework for Interactive World Modeling
Stable Diffusion WebUI Forge is a platform on top of Stable Diffusion
Open Source Speech Language Model
Ling is an MoE LLM provided and open-sourced by InclusionAI
Reproduction of Poetiq's record-breaking submission to the ARC-AGI-1