RGBD video generation model conditioned on camera input
New family of code large language models (LLMs)
Reproduction of Poetiq's record-breaking submission to the ARC-AGI-1
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Qwen3-Omni is a natively end-to-end, omni-modal LLM
Capable of understanding text, audio, vision, video
Official Python inference and LoRA trainer package
Wan2.2: Open and Advanced Large-Scale Video Generative Model
Open-source, high-performance AI model with advanced reasoning
Awesome multilingual OCR toolkits based on PaddlePaddle
A state-of-the-art open visual language model
Let's make video diffusion practical
The most powerful local music generation model
Powerful AI language model (MoE) optimized for efficiency/performance
Official inference repo for FLUX.1 models
Wan2.1: Open and Advanced Large-Scale Video Generative Model
Advanced language and coding AI model
Agentic, Reasoning, and Coding (ARC) foundation models
High-Resolution 3D Assets Generation with Large Scale Diffusion Models
A theoretical reconstruction of the Claude Mythos architecture
Code for running inference and finetuning with SAM 3 model
GLM-4.5: Open-source LLM for intelligent agents by Z.ai
A Family of Open Sourced Music Foundation Models
From Images to High-Fidelity 3D Assets