NVIDIA Isaac GR00T N1.5 is the world's first open foundation model
Phi-3.5 for Mac: Locally-run Vision and Language Models
CodeGeeX2: A More Powerful Multilingual Code Generation Model
Qwen3-Coder is the code version of Qwen3
Qwen-Image is a powerful image generation foundation model
An experimental version of the DeepSeek model
The official repo of Qwen chat & pretrained large language model
Memory-efficient and performant finetuning of Mistral's models
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
gpt-oss-120b and gpt-oss-20b are two open-weight language models
Collection of Gemma 3 variants that are trained for performance
Industrial-level controllable zero-shot text-to-speech system
Tool for exploring and debugging transformer model behaviors
GLM-4-Voice | End-to-End Chinese-English Conversational Model
Recovering the Visual Space from Any Views
Ultra-Efficient LLMs on End Devices
Open-source framework for intelligent speech interaction
Official implementation of DreamCraft3D
High-resolution models for human tasks
Provides convenient access to the Anthropic REST API from any Python 3
Capable of understanding text, audio, vision, video
Generating Immersive, Explorable, and Interactive 3D Worlds
Chinese and English multimodal conversational language model
Achieving 3x+ generation speedup on reasoning tasks
Diffusion Transformer with Fine-Grained Chinese Understanding