Official inference repo for FLUX.1 models
GLM-5: From Vibe Coding to Agentic Engineering
Official Python inference and LoRA trainer package
Fast-stable-diffusion + DreamBooth
The most powerful local music generation model
Contexts Optical Compression
MiniMax M2.1, a SOTA model for real-world dev & agents.
One-click local MCP server installation in desktop apps
Convert Google Gemini web into OpenAI-compatible API
Python SDK for Claude Agent
Open-source, high-performance AI model with advanced reasoning
Miso TTS is an 8-billion-parameter, highly emotive text-to-speech model
Official inference repo for FLUX.2 models
Qwen3 is the large language model series developed by Qwen team
HY-Motion model for 3D character animation generation
Diversity-driven optimization and large-model reasoning ability
Qwen3.6 is the large language model series developed by Qwen team
Fast, Sharp & Reliable Agentic Intelligence
MiMo-V2-Flash: Efficient Reasoning, Coding, and Agentic Foundation
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)
Model export recipes, Python primitives, and Swift runtime utilities
Inference script for Oasis 500M
Qwen2.5-VL is the multimodal large language model series
26M-parameter function-calling model that runs on incredibly small devices
C#/.NET binding of llama.cpp, including LLaMA/GPT model inference