Official inference repo for FLUX.2 models
Native and Compact Structured Latents for 3D Generation
Qwen3-Omni is a natively end-to-end, omni-modal LLM
gpt-oss-120b and gpt-oss-20b are two open-weight language models
Multimodal Diffusion with Representation Alignment
A theoretical reconstruction of the Claude Mythos architecture
An easy 1-click way to create beautiful artwork on your PC using AI
Renderer for the harmony response format to be used with gpt-oss
Long-form streaming TTS system for multi-speaker dialogue generation
Qwen3-TTS is an open-source series of TTS models
Qwen3-Coder is the code version of Qwen3
GLM-4-Voice | End-to-End Chinese-English Conversational Model
OCR expert VLM powered by Hunyuan's native multimodal architecture
Controllable & emotion-expressive zero-shot TTS
An experimental version of the DeepSeek model
AlphaFold 3 inference pipeline
Industrial-level controllable zero-shot text-to-speech system
Contexts Optical Compression
Advancing Open-source World Models
Multi-modal large language model designed for audio understanding
Open-source framework for intelligent speech interaction
Easy Docker setup for Stable Diffusion with user-friendly UI
Qwen3-ASR is an open-source series of ASR models
A Multi-Modal World Model for Reconstruction, Generation, and Simulation
Provides convenient access to the Anthropic REST API from any Python 3