Python inference and LoRA trainer package for the LTX-2 audio–video model
Official Python inference and LoRA trainer package
A Powerful Native Multimodal Model for Image Generation
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Qwen3.5 is the large language model series developed by the Qwen team
A 0.1B Omni model trained from scratch
Official inference repo for FLUX.2 models
New family of code large language models (LLMs)
Open-weight, large-scale hybrid-attention reasoning model
Convert Google Gemini web into an OpenAI-compatible API
FlashMLA: Efficient Multi-head Latent Attention Kernels
Open Multilingual Multimodal Chat LMs
Multimodal agent model for coding, orchestration, and autonomy
OpenAI’s compact 20B open model for fast, agentic, and local use
OpenAI’s open-weight 120B model optimized for reasoning and tooling
Dense multimodal Qwen model for coding, agents, and long context
Open multimodal model for coding, agents, and long-context tasks
Stable fine-tuned Gemma model for structured, clear responses
NVFP4 DiffusionGemma model for fast multimodal text generation
Open agentic coding model optimized for local deployment
Unified multimodal Gemma model for local coding and reasoning
Google’s flagship dense multimodal model for coding and reasoning
Omnimodal AI model for agents, coding, and long-context tasks
FP8 Qwen model for efficient multimodal coding and agent tasks
Compact 8B multimodal instruct model optimized for edge deployment