Qwen3-ASR is an open-source series of ASR models
Block Diffusion for Ultra-Fast Speculative Decoding
tiktoken is a fast BPE tokeniser for use with OpenAI's models
State-of-the-art (SoTA) text-to-video pre-trained model
An AI-powered security review GitHub Action using Claude
Collection of Gemma 3 variants that are trained for performance
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Foundation Models for Time Series
Implementation of the Surya Foundation Model for Heliophysics
Sharp Monocular Metric Depth in Less Than a Second
Stable Virtual Camera: Generative View Synthesis with Diffusion Models
Generate Any 3D Scene in Seconds
Contexts Optical Compression
Ling is a MoE LLM provided and open-sourced by InclusionAI
Qwen3-Omni is a natively end-to-end, omni-modal LLM
Controllable & emotion-expressive zero-shot TTS
Global weather forecasting model using graph neural networks and JAX
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
GPT-4V-level open-source multimodal model based on Llama3-8B
Foundation model for image generation
A Pragmatic VLA Foundation Model
Diversity-driven optimization and large-model reasoning ability
Chat & pretrained large vision language model
General-purpose image editing model that delivers high-fidelity