CLIP: predict the most relevant text snippet given an image
Achieving 3x+ generation speedup on reasoning tasks
Pretrained time-series foundation model developed by Google Research
Hackable and optimized Transformers building blocks
Stable Diffusion with Core ML on Apple Silicon
Open Source Speech Language Model
Open-source industrial-grade ASR models
Stable Diffusion WebUI Forge is a platform on top of Stable Diffusion
Global weather forecasting model using graph neural networks and JAX
Code for Mesh R-CNN, ICCV 2019
Language modeling in a sentence representation space
A SOTA open-source image editing model
Ling-V2 is a MoE LLM provided and open-sourced by InclusionAI
Audio foundation model excelling in audio understanding
Phi-3.5 for Mac: Locally-run Vision and Language Models
Open-source framework for intelligent speech interaction
Large language model and vision-language model based on linear attention
Pokee Deep Research Model Open Source Repo
Implementation of the Surya Foundation Model for Heliophysics
Long-form streaming TTS system for multi-speaker dialogue generation
Qwen3-ASR is an open-source series of ASR models
OpenTinker is an RL-as-a-Service infrastructure for foundation models
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
High-Fidelity and Controllable Generation of Textured 3D Assets