Bidirectional token-classification model for identifiable info
Video Object and Interaction Deletion
Tool for exploring and debugging transformer model behaviors
Hackable and optimized Transformers building blocks
Research code artifacts for Code World Model (CWM)
Open-source large language model family from Tencent Hunyuan
Math-specific large language models from the Qwen2 series
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Qwen3-Omni is a natively end-to-end, omni-modal LLM
An Efficient Agentic Model for Computer Use
Phi-3.5 for Mac: Locally-run Vision and Language Models
Programmatic access to the AlphaGenome model
Long-form streaming TTS system for multi-speaker dialogue generation
Collection of Gemma 3 variants that are trained for performance
Ling is a MoE LLM provided and open-sourced by InclusionAI
Multimodal Diffusion with Representation Alignment
Easy Docker setup for Stable Diffusion with user-friendly UI
HY-Motion model for 3D character animation generation
This repository contains the official implementation of FastVLM
Ring is a reasoning MoE LLM provided and open-sourced by InclusionAI
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
Diversity-driven optimization and large-model reasoning ability
CodeGeeX2: A More Powerful Multilingual Code Generation Model
Large language model and vision-language model based on linear attention
The official PyTorch implementation of Google's Gemma models