Accurate × Fast × Comprehensive
Open-Source Financial Large Language Models
Programmatic access to the AlphaGenome model
Code for running inference with the SAM 3D Body Model (3DB)
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
Video Object and Interaction Deletion
MOSS‑TTS Family: open‑source speech and sound generation models
Achieving 3x+ generation speedup on reasoning tasks
Easy Docker setup for Stable Diffusion with user-friendly UI
Revolutionizing Database Interactions with Private LLM Technology
A Powerful Native Multimodal Model for Image Generation
Generating Immersive, Explorable, and Interactive 3D Worlds
Phi-3.5 for Mac: Locally-run Vision and Language Models
Official code base for LeWorldModel: Stable End-to-End Joint-Embedding
Fast-stable-diffusion + DreamBooth
CLIP: Predict the most relevant text snippet given an image
HY-Motion model for 3D character animation generation
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
A Multi-Modal World Model for Reconstruction, Generation, and Simulation
Industrial-level controllable zero-shot text-to-speech system
GLM-4-Voice | End-to-End Chinese-English Conversational Model
Renderer for the harmony response format to be used with gpt-oss
Math-specific large language models from the Qwen2 series
Pretrained time-series foundation model developed by Google Research
An Efficient Agentic Model for Computer Use