Visual Causal Flow
Reference PyTorch implementation and models for DINOv3
High-Resolution 3D Assets Generation with Large Scale Diffusion Models
Qwen-Image is a powerful image generation foundation model
Open-source multi-speaker long-form text-to-speech model
Python SDK for Claude Agent
Qwen2.5-VL is a multimodal large language model series
HY-Motion model for 3D character animation generation
Official repository for LTX-Video
State-of-the-art (SoTA) text-to-video pre-trained model
State-of-the-art TTS model under 25MB
From Images to High-Fidelity 3D Assets
Video understanding codebase from FAIR for reproducing video models
Industrial-level controllable zero-shot text-to-speech system
Generating Immersive, Explorable, and Interactive 3D Worlds
Inference framework for 1-bit LLMs
gpt-oss-120b and gpt-oss-20b are two open-weight language models
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Release for Improved Denoising Diffusion Probabilistic Models
Tiny vision language model
GPT-4V-level open-source multimodal model based on Llama3-8B
ChatGLM-6B: An Open Bilingual Dialogue Language Model
ChatGPT interface with better UI
A series of math-specific large language models in the Qwen2 series
Programmatic access to the AlphaGenome model