A PyTorch library for implementing flow matching algorithms
tiktoken is a fast BPE tokeniser for use with OpenAI's models
A Customizable Image-to-Video Model based on HunyuanVideo
Qwen3-Omni is a natively end-to-end, omni-modal LLM
Math-specific large language models in our Qwen2 series
An Efficient Agentic Model for Computer Use
A state-of-the-art open visual language model
Long-form streaming TTS system for multi-speaker dialogue generation
Fast-stable-diffusion + DreamBooth
High-resolution models for human tasks
Video understanding codebase from FAIR for reproducing video models
CLIP: predict the most relevant text snippet given an image
Achieving 3x+ generation speedup on reasoning tasks
Pretrained time-series foundation model developed by Google Research
Generate Any 3D Scene in Seconds
Diffusion Transformer with Fine-Grained Chinese Understanding
NVIDIA Isaac GR00T N1.5 is the world's first open foundation model
Controllable & emotion-expressive zero-shot TTS
Stable Diffusion WebUI Forge is a platform on top of Stable Diffusion
Tooling for the Common Objects In 3D dataset
Code for Mesh R-CNN, ICCV 2019
Language modeling in a sentence representation space
An AI-powered security review GitHub Action using Claude
Provides convenient access to the Anthropic REST API from any Python 3
Generating Immersive, Explorable, and Interactive 3D Worlds