Let's make video diffusion practical
Tool for exploring and debugging transformer model behaviors
A state-of-the-art open visual language model
Open-weight, large-scale hybrid-attention reasoning model
Qwen3-Omni is a natively end-to-end, omni-modal LLM
PyTorch code and models for the DINOv2 self-supervised learning method
Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
Official implementation of DreamCraft3D
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
OCR expert VLM powered by Hunyuan's native multimodal architecture
ChatGPT interface with a better UI
Stable Diffusion WebUI Forge is a platform built on top of Stable Diffusion WebUI
Global weather forecasting model using graph neural networks and JAX
An AI-powered security review GitHub Action using Claude
Dataset of GPT-2 outputs for research in detection, biases, and more
Chinese LLaMA-2 & Alpaca-2 Large Model Phase II Project
Inference code for scalable emulation of protein equilibrium ensembles
Programmatic access to the AlphaGenome model
Chat & pretrained large vision-language model
Tongyi Deep Research, the Leading Open-source Deep Research Agent
Qwen2.5-VL is the multimodal large language model series
Implementation of "MobileCLIP" (CVPR 2024)
High-resolution models for human tasks
Video understanding codebase from FAIR for reproducing video models
Ling is a MoE LLM provided and open-sourced by InclusionAI