PyTorch code and models for the DINOv2 self-supervised learning
Memory-efficient and performant finetuning of Mistral's models
Towards Ultimate Expert Specialization in Mixture-of-Experts Language
Official implementation of DreamCraft3D
Ring is a reasoning MoE LLM provided and open-sourced by InclusionAI
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
Open-source large language model family from Tencent Hunyuan
A GPT-4o-Level MLLM for Vision, Speech and Multimodal Live Streaming
Stable Diffusion WebUI Forge is a platform on top of Stable Diffusion
Unified Multimodal Understanding and Generation Models
Sharp Monocular Metric Depth in Less Than a Second
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
Language modeling in a sentence representation space
An AI-powered security review GitHub Action using Claude
The ChatGPT Retrieval Plugin lets you easily find personal documents
Math-specific large language models of our Qwen2 series
Implementation of the Surya Foundation Model for Heliophysics
Chinese LLaMA-2 & Alpaca-2 Large Model Phase II Project
A state-of-the-art open visual language model
Chat & pretrained large audio language model proposed by Alibaba Cloud
Qwen3-omni is a natively end-to-end, omni-modal LLM
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training
FlashMLA: Efficient Multi-head Latent Attention Kernels
High-Resolution Image Synthesis with Latent Diffusion Models
Code for the paper Hybrid Spectrogram and Waveform Source Separation