Implementation of "MobileCLIP" CVPR 2024
Qwen3 is the large language model series developed by the Qwen team
Diffusion model (SD, Flux, Wan, Qwen Image, Z-Image, ...) inference
The official repo of Qwen chat & pretrained large language model
Ultra-Efficient LLMs on End Devices
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
Memory-efficient and performant finetuning of Mistral's models
Fast-stable-diffusion + DreamBooth
Block Diffusion for Ultra-Fast Speculative Decoding
Audio foundation model excelling in audio understanding
Phi-3.5 for Mac: Locally-run Vision and Language Models
Qwen-Image-Layered: Layered Decomposition for Inherent Editability
Large Multimodal Models for Video Understanding and Editing
Multimodal Diffusion with Representation Alignment
Repo of Qwen2-Audio chat & pretrained large audio language model
The official PyTorch implementation of Google's Gemma models
Renderer for the harmony response format to be used with gpt-oss
Open-weight, large-scale hybrid-attention reasoning model
FAIR Sequence Modeling Toolkit 2
ICLR 2024 Spotlight: curation/training code, metadata, distribution
Official implementation of DreamCraft3D
Towards Real-World Vision-Language Understanding
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Language modeling in a sentence representation space
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training