Qwen2.5-VL is a multimodal large language model series
High-Resolution 3D Assets Generation with Large Scale Diffusion Models
Implementation of "MobileCLIP" CVPR 2024
Chinese and English multimodal conversational language model
The official repo of Qwen chat & pretrained large language model
Repo of Qwen2-Audio chat & pretrained large audio language model
Official code for Style Aligned Image Generation via Shared Attention
Memory-efficient and performant finetuning of Mistral's models
Pushing the Limits of Mathematical Reasoning in Open Language Models
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
Unified Multimodal Understanding and Generation Models
Phi-3.5 for Mac: Locally-run Vision and Language Models
Renderer for the harmony response format to be used with gpt-oss
The official PyTorch implementation of Google's Gemma models
Multimodal Diffusion with Representation Alignment
FAIR Sequence Modeling Toolkit 2
ICLR 2024 Spotlight: curation/training code, metadata, distribution
Official implementation of DreamCraft3D
Language modeling in a sentence representation space
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training
A Conversational Speech Generation Model
High-Resolution Image Synthesis with Latent Diffusion Models
Let us control diffusion models
Chinese LLaMA & Alpaca large language model + local CPU/GPU training
Repo for external large-scale work