Inference script for Oasis 500M
Memory-efficient and performant finetuning of Mistral's models
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
Uncommon Objects in 3D dataset
GPT-4V-level open-source multi-modal model based on Llama3-8B
GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning
The ChatGPT Retrieval Plugin lets you easily find personal documents
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Multi-modal large language model designed for audio understanding
Qwen3-Omni is a natively end-to-end, omni-modal LLM
Release for Improved Denoising Diffusion Probabilistic Models
High-Resolution Image Synthesis with Latent Diffusion Models
Open-source, high-performance Mixture-of-Experts large language model
Di♪♪Rhythm: Blazingly Fast & Simple End-to-End Song Generation
Qwen2.5-Coder is the code version of Qwen2.5, the large language model
Chinese LLaMA-2 & Alpaca-2 Large Model Phase II Project
Open Multilingual Multimodal Chat LMs
Towards Ultimate Expert Specialization in Mixture-of-Experts Language
Code for the paper Hybrid Spectrogram and Waveform Source Separation
Official repo for consistency models
Chinese LLaMA & Alpaca large language model + local CPU/GPU training
Official PyTorch Implementation of "Scalable Diffusion Models"
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech)
A method to increase the speed and lower the memory footprint
LLaMA: Open and Efficient Foundation Language Models