Multimodal Diffusion with Representation Alignment
GLM-4 series: Open Multilingual Multimodal Chat LMs
Tongyi Deep Research, the Leading Open-source Deep Research Agent
Long-form streaming TTS system for multi-speaker dialogue generation
Math-specific large language models from the Qwen2 series
Generating Immersive, Explorable, and Interactive 3D Worlds
HY-Motion model for 3D character animation generation
A PyTorch library for implementing flow matching algorithms
PyTorch code and models for DINOv2 self-supervised learning
A state-of-the-art open visual language model
Inference script for Oasis 500M
Foundation Models for Time Series
ICLR 2024 Spotlight: curation/training code, metadata, distribution
Memory-efficient and performant finetuning of Mistral's models
Large language model and vision-language model based on linear attention
Pokee Deep Research Model Open Source Repo
Advancing Formal Mathematical Reasoning via Reinforcement Learning
Repo of Qwen2-Audio chat & pretrained large audio language model
OCR expert VLM powered by Hunyuan's native multimodal architecture
The ChatGPT Retrieval Plugin lets you easily find personal documents
Release for Improved Denoising Diffusion Probabilistic Models
Official DeiT repository
Python example app from the OpenAI API quickstart tutorial
High-Resolution Image Synthesis with Latent Diffusion Models
Di♪♪Rhythm: Blazingly Fast & Simple End-to-End Song Generation