Diversity-driven optimization and large-model reasoning ability
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Large Multimodal Models for Video Understanding and Editing
Open-weight, large-scale hybrid-attention reasoning model
Large language model and vision-language model based on linear attention
Capable of understanding text, audio, vision, video
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training
FlashMLA: Efficient Multi-head Latent Attention Kernels
Stable Diffusion with Core ML on Apple Silicon
Python example app from the OpenAI API quickstart tutorial
Towards Real-World Vision-Language Understanding
Pushing the Limits of Mathematical Reasoning in Open Language Models
The ChatGPT Retrieval Plugin lets you easily find personal documents
Chat & pretrained large audio language model proposed by Alibaba Cloud
Chat & pretrained large vision language model
High-Resolution Image Synthesis with Latent Diffusion Models
Release for Improved Denoising Diffusion Probabilistic Models
Qwen2.5-Coder is the code version of Qwen2.5, the large language model
Open-source, high-performance Mixture-of-Experts large language model
Official DeiT repository
A Conversational Speech Generation Model
Di♪♪Rhythm: Blazingly Fast & Simple End-to-End Song Generation
Open Multilingual Multimodal Chat LMs
Powerful open-source image generation model