DeepMind model for tracking arbitrary points across videos & robotics
Code for Mesh R-CNN, ICCV 2019
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
Language modeling in a sentence representation space
GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning
Dataset of GPT-2 outputs for research in detection, biases, and more
The ChatGPT Retrieval Plugin lets you easily find personal documents
Designed for text embedding and ranking tasks
Chinese LLaMA-2 & Alpaca-2 Large Model Phase II Project
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training
Chat & pretrained large audio language model proposed by Alibaba Cloud
Qwen2.5-VL is the multimodal large language model series
A state-of-the-art open visual language model
Chinese and English multimodal conversational language model
Capable of understanding text, audio, vision, video
GLM-4 series: Open Multilingual Multimodal Chat LMs
High-Resolution Image Synthesis with Latent Diffusion Models
Di♪♪Rhythm: Blazingly Fast & Simple End-to-End Song Generation
Qwen2.5-Coder is the code version of Qwen2.5, the large language model
Code for the paper Hybrid Spectrogram and Waveform Source Separation
Powerful open source image generation model
Suite with Real-ESRGAN, BSRGAN, IRCNN, GFPGAN & RIFE. v4.3
Open-Source Financial Large Language Models!
A Conversational Speech Generation Model
Let us control diffusion models