New family of code large language models (LLMs)
Reproduction of Poetiq's record-breaking submission to the ARC-AGI-1
DeepMind model for tracking arbitrary points across videos & robotics
CodeGeeX2: A More Powerful Multilingual Code Generation Model
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
FAIR Sequence Modeling Toolkit 2
Qwen3-Omni is a natively end-to-end, omni-modal LLM
Capable of understanding text, audio, vision, video
A state-of-the-art open visual language model
Large Multimodal Models for Video Understanding and Editing
LLM-based Reinforcement Learning audio edit model
Chat & pretrained large audio language model proposed by Alibaba Cloud
Qwen2.5-Coder is the code version of Qwen2.5, the large language model
Official code for Style Aligned Image Generation via Shared Attention
Code for the paper Hybrid Spectrogram and Waveform Source Separation
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech)
A latent text-to-image diffusion model
Reference implementation of the Transformer architecture optimized
Per-Pixel Classification is Not All You Need for Semantic Segmentation
Tencent’s 36-language state-of-the-art translation model