Unified Multimodal Understanding and Generation Models
GLM-4 series: Open Multilingual Multimodal Chat LMs
LLM-based reinforcement learning model for audio editing
A state-of-the-art open visual language model
tiktoken is a fast BPE tokeniser for use with OpenAI's models
Foundation model for image generation
Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
Ling is a MoE LLM provided and open-sourced by InclusionAI
Qwen2.5-VL is the multimodal large language model series
Diversity-driven optimization and large-model reasoning ability
Tongyi Deep Research, the Leading Open-source Deep Research Agent
Controllable & emotion-expressive zero-shot TTS
Global weather forecasting model using graph neural networks and JAX
Sharp Monocular Metric Depth in Less Than a Second
Tooling for the Common Objects In 3D dataset
Renderer for the harmony response format to be used with gpt-oss
Tiny vision language model
The official PyTorch implementation of Google's Gemma models
General-purpose image editing model that delivers high-fidelity results
Fast and Universal 3D reconstruction model for versatile tasks
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
Pushing the Limits of Mathematical Reasoning in Open Language Models
Open-source large language model family from Tencent Hunyuan
Chat & pretrained large vision language model
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
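Several entries above (tiktoken among them) revolve around BPE tokenisation. As a plain illustration of the core idea, not tiktoken's API, here is a minimal sketch of a single byte-pair-encoding merge step in pure Python; the function name and toy string are illustrative only:

```python
from collections import Counter

def bpe_merge_step(tokens):
    """One BPE merge: fuse the most frequent adjacent pair into a single token."""
    pairs = Counter(zip(tokens, tokens[1:]))
    if not pairs:
        return tokens
    best = max(pairs, key=pairs.get)  # most frequent adjacent pair
    merged, i = [], 0
    while i < len(tokens):
        # Greedy left-to-right scan: merge each occurrence of the best pair.
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == best:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

# Classic toy example: the most frequent pair ("a", "a") gets merged first.
tokens = bpe_merge_step(list("aaabdaaabac"))
# tokens == ["aa", "a", "b", "d", "aa", "a", "b", "a", "c"]
```

Real tokenisers repeat this step until a target vocabulary size is reached and then store the learned merges; fast implementations such as tiktoken precompute those merge tables rather than recounting pairs at encode time.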