CogView4, CogView3-Plus and CogView3 (ECCV 2024)
LTX-Video Support for ComfyUI
Easy Docker setup for Stable Diffusion with user-friendly UI
FAIR Sequence Modeling Toolkit 2
Stable Diffusion with Core ML on Apple Silicon
The Clay Foundation Model - An open source AI model and interface
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Large Multimodal Models for Video Understanding and Editing
Unified Multimodal Understanding and Generation Models
GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning
Math-specific large language models in our Qwen2 series
Tiny vision language model
Programmatic access to the AlphaGenome model
Open Source Speech Language Model
Open-source industrial-grade ASR models
Recovering the Visual Space from Any Views
Advancing Open-source World Models
Hunyuan Translation Model Version 1.5
A SOTA open-source image editing model
A Production-ready Reinforcement Learning AI Agent Library
GPT-4V-level open-source multimodal model based on Llama3-8B
Open-source large language model family from Tencent Hunyuan
Ling-V2 is a MoE LLM provided and open-sourced by InclusionAI
A Pragmatic VLA Foundation Model