GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Large Multimodal Models for Video Understanding and Editing
GLM-4-Voice | End-to-End Chinese-English Conversational Model
Pushing the Limits of Mathematical Reasoning in Open Language Models
LLM-based reinforcement learning audio editing model
GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning
Official code base for LeWorldModel: Stable End-to-End Joint-Embedding
Inference code for scalable emulation of protein equilibrium ensembles
Fast-stable-diffusion + DreamBooth
Hunyuan Translation Model Version 1.5
Tool for exploring and debugging transformer model behaviors
Fast and Universal 3D reconstruction model for versatile tasks
Stable Diffusion with Core ML on Apple Silicon
The ChatGPT Retrieval Plugin lets you easily find personal documents
Open-source large language model family from Tencent Hunyuan
Open Source Speech Language Model
Open-source industrial-grade ASR models
Ling-V2 is a MoE LLM provided and open-sourced by InclusionAI
A SOTA open-source image editing model
NVIDIA Isaac GR00T N1.5 is the world's first open foundation model
Open-weight, large-scale hybrid-attention reasoning model
OCR expert VLM powered by Hunyuan's native multimodal architecture
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
Open-source framework for intelligent speech interaction
The official PyTorch implementation of Google's Gemma models