RGBD video generation model conditioned on camera input
Code for running inference with the SAM 3D Body Model (3DB)
GLM-4 series: Open Multilingual Multimodal Chat LMs
Block Diffusion for Ultra-Fast Speculative Decoding
LTX-Video Support for ComfyUI
Official repository for LTX-Video
Models for object and human mesh reconstruction
The official repo of the Qwen chat & pretrained large language models
Bidirectional token-classification model for identifiable info
Repo for SeedVR2 & SeedVR
Video Object and Interaction Deletion
Accurate × Fast × Comprehensive
The ChatGPT Retrieval Plugin lets you easily find personal documents
Qwen-Image-Layered: Layered Decomposition for Inherent Editability
Contexts Optical Compression
Sharp Monocular Metric Depth in Less Than a Second
Project Lyra: Open Generative 3D World Models
Open-source deep-learning framework
State-of-the-art TTS model under 25MB
PyTorch code and models for the DINOv2 self-supervised learning
GLM-4-Voice | End-to-End Chinese-English Conversational Model
A Customizable Image-to-Video Model based on HunyuanVideo
An AI-powered security review GitHub Action using Claude
GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning
Python bindings for llama.cpp