A Systematic Framework for Interactive World Modeling
GLM-Image: Auto-regressive for Dense-knowledge and High-fidelity Image Generation
Wan2.1: Open and Advanced Large-Scale Video Generative Model
Official repository for LTX-Video
Multimodal Diffusion with Representation Alignment
Open-source, high-performance AI model with advanced reasoning
Pokee Deep Research Model Open Source Repo
RGBD video generation model conditioned on camera input
Large Multimodal Models for Video Understanding and Editing
Safety reasoning models built upon gpt-oss
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Contexts Optical Compression
High-Resolution 3D Assets Generation with Large Scale Diffusion Models
Python SDK for Claude Agent
Repo for SeedVR2 & SeedVR
FAIR Sequence Modeling Toolkit 2
Official implementation of Watermark Anything with Localized Messages
Fast and Universal 3D reconstruction model for versatile tasks
Example Discord bot written in Python that uses the completions API
Instructions on how to use the Realtime API on Microcontrollers
The ChatGPT Retrieval Plugin lets you easily find personal documents
LLM-based Reinforcement Learning audio edit model