A Systematic Framework for Interactive World Modeling
Wan2.1: Open and Advanced Large-Scale Video Generative Model
Official repository for LTX-Video
Multimodal Diffusion with Representation Alignment
Pokee Deep Research Model Open Source Repo
Open-source, high-performance AI model with advanced reasoning
RGBD video generation model conditioned on camera input
Large Multimodal Models for Video Understanding and Editing
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Contexts Optical Compression
High-Resolution 3D Assets Generation with Large Scale Diffusion Models
Python SDK for Claude Agent
Repo for SeedVR2 & SeedVR
FAIR Sequence Modeling Toolkit 2
Official implementation of Watermark Anything with Localized Messages
Example Discord bot written in Python that uses the completions API
Fast and Universal 3D reconstruction model for versatile tasks
The ChatGPT Retrieval Plugin lets you easily find personal documents
LLM-based reinforcement learning audio editing model
Chat & pretrained large vision language model
A SOTA open-source image editing model
Multi-modal large language model designed for audio understanding