This repo contains the code for a 1D tokenizer and generator
A framework to enable multimodal models to operate a computer
Witness the aha moment of VLM with less than $3
An open phone agent model & framework
The most powerful Android RPA agent framework
LTX-Video Support for ComfyUI
Reference PyTorch implementation and models for DINOv3
Unified Multimodal Understanding and Generation Models
Extensible workflow development framework
The library to build & auto-optimize LLM applications
Just a Better Chatbot. Powered by MCP Client & Workflows
Generating Immersive, Explorable, and Interactive 3D Worlds
SAPIEN Manipulation Skill Framework
Python inference and LoRA trainer package for the LTX-2 audio–video
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Lightning fast C++/CUDA neural network framework
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
Open source MVVM framework for Web Apps
Open-source framework for conversational voice AI agents
Interactively analyze ML models to understand their behavior
Programmatically build custom agentic workflows, AI Agents, RAG system
Virtual AI anchor that combines state-of-the-art technology
Taming Stable Diffusion for Lip Sync
ICLR 2024 Spotlight: curation/training code, metadata, distribution
[CVPR 2025 Best Paper Award] VGGT