ChatGPT interface with better UI
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
Generate Any 3D Scene in Seconds
Release for Improved Denoising Diffusion Probabilistic Models
Hackable and optimized Transformers building blocks
PyTorch code and models for the DINOv2 self-supervised learning
Official implementation of DreamCraft3D
Controllable & emotion-expressive zero-shot TTS
Capable of understanding text, audio, vision, video
Tiny vision language model
This repository contains the official implementation of FastVLM
Pushing the Limits of Mathematical Reasoning in Open Language Models
Research code artifacts for Code World Model (CWM)
OCR expert VLM powered by Hunyuan's native multimodal architecture
A Unified Framework for Text-to-3D and Image-to-3D Generation
Diversity-driven optimization and large-model reasoning ability
Tool for exploring and debugging transformer model behaviors
GLM-4 series: Open Multilingual Multimodal Chat LMs
Foundation Models for Time Series
A Production-ready Reinforcement Learning AI Agent Library
tiktoken is a fast BPE tokeniser for use with OpenAI's models
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Large Multimodal Models for Video Understanding and Editing
Real-time behaviour synthesis with MuJoCo, using Predictive Control
Provides convenient access to the Anthropic REST API from any Python 3