RGBD video generation model conditioned on camera input
Dramatron uses large language models to generate coherent scripts
Industrial-level controllable zero-shot text-to-speech system
Synchronized Translation for Videos
Memory engine and app that is extremely fast and scalable
A high-performance distributed file system
From Images to High-Fidelity 3D Assets
Open source full-stack AI vibe coding platform & web app generator
A nearly-live implementation of OpenAI's Whisper
Controllable & emotion-expressive zero-shot TTS
Reproduction of Poetiq's record-breaking submission to ARC-AGI-1
Contexts Optical Compression
Kubernetes-native framework for building AI agents
A single Gradio + React WebUI with extensions for ACE-Step
An open-source text-to-speech system built by inverting Whisper
ByteHook is an Android PLT hook library
Fast backend for long-term AI user memory via structured profiles
Python SDK for Claude Agent
Minimal Claude Code alternative. Single Python file, zero dependencies
Machine Learning Systems: Design and Implementation
An alignment auditing agent capable of exploring alignment hypotheses
NVIDIA Federated Learning Application Runtime Environment
Open source text-to-speech tool, supports extra-long text
Code to accompany "A Method for Animating Children's Drawings"
Real-World Centric Foundation GUI Agents