Controllable & emotion-expressive zero-shot TTS
The common language for platforms, agents and businesses.
Real-World Centric Foundation GUI Agents
Context data platform for building observable, self-learning AI agents
Democratizing Reinforcement Learning for LLMs
Generate blog articles from video or audio
Reproduction of Poetiq's record-breaking submission to the ARC-AGI-1
SOTA discrete acoustic codec models with 40/75 tokens per second
One-click deployment (including offline integration package)
A TTS model capable of generating ultra-realistic dialogue
Self-hosted & open-source anonymous 360 review software
A collection of learning resources for curious software engineers
An Efficient, Scalable, Multi-Modality RL Training Framework
An SSH/Telnet/Serial client in your browser
Pokee Deep Research Model Open Source Repo
Unified Multimodal Understanding and Generation Models
Volcano Engine Reinforcement Learning for LLMs
An alignment auditing agent capable of exploring alignment hypotheses
Expose your FastAPI endpoints as Model Context Protocol (MCP) tools
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
Collection of common code shared among different research projects
Language modeling in a sentence representation space
Renderer for the harmony response format to be used with gpt-oss
A Powerful Native Multimodal Model for Image Generation
Implementation of the Surya Foundation Model for Heliophysics