An AI agent development platform with all-in-one visual tools
No-code multi-agent framework to build LLM Agents, workflows
Open-source platform for building enterprise-grade agents
Official implementation of DreamCraft3D
OCR expert VLM powered by Hunyuan's native multimodal architecture
Tongyi Deep Research, the Leading Open-source Deep Research Agent
Towards Real-World Vision-Language Understanding
Multimodal Diffusion with Representation Alignment
Converts text to speech in real time
Code for Cicero, an AI agent that plays the game of Diplomacy
A fast TTS architecture with conditional flow matching
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
Code for the paper "Language models can explain neurons in language models"
High-level, high-performance dynamic language for technical computing
Doom-based AI research platform for reinforcement learning
A GPU-accelerated library containing highly optimized building blocks
Local Lambda debug, CodeWhisperer, SAM/CFN syntax, etc.
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Concatenate a directory full of files into a single prompt
Flexible Photo Recrafting While Preserving Your Identity
Document Image Parsing via Heterogeneous Anchor Prompting
Framework for building neural networks
A secure sandbox environment for malware developers and red teamers
A Model Context Protocol server for searching and analyzing arXiv
ICLR2024 Spotlight: curation/training code, metadata, distribution