A security scanner for custom LLM applications
Generative AI reference workflows
The official repository for ERNIE 4.5 and ERNIEKit
Context management for Claude Code. Hooks maintain state via ledgers
Real-World Centric Foundation GUI Agents
Democratizing Reinforcement Learning for LLMs
Generate blog articles from video or audio
Open-sourced unified customization model
Reproduction of Poetiq's record-breaking submission to ARC-AGI-1
Collections of robotics environments
DeepMind model for tracking arbitrary points across videos & robotics
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
Request recommended movies, TV shows and anime through Jellyseerr/Overseerr
This repo contains the code for a 1D tokenizer and generator
Bailing is a voice dialogue robot similar to GPT-4o
Reading book source
Interface for OuteTTS models
Plug-and-play library to enable agents to call MCP and UTCP tools
Get started with building Fullstack Agents using Gemini 2.5 & LangGraph
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
High-Fidelity and Controllable Generation of Textured 3D Assets
An open-source text-to-speech system built by inverting Whisper
Towards Human-Sounding Speech
ComfyUI integration for Microsoft's VibeVoice text-to-speech model
GUI Exploration Lab. One of the best GUI agent solutions