Powerful Android AI agent with tools, automation, and Linux shell
Fast multimodal LLM for real-time voice interaction and AI apps
Autoregressive Model Beats Diffusion
tiktoken is a fast BPE tokeniser for use with OpenAI's models
Diffusion Transformer with Fine-Grained Chinese Understanding
Real-time voice interactive digital human
OCR expert VLM powered by Hunyuan's native multimodal architecture
HY-Motion model for 3D character animation generation
21 Lessons to Get Started Building with Generative AI
LLM abstractions that aren't obstructions
Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD
Framework for building AI-powered interactive digital humans and agents
End-to-end speech processing toolkit
A TTS model capable of generating ultra-realistic dialogue
User toolkit for analyzing and interfacing with Large Language Models
AutoGluon: AutoML for Image, Text, and Tabular Data
Visual Causal Flow
Stable Diffusion built into Blender
AutoAgent: Fully-Automated and Zero-Code LLM Agent Framework
SOTA discrete acoustic codec models with 40/75 tokens per second
Sample code and notebooks for Generative AI on Google Cloud
Unified Multimodal Understanding and Generation Models
The ultimate RAG for your monorepo
Flexible Photo Recrafting While Preserving Your Identity
Bailing is a voice dialogue robot similar to GPT-4o