Framework for building AI-powered interactive digital humans and agents
Open source NLP guide with models, methods, and real use cases
go1pylib is a Python library designed to control the Go1 robot
Open Source Speech Language Model
Multimodal embedding and reranking models built on Qwen3-VL
Multimodal-Driven Architecture for Customized Video Generation
Build Vision Agents quickly with any model or video provider
Lightning-fast, on-device TTS, running natively via ONNX
GenAI Processors is a lightweight Python library
Long-form streaming TTS system for multi-speaker dialogue generation
Foundation model for image generation
Search all of YouTube from the command line
A list of free LLM inference resources accessible via API
Qwen3-ASR is an open-source series of ASR models
The Python code to reproduce illustrations from Machine Learning Book
Python library for scraping and analyzing online news articles easily
Code and models for the ICML 2024 paper NExT-GPT
Extract audio and video content and organize it into a Markdown note
Implementation of AudioLM audio generation model in Pytorch
A Multi-Modal World Model for Reconstruction, Generation, and Simulation
Automated translation solution for visual novels
Autoregressive Model Beats Diffusion
StarVector is a foundation model for SVG generation
Foundational model for human-like, expressive TTS
Documentation for Google's Gen AI site - including Gemini API & Gemma