Implementation of Imagen, Google's Text-to-Image Neural Network
A simple, high-quality voice conversion tool focused on ease of use
Claude Code skill implementing Manus-style persistent planning
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
Easy-to-use and powerful NLP library with an awesome model zoo
Transforming Multimodal Content into Captivating Multilingual Audio
Tools to ease the creation of snippets, syntax definitions, etc.
A simple native web interface that uses ChatTTS to synthesize text
CLIP: predict the most relevant text snippet given an image
Label Studio is a multi-type data labeling and annotation tool
Statusline plugin for vim with prompts for several other applications
A nearly-live implementation of OpenAI's Whisper
Framework for building realtime multimodal voice AI agent apps
A high-quality rapid TTS voice cloning model
Free, high-quality text-to-speech API endpoint to replace OpenAI
Math OCR model that outputs LaTeX and markdown
Full git and GitHub integration with Sublime Text
A Powerful Native Multimodal Model for Image Generation
Deep Research framework, combining language models with tools
Industrial-level controllable zero-shot text-to-speech system
Easy-to-use Python library for creating 2D arcade games
A general-purpose syntax highlighter in pure Go
Spark-TTS Inference Code
Library for OCR-related tasks powered by Deep Learning
A Model Context Protocol (MCP) server