CLIP: predict the most relevant text snippet given an image
Transforming Multimodal Content into Captivating Multilingual Audio
Statusline plugin for vim with prompts for several other applications
A simple native web interface that uses ChatTTS to synthesize text
Tools to ease the creation of snippets, syntax definitions, etc.
Label Studio is a multi-type data labeling and annotation tool
A nearly-live implementation of OpenAI's Whisper
Framework for building real-time multimodal voice AI agent apps
A high-quality rapid TTS voice cloning model
Math OCR model that outputs LaTeX and Markdown
Free, high-quality text-to-speech API endpoint to replace OpenAI
Full Git and GitHub integration with Sublime Text
A Powerful Native Multimodal Model for Image Generation
Industrial-level controllable zero-shot text-to-speech system
Easy-to-use Python library for creating 2D arcade games
Deep Research framework, combining language models with tools
Spark-TTS Inference Code
Library for OCR-related tasks powered by Deep Learning
A Model Context Protocol (MCP) server
A library for converting HTML into PDFs using ReportLab
Compute distance between sequences
Snippet solution for Vim
Instagram OSINT tool for gathering profile data and public posts
Framework for building real-time voice and multimodal AI agents
SoTA open-source TTS