Audiocraft is a library for audio processing and generation
Context-aware desktop AI assistant that understands screen content
Tools for manipulating datasets
OCR model for complex documents with layout-aware structured outputs
Offline text-to-speech synthesis for Python
Persian NLP Toolkit
Offline inference engine for art, real-time voice conversations
Converts text to speech in real time
Stable Diffusion WebUI optimized for AMD GPUs with editing tools
A Unified Framework for Text-to-3D and Image-to-3D Generation
Framework for building real-time multimodal voice AI agent apps
Handwritten Text Recognition (HTR) system implemented with TensorFlow
Over 425 terminal color schemes/themes for iTerm/iTerm2
Official MiniMax Model Context Protocol (MCP) server
Open-Sora: Democratizing Efficient Video Production for All
Industrial-level controllable zero-shot text-to-speech system
CineCLI is a cross-platform command-line movie browser
Powerful Android AI agent with tools, automation, and Linux shell
toot - Mastodon CLI & TUI
Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles
Qwen3-ASR is an open-source series of ASR models
LLM abstractions that aren't obstructions
Stanford NLP Python library for many human languages
Lightweight Markdown-only skills for autonomous ML research
RAG-Anything: All-in-One RAG Framework