ASCII art library for Python
AI-powered tool for generating, optimizing, and translating subtitles
Generating Immersive, Explorable, and Interactive 3D Worlds
State-of-the-art (SoTA) text-to-video pre-trained model
Unifying 3D Mesh Generation with Language Models
Python bindings for MuPDF's rendering library
Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model
A fast TTS architecture with conditional flow matching
Math OCR model that outputs LaTeX and markdown
Speech recognition module for Python
Implementation of Imagen, Google's Text-to-Image Neural Network
Python binding to the Apache Tika™ REST services
A community-supported supercharged version of paperless
The behavior guidance framework for customer-facing LLM agents
Python & command-line tool to gather text on the Web
Implementation of Phenaki Video, which uses MaskGIT
An Open Source text-to-speech system built by inverting Whisper
Generate blog articles from video or audio
A text-to-speech, speech-to-text and speech-to-speech library
Clone a voice in 5 seconds to generate arbitrary speech in real-time
Qwen-Image is a powerful image generation foundation model
Persian NLP Toolkit
A TTS that fits in your CPU (and pocket)
An open-source toolkit for monitoring Large Language Models (LLMs)
A Sublime Text 2/3 plugin to see git diff in gutter