An Open Source implementation of Notebook LM with more flexibility
Open-Sora: Democratizing Efficient Video Production for All
An open phone agent model & framework
Build Vision Agents quickly with any model or video provider
Provides line-oriented text file editing capabilities
Open source AI VTuber platform with voice chat and Live2D avatars
Pre-trained Deep Learning models and demos
Module for automatic summarization of text documents and HTML pages
Large Language Model Text Generation Inference
Oobabooga - The definitive Web UI for local AI, with powerful features
Document (PDF, Word, PPTX ...) extraction and parsing API
Agent Skill for generating 2D sprite sheets and maps as transparent PNGs
High-performance inference server and API layer for text embedding models
Hypernetworks that adapt LLMs for specific benchmark tasks
Ready-to-use OCR with 80+ supported languages
AI tool that removes hardcoded subtitles and text from videos locally
Comprehensive Gradio WebUI for audio processing
A GUI tool for extracting hard-coded subtitles (hardsubs) from videos
OCRmyPDF adds an OCR text layer to scanned PDF files
Focus on prompting and generating
TTS with Kokoro and ONNX Runtime
Python tool for converting files and office documents to Markdown
Awesome multilingual OCR toolkits based on PaddlePaddle
Qwen3-TTS is an open-source series of TTS models
The most powerful and modular diffusion model GUI, API, and backend