An open source implementation of NotebookLM with more flexibility
Open-Sora: Democratizing Efficient Video Production for All
Open source semantic search and text analytics for large document sets
An open phone agent model & framework
Recognition and resolution of numbers, units, date/time, etc.
Provides line-oriented text file editing capabilities
Build Vision Agents quickly with any model or video provider
Open source AI VTuber platform with voice chat and Live2D avatars
Text Generator is a handy plugin for Obsidian
Pre-trained Deep Learning models and demos
Module for automatic summarization of text documents and HTML pages
Large Language Model Text Generation Inference
Oobabooga - The definitive Web UI for local AI, with powerful features
A playground to generate images from any text prompt using Stable Diffusion
Document (PDF, Word, PPTX, ...) extraction and parsing API
A pure JavaScript multilingual OCR
Agent Skill for generating 2D sprite sheets and maps as transparent PNGs
High-performance inference server and API layer for text embedding models
Hypernetworks that adapt LLMs for specific benchmark tasks
JavaScript OCR and text extraction for images and PDFs
Open Source OCR Engine
Speech-to-text, text-to-speech, and speaker recognition
Comprehensive Gradio WebUI for audio processing
A free, open source, and extensible speech-to-text application
OCRmyPDF adds an OCR text layer to scanned PDF files