PDF to Markdown with vision models
Open speech-to-speech models and pipelines by Hugging Face
Framework for building real-time multimodal voice AI agent apps
TextWorld is a sandbox learning environment for the training
Shared repository for open-sourced projects from the Google AI Lang
Dealing with all unstructured data, such as reverse image search
Document content and metadata extraction microservice
Bidirectional token-classification model for identifiable info
NLTK Source
AI-Powered Data Processing: Use LOTUS to process all of your datasets
Stable Diffusion web UI
A system for agentic LLM-powered data processing and ETL
Open source NLP guide with models, methods, and real use cases
Efficient few-shot learning with Sentence Transformers
Stanford NLP Python library for many human languages
A Model Context Protocol (MCP) server
Stable Diffusion built-in to Blender
OpenRecall is a fully open-source, privacy-first alternative
Open Source Speech Language Model
Low-latency AI inference engine optimized for mobile devices
Cloud-native open source data warehouse for analytics and AI queries
SQL-Driven RAG Engine
Open-source multi-speaker long-form text-to-speech model
Framework for building real-time voice and multimodal AI agents
Towards Human-Sounding Speech