Document (PDF, Word, PPTX, ...) extraction and parsing API
High-performance inference server and API layer for text embedding models
Robust Speech Recognition via Large-Scale Weak Supervision
OCR model for complex documents with layout-aware structured outputs
Contexts Optical Compression
Open source semantic search and text analytics for large document sets
Agent harness to make your slop code well-engineered and beautiful
A high-quality PDF-to-Markdown tool based on a large language model
AI-powered tool for generating, optimizing, and translating subtitles
Easily compute CLIP embeddings and build a CLIP retrieval system
Audiocraft is a library for audio processing and generation
Advanced NLP with spaCy: A free online course
End-to-end speech processing toolkit
Use Microsoft Edge's online text-to-speech service from Python
Lightning-fast, on-device TTS, running natively via ONNX
Framework for building realtime multimodal voice AI agent apps
Open toolkit of speech-to-speech AI models and pipelines by Hugging Face
Chinese XLNet pre-trained model
Bidirectional token-classification model for identifiable info
Shared repository for open-sourced projects from the Google AI Lang
NLTK Source
AI-Powered Data Processing: Use LOTUS to process all of your datasets
Instantly generate AI-powered subtitles on your device
A system for agentic LLM-powered data processing and ETL
Open source NLP guide with models, methods, and real use cases