Recognition and resolution of numbers, units, date/time, etc.
Large Language Model Text Generation Inference
Document (PDF, Word, PPTX ...) extraction and parsing API
Module for automatic summarization of text documents and HTML pages
High-performance inference server and API layer for text embedding models
Stanford CoreNLP, a Java suite of core NLP tools
AI tool that removes hardcoded subtitles and text from videos locally
Modest natural-language processing
Connect MATLAB to LLM APIs, including OpenAI® Chat Completions
Persian NLP Toolkit
Han Language Processing
Robust Speech Recognition via Large-Scale Weak Supervision
A full spaCy pipeline and models for scientific/biomedical documents
General natural language facilities for Node.js
Underthesea - Vietnamese NLP Toolkit
Open source healthcare AI
OCR model for complex documents with layout-aware structured outputs
Text mining using tidy tools
The pluggable natural language linter for text and markdown
Contexts Optical Compression
OCRmyPDF adds an OCR text layer to scanned PDF files
Comprehensive Gradio WebUI for audio processing
Toolkit for conversational AI
The most accurate natural language detection library for Python
A persistent, network-resilient, full-text search library