Industrial-strength Natural Language Processing (NLP)
A natural language interface for computers
The Classical Language Toolkit
ReFT: Representation Finetuning for Language Models
Stanford NLP Python library for many human languages
Semantic search and workflows for medical/scientific papers
ExtractThinker is a Document Intelligence library for LLMs
The no-nonsense RAG chunking library
Data and tools for generating and inspecting OLMo pre-training data
C++ implementation of ChatGLM-6B, ChatGLM2-6B, ChatGLM3, and GLM4(V)
Extract schema, statistics and entities from datasets
Superlinked is a Python framework for AI Engineers
A Repo For Document AI
Large Language Model Text Generation Inference
Efficient Retrieval Augmentation and Generation Framework
An LLM-powered knowledge curation system that researches topics
The library to build & auto-optimize LLM applications
Trained models & code to predict toxic comments
Han Language Processing
A Heterogeneous Benchmark for Information Retrieval
Underthesea - Vietnamese NLP Toolkit
The most accurate natural language detection library for Python
Toolkit for conversational AI
Data processing for and with foundation models
Module for automatic summarization of text documents and HTML pages