Stanford NLP Python library for many human languages
Data processing for and with foundation models
A Repo For Document AI
The Classical Language Toolkit
Extract schema, statistics and entities from datasets
ReFT: Representation Finetuning for Language Models
The no-nonsense RAG chunking library
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)
Superlinked is a Python framework for AI Engineers
A full spaCy pipeline and models for scientific/biomedical documents
Efficient few-shot learning with Sentence Transformers
An LLM-powered knowledge curation system that researches topics
ExtractThinker is a Document Intelligence library for LLMs
A Heterogeneous Benchmark for Information Retrieval
A natural language interface for computers
WikiChat is an improved RAG
The most accurate natural language detection library for Python
Large Language Model Text Generation Inference
Efficient Retrieval Augmentation and Generation Framework
An easy-to-use LLM quantization package with user-friendly APIs
Semantic search and workflows for medical/scientific papers
Neural Network Compression Framework for enhanced OpenVINO inference
Underthesea - Vietnamese NLP Toolkit
Toolkit for conversational AI
Data and tools for generating and inspecting OLMo pre-training data