Data processing for and with foundation models
A Repo For Document AI
ReFT: Representation Finetuning for Language Models
Efficient few-shot learning with Sentence Transformers
The no-nonsense RAG chunking library
Training data (data labeling, annotation, workflow) for all data types
Han Language Processing
Toolkit for conversational AI
Stanford NLP Python library for many human languages
The Classical Language Toolkit
A full spaCy pipeline and models for scientific/biomedical documents
An LLM-powered knowledge curation system that researches topics
A natural language interface for computers
Large Language Model Text Generation Inference
ExtractThinker is a Document Intelligence library for LLMs
A Heterogeneous Benchmark for Information Retrieval
Efficient Retrieval Augmentation and Generation Framework
An easy-to-use LLM quantization package with user-friendly APIs
Neural Network Compression Framework for enhanced OpenVINO
Extract schema, statistics and entities from datasets
WikiChat is an improved RAG
Semantic search and workflows for medical/scientific papers
Easy-to-use and powerful NLP library with an awesome model zoo
Data and tools for generating and inspecting OLMo pre-training data
Sparsity-aware deep learning inference runtime for CPUs