The Classical Language Toolkit
Sparsity-aware deep learning inference runtime for CPUs
Han Language Processing
Industrial-strength Natural Language Processing (NLP)
Efficient Retrieval Augmentation and Generation Framework
Unified embedding model
Large Language Model Text Generation Inference
Evaluation code for various unsupervised automated metrics
Toolkit for conversational AI
Data and tools for generating and inspecting OLMo pre-training data
ReFT: Representation Finetuning for Language Models
Training data (data labeling, annotation, workflow) for all data types
Pretrained model hub for Keras 3
ExtractThinker is a Document Intelligence library for LLMs
Obsei is a low-code, AI-powered automation tool
A full spaCy pipeline and models for scientific/biomedical documents
A repository for Document AI
Transformers4Rec is a flexible and efficient library
DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models
A tool for learning vector representations of words and entities
Easy-to-use and powerful NLP library with an awesome model zoo
Extract schema, statistics and entities from datasets
Build AI-powered semantic search applications
An LLM-powered knowledge curation system that researches topics
The no-nonsense RAG chunking library