Sparsity-aware deep learning inference runtime for CPUs
Industrial-strength Natural Language Processing (NLP)
Han Language Processing
Unified embedding model
ExtractThinker is a Document Intelligence library for LLMs
Efficient Retrieval Augmentation and Generation Framework
Stanford NLP Python library for many human languages
Pretrained model hub for Keras 3
Large Language Model Text Generation Inference
Training data (data labeling, annotation, workflow) for all data types
The Classical Language Toolkit
ReFT: Representation Finetuning for Language Models
Transformers4Rec is a flexible and efficient library
Hub of ready-to-use datasets for ML models
Evaluation code for various unsupervised automated metrics
Toolkit for conversational AI
Bring the notion of Model-as-a-Service to life
Extract schema, statistics and entities from datasets
Data and tools for generating and inspecting OLMo pre-training data
Obsei is a low-code, AI-powered automation tool
A full spaCy pipeline and models for scientific/biomedical documents
DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models
A Heterogeneous Benchmark for Information Retrieval
The no-nonsense RAG chunking library
A Repo For Document AI