Underthesea - Vietnamese NLP Toolkit
The most accurate natural language detection library for Python
Extract schema, statistics and entities from datasets
Stanford NLP Python library for many human languages
A coding-free framework built on PyTorch
Code repo for WebArena, an environment for building autonomous agents
Easy-to-use and high-performance NLP and LLM framework
DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models
A tool for learning vector representations of words and entities
Trained models & code to predict toxic comments
Data and tools for generating and inspecting OLMo pre-training data
Fast and customizable framework for automatic ML model creation
Efficient Retrieval Augmentation and Generation Framework
A Heterogeneous Benchmark for Information Retrieval
A full spaCy pipeline and models for scientific/biomedical documents
Libraries for applying sparsification recipes to neural networks
The no-nonsense RAG chunking library
An easy-to-use LLM quantization package with user-friendly APIs
An LLM-powered knowledge curation system that researches topics
ReFT: Representation Finetuning for Language Models
Neural Network Compression Framework for enhanced OpenVINO inference
OpenAI-style API for open large language models
Large Language Model Text Generation Inference
Data loaders and abstractions for text and NLP
Transformers4Rec is a flexible and efficient library