Chinese LLaMA-2 & Alpaca-2 Large Model Phase II Project
The no-nonsense RAG chunking library
Data and tools for generating and inspecting OLMo pre-training data
List of useful data augmentation resources
Data processing for and with foundation models
DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models
Extract schema, statistics and entities from datasets
Hub of ready-to-use datasets for ML models
A Deep Neural Text Understanding Framework
Implementation of research papers on Deep Learning, NLP, and CV in Python
A library for deep learning end-to-end dialog systems and chatbots
Sparsity-aware deep learning inference runtime for CPUs
Deep-learning-based natural language and speech processing platform
Language, engine, and tooling for testing composable language rules
rational agent
Explain, analyze, and visualize NLP language models
An interpretable and efficient predictor using pre-trained models
English-Khmer Automatic Statistical Machine Translation (SMT)
ExtractThinker is a Document Intelligence library for LLMs
JSON-based text search Java project