On-premises, OCR-free unstructured data extraction
RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine
A community-supported supercharged version of paperless
Multi-tool for semantic search
Library for OCR-related tasks powered by Deep Learning
The official Python client for the Hugging Face Hub
Ready-to-use OCR with 80+ supported languages
Topic Modelling for Humans
Explainability and Interpretability to Develop Reliable ML Models
Solve end-to-end problems using the Llama model family
A very simple framework for state-of-the-art NLP
Running large language models on a single GPU
Python implementation of TextRank algorithms
An Open Toolkit for Knowledge Graph Extraction and Construction
A Unified Toolkit for Deep Learning Based Document Image Analysis
CPU/GPU inference server for Hugging Face transformer models
Repository to track the progress in Natural Language Processing (NLP)
Facilitating the design, comparison and sharing of deep text models
AiLearning, data analysis plus machine learning practice
DSTK - DataScience ToolKit for All of Us
Beautiful visualizations of how language differs among document types
A technical report on convolution arithmetic in deep learning
A machine learning system for supervised document classification