Search Results for "document tracking system"

Showing 256 open source projects for "document tracking system"

1

WeKnora

LLM framework for document understanding and semantic retrieval

...This approach enables the system to provide more reliable answers by grounding model reasoning in the content of uploaded documents. WeKnora is designed with a modular architecture that separates components for document processing, search strategies, and model inference, allowing developers to customize or extend different parts of the pipeline. It supports knowledge base management and conversational question answering built on top of structured and unstructured documents.

Downloads: 6 This Week

Last Update: 2026-05-01
2

Paperless-AI

AI-powered document analysis and tagging for Paperless-ngx

Paperless-AI is an AI-powered extension designed to enhance document management within Paperless-ngx by automating analysis, classification, and organization tasks. It continuously monitors incoming documents and processes them using various AI backends, enabling automatic assignment of titles, tags, document types, and correspondents. It integrates with multiple OpenAI-compatible services as well as local models, giving users flexibility in how document intelligence is handled. A key...

Downloads: 7 This Week

Last Update: 2026-03-17
3

docext

An on-premises, OCR-free unstructured data extraction

docext is a document intelligence toolkit that uses vision-language models to extract structured information from documents such as PDFs, forms, and scanned images. The system is designed to operate entirely on-premises, allowing organizations to process sensitive documents without relying on external cloud services. Unlike traditional document processing pipelines that rely heavily on optical character recognition, docext leverages multimodal AI models capable of understanding both visual and textual information directly from document images. ...

Downloads: 3 This Week

Last Update: 2026-03-12
4

Papermerge

Open Source Document Management System for Digital Archives

Papermerge is an open source document management system (DMS) designed primarily for archiving and retrieving your digital documents. Instead of keeping piles of paper documents all over your desk, office, or drawers, you can scan them quickly and configure your scanner to upload directly to Papermerge DMS. Store, organize, and index scanned documents in PDF, JPEG, and TIFF formats.

Downloads: 17 This Week

Last Update: 2025-07-24
5

WiFi DensePose

Turn WiFi signals into real-time human pose estimation and detection

WiFi DensePose is a production-oriented implementation of a WiFi-based human pose estimation system that enables real-time full-body tracking using wireless signals rather than cameras. The project demonstrates how commodity mesh routers and signal processing techniques can be leveraged to infer dense human pose information, even through obstacles such as walls. It is designed to showcase the emerging field of RF-based sensing, where machine learning models interpret wireless channel data to reconstruct human movement and posture. ...

Downloads: 164 This Week

Last Update: 3 days ago
6

BoxMOT

Pluggable SOTA multi-object tracking modules for segmentation

...The framework supports integration with detection, segmentation, and pose estimation models that produce bounding box outputs. It also includes evaluation tools and benchmarking pipelines that allow researchers to test tracking performance on standard datasets such as MOT17 and MOT20. The system offers different performance modes that balance computational efficiency with tracking accuracy depending on the application requirements.

Downloads: 1 This Week

Last Update: 2026-04-22
7

Hallucination Leaderboard

Leaderboard Comparing LLM Performance at Producing Hallucinations

Hallucination Leaderboard is an open research project that tracks and compares the tendency of large language models to produce hallucinated or inaccurate information when generating summaries. The project provides a standardized benchmark that evaluates different models using a dedicated hallucination detection system known as the Hallucination Evaluation Model. Each model is tested on document summarization tasks to measure how often generated responses introduce information that is not supported by the original source material. The results are published as a leaderboard that allows researchers and developers to compare model reliability and factual consistency. ...

Downloads: 1 This Week

Last Update: 2026-04-29
8

Krixik

Documentation for the Krixik Python client

Small/specialized AI models are an oft-necessary complement—or alternative—to "big AI" offerings. However, infrastructure for small AI tends to be underwhelming, so building with specialized AI can be difficult, time-consuming, and even expensive. Iterating with different models, and particularly with different combinations of these models, can thus be rendered unfeasible.

Downloads: 0 This Week

Last Update: 2024-11-05
9

text-extract-api

Document (PDF, Word, PPTX ...) extraction and parse API

text-extract-api is an open-source service designed to extract readable text from a wide variety of document formats through a simple API interface. The project focuses on converting complex files such as PDFs, images, scanned documents, and office files into structured plain text that can be processed by downstream applications or language models. Instead of requiring developers to integrate multiple document parsing libraries individually, the system centralizes text extraction capabilities into a unified API that standardizes the output. ...

Downloads: 1 This Week

Last Update: 2026-03-05
10

OpenLIT

OpenLIT is an open-source LLM Observability tool

OpenLIT is an OpenTelemetry-native tool designed to help developers gain insights into the performance of their LLM applications in production. It automatically collects LLM input and output metadata and monitors GPU performance for self-hosted LLMs. OpenLIT makes integrating observability into GenAI projects effortless with just a single line of code. Whether you're working with popular LLM providers such as OpenAI and HuggingFace, or leveraging vector databases like ChromaDB, OpenLIT...

Downloads: 5 This Week

Last Update: 2 days ago
11

RAG Anything

RAG-Anything: All-in-One RAG Framework

...The system uses a multi-stage pipeline (e.g., document parsing, content analysis, knowledge graph construction, intelligent retrieval) so queries can navigate across modalities with deeper understanding and relevance.

Downloads: 4 This Week

Last Update: 5 days ago
12

h2oGPT

Private chat with local GPT with document, images, video, etc.

h2oGPT is an open-source platform that allows users to interact with local GPT models in a completely private environment. It supports a variety of document types, including PDFs, Word files, images, video frames, and even audio, enabling users to query and analyze their documents or engage in a private chat with AI. The platform is designed to be secure and offline, ensuring that all data remains private and under the user's control. h2oGPT supports several AI models, including oLLaMa and...

Downloads: 2 This Week

Last Update: 2025-02-22
13

SwanLab

An open-source, modern-design AI training tracking and visualization

SwanLab is an open-source experiment tracking and visualization platform designed to help machine learning engineers monitor, compare, and analyze the training of artificial intelligence models. The tool records training metrics, hyperparameters, model outputs, and experiment configurations so that developers can easily understand how different experiments perform over time.

Downloads: 7 This Week

Last Update: 5 days ago
14

Agent SOP

Natural language workflows for AI agents

...It defines reusable SOP templates that agents can instantiate with context-specific parameters, allowing organizations to codify best practices for customer support, data processing, document workflows, or incident response. The framework supports monitoring and state tracking, so external systems can observe progress, intervene if necessary, and log outcomes for compliance or auditing. Integrations with common messaging and task orchestration systems enable SOP agents to interact with email, ticket queues, and databases as part of their workflows.

Downloads: 4 This Week

Last Update: 2026-04-10
15

owllook

Vertical novel search engine with unified reading and tracking tools

...Owllook also includes functionality for tracking reading history, displaying rankings based on search activity, and recommending books using a similarity-based approach. Owllook is built using asynchronous technologies to support efficient data retrieval and responsive interactions while reading or searching.

Downloads: 5 This Week

Last Update: 3 days ago
16

Salt

Automate the management and configuration of infrastructures at scale

...What systems and infrastructure can be managed by a Salt Minion? Salt runs on and manages many versions of Linux, Windows, Mac OS X and UNIX. The Salt Supported Operating System document defines the specific operating systems that are fully supported and outlines the package creation policy for each operating system listed. The document also outlines the best-effort support policy for additional operating systems. Salt Bootstrap is a shell script that detects the target platform and selects the best installation method.

Downloads: 41 This Week

Last Update: 4 days ago
17

Sparrow

Structured data extraction and instruction calling with ML, LLM

Sparrow is an open-source platform designed to extract structured information from documents, images, and other unstructured data sources using machine learning and large language models. The system focuses on transforming complex documents such as invoices, receipts, forms, and scanned pages into structured formats like JSON that can be processed by downstream applications. It combines several components, including OCR pipelines, vision-language models, and LLM-based reasoning modules to identify and extract meaningful data fields from heterogeneous document layouts. ...

Downloads: 4 This Week

Last Update: 2026-03-04
18

Data Version Control

Git-based data version control for machine learning workflows

...Instead of storing large datasets directly in Git, DVC keeps lightweight metadata in the repository while storing the actual data in external storage systems. This approach allows teams to manage large files efficiently while maintaining a clear history of changes to data and models. DVC also provides a pipeline system that defines the stages of machine learning workflows, making experiments reproducible and easier to manage. By tracking dependencies between code, data, and parameters, the system ensures that only the necessary stages are re-run when changes occur. DVC also includes experiment tracking capabilities that allow users to compare different training runs.

Downloads: 6 This Week

Last Update: 2026-03-31
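
The dependency tracking described for Data Version Control can be illustrated with a small, self-contained sketch. This is not DVC's code or API: the `fingerprint` and `stages_to_rerun` helpers, the stage tuples, and the file names are all hypothetical, and the only idea shown is that a stage is re-run when the hash of its dependencies changes or when an upstream stage it depends on was re-run.

```python
import hashlib


def fingerprint(values):
    """Hash a mapping of dependency name -> content into one digest."""
    h = hashlib.sha256()
    for name in sorted(values):
        h.update(name.encode())
        h.update(b"\0")
        h.update(values[name].encode())
        h.update(b"\0")
    return h.hexdigest()


def stages_to_rerun(stages, contents, lock):
    """Return the names of stale stages, in pipeline order.

    stages:   list of (name, deps, outs) tuples, already topologically ordered
    contents: current content of every code, data, or parameter item
    lock:     digests recorded on the previous run, keyed by stage name
    """
    stale, dirty_outputs = [], set()
    for name, deps, outs in stages:
        digest = fingerprint({d: contents[d] for d in deps})
        # Stale if its own inputs changed or an upstream output will be rebuilt.
        if lock.get(name) != digest or dirty_outputs.intersection(deps):
            stale.append(name)
            dirty_outputs.update(outs)
    return stale


# Hypothetical three-stage pipeline (all names are made up for illustration).
stages = [
    ("prepare", ["raw.csv", "prepare.py"], ["clean.csv"]),
    ("train", ["clean.csv", "train.py", "lr"], ["model.pkl"]),
    ("report", ["notes.md"], ["report.html"]),
]
contents = {
    "raw.csv": "a,b\n1,2\n",
    "prepare.py": "v1",
    "clean.csv": "a,b\n1,2\n",
    "train.py": "v1",
    "lr": "0.01",
    "notes.md": "draft",
}
# Record digests as if every stage had just run successfully.
lock = {name: fingerprint({d: contents[d] for d in deps}) for name, deps, _ in stages}

contents["lr"] = "0.1"  # change one parameter
print(stages_to_rerun(stages, contents, lock))  # ['train']
```

Changing `raw.csv` instead would mark both `prepare` and `train` as stale, because `train` consumes an output that `prepare` rebuilds, while `report` would still be skipped.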
19

DocETL

A system for agentic LLM-powered data processing and ETL

DocETL is an open-source system designed to build and execute data processing pipelines powered by large language models, particularly for analyzing complex collections of documents and unstructured datasets. The platform allows developers and researchers to construct structured workflows that extract, transform, and organize information from sources such as reports, transcripts, legal documents, and other text-heavy data.

Downloads: 5 This Week

Last Update: 2026-03-05
20

dots.ocr

Multilingual Document Layout Parsing in a Single Vision-Language Model

dots.ocr is a cutting-edge multilingual document parsing system built on a unified vision-language model that combines layout detection, text recognition, and structural understanding into a single architecture. Unlike traditional OCR pipelines that rely on multiple specialized components, dots.ocr integrates these processes end-to-end, reducing error propagation and improving consistency across tasks.

Downloads: 0 This Week

Last Update: 2026-03-24
21

ShredOS

ShredOS Disk Eraser 64 bit for all Intel 64 bit processors

ShredOS is a lightweight, bootable Linux-based operating system designed specifically for secure disk erasure and data destruction. It enables users to permanently wipe hard drives, SSDs, and NVMe devices using the powerful nwipe utility and multiple industry-recognized wiping methods. Compatible with both BIOS and UEFI systems, ShredOS supports PCs, servers, and Intel-based Macs running on 32-bit and 64-bit processors. The platform can erase multiple drives simultaneously while generating...

Downloads: 441 This Week

Last Update: 2026-04-02
22

Frappe

Low code web framework for real world applications

Frappe is a full-stack, low-code web framework written in Python and JavaScript, used to build scalable and modular enterprise applications. It powers ERPNext and includes tools for REST APIs, user management, document modeling, workflows, and real-time updates. Frappe uses a "model-view-controller" approach with its own ORM and frontend system, enabling rapid development without sacrificing control or performance.

Downloads: 7 This Week

Last Update: 4 days ago
23

Semantra

Multi-tool for semantic search

...The software analyzes text and PDF documents stored locally and creates embeddings that allow queries to retrieve results based on conceptual similarity. It is primarily intended for individuals who need to extract insights from large document collections, including researchers, journalists, students, and historians. The system runs from the command line and automatically launches a local web interface where users can perform interactive searches and examine document passages related to a query. By relying on semantic embeddings and contextual analysis, the tool can identify passages that are relevant even when the query uses different wording than the source documents.

Downloads: 0 This Week

Last Update: 2026-03-11
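
The idea of retrieving passages by conceptual similarity rather than shared wording, as the Semantra description puts it, can be sketched with cosine similarity over embedding vectors. This is not Semantra's implementation: the `cosine` and `search` helpers are hypothetical, and the three-dimensional vectors are hand-made stand-ins for what a real embedding model would produce.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec, passages, top_k=2):
    """Return the top_k passage texts most similar to the query vector."""
    ranked = sorted(passages, key=lambda p: cosine(query_vec, p[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]


# Made-up embeddings: the first axis loosely stands for "maritime accident",
# the last for "finance". A real system would compute these with a model.
passages = [
    ("The vessel sank near the harbor.", [0.9, 0.1, 0.0]),
    ("Quarterly revenue rose by ten percent.", [0.0, 0.2, 0.9]),
    ("A ship went down off the coast.", [0.8, 0.3, 0.1]),
]
query = [0.85, 0.2, 0.05]  # hypothetical embedding of "boat accident"

for text in search(query, passages):
    print(text)
```

Neither maritime passage contains the word "boat", yet both rank above the finance passage because their vectors point in nearly the same direction as the query.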
24

NeMo Retriever Library

Document content and metadata extraction microservice

NeMo Retriever Library is a scalable microservice framework designed for extracting, structuring, and enriching content from documents to support downstream generative AI applications. It processes various document types by splitting them into components such as text, tables, charts, and images, and then applies OCR and contextual analysis to convert them into structured data formats. The system is built on NVIDIA NIM microservices, enabling high-performance parallel processing and efficient handling of large datasets. It supports multiple extraction strategies for different document formats, balancing accuracy and throughput depending on the use case. ...

Downloads: 0 This Week

Last Update: 2026-03-18
25

WeebCentral Downloader

A powerful manga downloader for WeebCentral with both GUI and CLI

...It emphasizes performance through multi-threaded downloading, allowing multiple chapters and images to be retrieved simultaneously for faster completion. The software includes a visually distinctive GUI built with PyQt6, featuring a modern design system and interactive components for managing downloads and viewing manga information. Users can select specific chapters, adjust download speed, and configure output formats such as PDF or CBZ, making it adaptable to different reading preferences. The tool also incorporates progress tracking and background worker threads to ensure a responsive experience during large downloads. ...

Downloads: 36 This Week

Last Update: 2026-03-24