Showing 49 open source projects for "document search"

  • 1
    Papermerge

    Open Source Document Management System for Digital Archives

    Papermerge is an open source document management system (DMS) primarily designed for archiving and retrieving your digital documents. Instead of having piles of paper documents all over your desk, office or drawers, you can quickly scan them and configure your scanner to upload directly to Papermerge DMS. Store, organize and index scanned documents in PDF, JPEG and TIFF formats.
    Downloads: 15 This Week
  • 2
    PageIndex

    Document Index for Vectorless, Reasoning-based RAG

    ...This reasoning-driven retrieval aligns more naturally with how humans explore complex texts, improving relevance and traceability, especially in professional domains like financial reports, legal contracts, and technical manuals. The project includes example notebooks, scripts for tree generation and search, and support for multiple document formats including PDF and markdown, with tools designed to preserve context and semantic boundaries.
    Downloads: 3 This Week
  • 3
    TagSpaces

    TagSpaces is an offline, open source document manager with tagging

    TagSpaces is a free, no vendor lock-in, open source application for organizing, annotating and managing local files with the help of tags. It features advanced note taking functionalities and some capabilities of to-do apps. The application is available for Windows, Linux, Mac OS and Android. We provide a web clipper extension for Firefox, Edge and Chrome for easy collecting of online content in the form of local files. File and folder management - TagSpaces provides a convenient user...
    Downloads: 31 This Week
  • 4
    OWL

    Optimized Workforce Learning for General Multi-Agent Assistance

    ...Unlike single-agent systems, it treats task completion as a collaborative workforce where agents take on specialized roles (planning, execution, analysis) and coordinate via a modular multi-agent architecture that supports flexible teamwork across domains. OWL delivers state-of-the-art performance on benchmarks like GAIA and emphasizes real-time decision-making, web automation, rich search integration, document parsing, and multi-tool workflows, making it suitable for tasks ranging from information retrieval to interactive automation.
    Downloads: 1 This Week
  • 5
    bleve

    A modern text indexing library for Go

    ...Includes support for highlighting matching text within document fragments.
    Downloads: 4 This Week
  • 6
    KnowNote

    A local-first AI knowledge base & NotebookLM alternative

    KnowNote is a local-first, open-source AI knowledge base and notebook application created as an Electron-based alternative to Google NotebookLM that emphasizes privacy, control, and simplicity. It lets users build an intelligent, searchable knowledge base from uploaded documents such as PDFs, Word files, PowerPoints, and web pages, and then interact with that content using LLM-powered chat, summarization, and reasoning tools. Unlike many NotebookLM alternatives that rely on Docker or cloud...
    Downloads: 8 This Week
  • 7
    Papis

    Powerful and highly extensible command-line based document and bibliography manager

    Papis is a powerful and highly extensible CLI document and bibliography manager. With Papis, you can search your library for books and papers, add documents and notes, import and export to and from other formats, and much much more. Papis uses a human-readable and easily hackable .yaml file to store each entry's bibliographical data. It strives to be easy to use while providing a wide range of features.
    Downloads: 0 This Week
  • 8
    Docusaurus

    Easy to maintain open source documentation websites

    Docusaurus is a project that makes maintaining, building and deploying open source documentation websites incredibly easy. Simple to set up and start, Docusaurus allows you to save time and focus on your documentation. All you have to do is write docs and blog posts with Markdown and Docusaurus will handle the rest of the website build process. Docusaurus comes with pre-configured localization, as well as all the key pages and sections you need to get started. It’s also customizable, so...
    Downloads: 0 This Week
  • 9
    Raglite

    RAGLite is a Python toolkit for Retrieval-Augmented Generation

    Raglite is a lightweight framework for building Retrieval-Augmented Generation (RAG) pipelines with minimal configuration. It connects large language models to vector databases for context-aware responses, enabling developers to prototype and deploy RAG systems quickly. Raglite focuses on simplicity and modularity for fast experimentation.
    Downloads: 3 This Week
  • 10
    Derby

    MVC framework making it easy to write collaborative applications

    ...Racer supports offline usage and conflict resolution out of the box, which greatly simplifies writing multi-user applications. Derby applications load immediately and can be indexed by search engines, because the same templates render on both server and client. In addition, templates define bindings, which instantly update the view when the model changes and vice versa. Derby makes it simple to write applications that load as fast as a search engine, are as interactive as a document editor, and work offline.
    Downloads: 3 This Week
  • 11
    ccls

    C/C++/ObjC language server supporting cross references & hierarchies

    ...Saving files will incrementally update the index. Hierarchies, call (caller/callee) hierarchy, inheritance (base/derived) hierarchy, member hierarchy. Symbol rename. Document symbols and approximate search of workspace symbol. Hover information. Diagnostics and code actions (clang FixIts). Semantic highlighting and preprocessor skipped regions.
    Downloads: 13 This Week
  • 12
    XTDB

    General-purpose bitemporal database for SQL, Datalog & graph queries

    XTDB is a general-purpose bitemporal database for SQL, Datalog & graph queries. XTDB contains a perfect, immutable record of every fact your system has ever known. See the entire history of your business, everywhere. Immutable records are incomplete without time-traveling queries. XTDB allows you to query the entire timeline. Make retroactive corrections, simplify data migrations, and get clarity on out-of-order events. It is the interconnection of facts that makes them valuable. Query the...
    Downloads: 3 This Week
  • 13
    OrientDB

    DBMS supporting graph, document, full-text and geospatial models

    OrientDB is an Open Source Multi-Model NoSQL DBMS with support for Native Graphs, Documents, Full-Text search, Reactivity, Geo-Spatial and Object Oriented concepts. It's written in Java and it's amazingly fast. No expensive run-time JOINs: connections are managed as persistent pointers between records. You can traverse thousands of records in no time. Supports schema-less, schema-full and schema-mixed modes. Has a strong security profiling system based on user, roles and predicate...
    Downloads: 2 This Week
  • 14
    PdfgrepGui

    This is a simple GUI for the command line tools grep and pdfgrep

    THIS PROJECT HAS MOVED TO: https://sourceforge.net/projects/documentgrep/ This program is a GUI for the command line tools grep and pdfgrep. Pdfgrep searches text in multiple PDF files, and grep can search text in multiple text files. You can use regular expressions for the search (https://en.wikipedia.org/wiki/Regular_expression). This GUI and the command line tools work without indexing. The following options are used: -i (ignore case) and -F (fixed strings), -n (Print page number or...
    Downloads: 5 This Week
  • 15
    OmegaT - multiplatform CAT tool

    The free computer aided translation (CAT) tool for professionals

    OmegaT is a free and open source multiplatform Computer Assisted Translation tool with fuzzy matching, translation memory, keyword search, glossaries, and translation leveraging into updated projects.
    Downloads: 2,522 This Week
  • 16
    Class Viewer for Java

    Lightweight, quick reference tool for Java developers.

    Full overview of a class's public members: methods, constructors and fields, as well as its superclass and interfaces. Has free search of public methods. Can open directly to a method in JavaDocs with your preferred browser, which is set in ClassViewerConfig.xml, a file that can be easily edited with a text editor. Best run from the command line. Can also go to your own code with a designated text editor--directly to a public method if your text editor supports a line number as an argument,...
    Downloads: 11 This Week
  • 17
    DownSmith Markdown Editor

    A powerful, feature-rich Markdown editor with real-time HTML preview.

    DownSmith provides an intuitive editing experience with comprehensive formatting tools, syntax highlighting, live preview, table creation, spell checking, footnotes, HTML export, and intelligent image handling. Runs on Windows without Java installed. On macOS and Linux, it requires Java 11 or later. A Java 8 version is provided that has all the functionality of the Java 11 version except footnotes.
    Downloads: 0 This Week
  • 18
    cerberuscms2

    Cerberus Content Management System

    Cerberus Content Management System is a dynamic, secure and infinitely expandable CMS designed after a Unix-like model. It is a custom written Web Application Framework ( W.A.F. ) with a consistent and custom written Pre-Hyper-Text-Post-Processor Programming Code Framework ( P.C.F. ). This Web Application Software Project's aim is to be the fastest and most secure Web Application Framework, Web Application Programming Code Framework, Text, Voice and Video Communications Platform and Content...
    Downloads: 1 This Week
  • 19
    GitHub Cheat Sheet

    A list of cool features of Git and GitHub

    ...It collects commands, workflows, and UI shortcuts that many developers are not aware of, such as advanced uses of git log, git reflog, GitHub keyboard shortcuts, URL hacks, and useful configuration settings. The project was inspired by Zach Holman’s talks on Git and GitHub secrets and aims to turn those scattered insights into a living document. The cheat sheet is organized into sections like “GitHub Search,” “GitHub Secrets,” “GitHub Security,” “Git Tips,” and so on, so readers can focus on specific aspects of the Git/GitHub workflow. It is maintained as a Markdown README, which means it can be read directly on GitHub, printed, or incorporated into internal docs. With tens of thousands of stars, it has become a popular resource for both newcomers and experienced developers who want to level up their GitHub usage.
    Downloads: 0 This Week
  • 20
    Alphafold

    Open source code for AlphaFold

    ...The total download size for the full databases is around 415 GB and the total size when unzipped is 2.2 TB. Please make sure you have a large enough hard drive space, bandwidth and time to download. We recommend using an SSD for better genetic search performance.
    Downloads: 0 This Week
  • 21
    Common Resource Grep - crgrep

    Common Resource Grep

    CRGREP searches for matching text in databases, various document formats, archives and other difficult to access resources. A command line tool for name and content text matching in database tables, plain files, MS Office documents, PDF, archives, MP3 audio, image meta-data, scanned documents, maven dependencies and web resources. CRGREP will search resources within resources of any arbitrary combination or depth, so text within a document within a zip archive, and so on. ...
    Downloads: 7 This Week
  • 22
    rest-hapi

    A RESTful API generator for Node.js

    Customize endpoints with configuration-based features and hapi plugins. Relational structure built into NoSQL documents based on mongoose schemas. Less time with boilerplate functionality and more time building awesome APIs! rest-hapi uses mongoose schemas to generate CRUD and association REST API endpoints on a hapi server. Think LoopBack but for hapi. We love the hapi framework and its style of modularity and configuration over code. We also love writing DRY code and leveraging tools that...
    Downloads: 0 This Week
  • 23
    Elastic

    Elasticsearch client for Go

    An Elasticsearch client for the Go programming language. Elastic supports different versions of Elasticsearch. However, you must choose the version of Elastic that matches the Elasticsearch version. If you want to use stable versions of Elastic, please use Go modules for the 7.x release (or later) or a dependency manager like dep for earlier releases. Elastic has been used in production starting with Elasticsearch 0.90 up to recent 7.x versions. We recently switched to GitHub Actions for...
    Downloads: 0 This Week
  • 24
    Rank-BM25

    A Collection of BM25 Algorithms in Python

    A collection of algorithms for querying a set of documents and returning the ones most relevant to the query. The most common use case for these algorithms is, as you might have guessed, to create search engines.
    Downloads: 0 This Week
  • 25
    Paperless-ng

    A supercharged version of paperless: scan, index and archive docs

    Paperless is a simple Django application running in two parts, a Consumer (the thing that does the indexing) and a Web server (the part that lets you search & download already-indexed documents). Paper is a nightmare. Environmental issues aside, there’s no excuse for it in the 21st century. It takes up space, collects dust, doesn’t support any form of a search feature, indexing is tedious, it’s heavy and prone to damage & loss. I wrote this to make “going paperless” easier. I do not have to...
    Downloads: 0 This Week