Showing 92 open source projects for "document search engine"

View related business solutions
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • Financial reporting cloud-based software. Icon
    Financial reporting cloud-based software.

    For companies looking to automate their consolidation and financial statement function

    The software is cloud based and automates complexities around consolidating and reporting for groups with multiple year ends, currencies and ERP systems with a slice and dice approach to reporting. While retaining the structure, control and validation needed in a financial reporting tool, we’ve managed to keep things flexible.
    Learn More
  • 1
    Search-Index

    Search-Index

    A persistent, network resilient, full text search library

    Search-Index is a lightweight and fast JavaScript-based search engine that enables full-text search indexing and retrieval for web applications.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    marqo

    marqo

    Tensor search for humans

    A tensor-based search and analytics engine that seamlessly integrates with your applications, websites, and workflows. Marqo is a versatile and robust search and analytics engine that can be integrated into any website or application. Due to horizontal scalability, Marqo provides lightning-fast query times, even with millions of documents. Marqo helps you configure deep-learning models like CLIP to pull semantic meaning from images.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    Docspell

    Docspell

    Assist in organizing your piles of documents

    Docspell is a personal document organizer. Or sometimes called a "Document Management System" (DMS). You'll need a scanner to convert your papers into files. Docspell can then assist in organizing the resulting mess. It can unify your files from scanners, emails, and other sources. It is targeted for home use, i.e. families, households, and also for smaller groups/companies.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 4
    AnyTXT Searcher

    AnyTXT Searcher

    A Powerful Desktop Full-Text Search Engine, Just Like Local Google.

    AnyTXT Searcher is a powerful file full-text search engine, a desktop search application for fast document retrieval. Just like a local disk Google search engine, much faster than Windows Search, it is your ideal desktop file content full-text search engine. It has a powerful document parsing engine built in, which extracts the text of commonly used file formats without installing any other software, and combines the built-in high-speed indexing system to store the metadata of the text. ...
    Leader badge
    Downloads: 5,300 This Week
    Last Update:
    See Project
  • Reach Your Audience with Rise Vision, the #1 Cloud Digital Signage Software Solution Icon
    Reach Your Audience with Rise Vision, the #1 Cloud Digital Signage Software Solution

    K-12 Schools, Higher Education, Businesses, Restaurants

    Rise Vision is the #1 digital signage company, offering easy-to-use cloud digital signage software compatible with any player across multiple screens. Forget about static displays. Save time and boost sales with 500+ customizable content templates for your screens. If you ever need help, get free training and exceptionally fast support.
    Learn More
  • 5
    Qdrant

    Qdrant

    Vector Database for the next generation of AI applications

    Qdrant is a vector similarity engine & vector database. It deploys as an API service providing search for the nearest high-dimensional vectors. With Qdrant, embeddings or neural network encoders can be turned into full-fledged applications for matching, searching, recommending, and much more! Provides the OpenAPI v3 specification to generate a client library in almost any programming language.
    Downloads: 84 This Week
    Last Update:
    See Project
  • 6
    Weaviate

    Weaviate

    Weaviate is a cloud-native, modular, real-time vector search engine

    Weaviate in a nutshell: Weaviate is a vector search engine and vector database. Weaviate uses machine learning to vectorize and store data, and to find answers to natural language queries. With Weaviate you can also bring your custom ML models to production scale. Weaviate in detail: Weaviate is a low-latency vector search engine with out-of-the-box support for different media types (text, images, etc.).
    Downloads: 4 This Week
    Last Update:
    See Project
  • 7
    Papermerge

    Papermerge

    Open Source Document Management System for Digital Archives

    Papermerge is an open source document management system (DMS) primarily designed for archiving and retrieving your digital documents. Instead of having piles of paper documents all over your desk, office or drawers - you can quickly scan them and configure your scanner to directly upload to Papermerge DMS. Store, organize and index scanned documents in PDF, JPEG and TIFF formats.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 8
    Vespa

    Vespa

    The open big data serving engine

    Make AI-driven decisions using your data, in real-time. At any scale, with unbeatable performance. Vespa is a full-featured text search engine and supports both regular text search and fast approximate vector search (ANN). This makes it easy to create high-performing search applications at any scale, whether you want to use traditional techniques or a modern vector-based approach. You can even combine both approaches efficiently in the same query, something no other engine can do. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 9
    RAGFlow

    RAGFlow

    RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine

    RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding. It offers a streamlined RAG workflow for businesses of any scale, combining LLM (Large Language Models) to provide truthful question-answering capabilities, backed by well-founded citations from various complex formatted data.
    Downloads: 5 This Week
    Last Update:
    See Project
  • DAT Freight and Analytics - DAT Icon
    DAT Freight and Analytics - DAT

    DAT Freight and Analytics operates DAT One truckload freight marketplace

    DAT Freight & Analytics operates DAT One, North America’s largest truckload freight marketplace; DAT iQ, the industry’s leading freight data analytics service; and Trucker Tools, the leader in load visibility. Shippers, transportation brokers, carriers, news organizations, and industry analysts rely on DAT for market trends and data insights, informed by nearly 700,000 daily load posts and a database exceeding $1 trillion in freight market transactions. Founded in 1978, DAT is a business unit of Roper Technologies (Nasdaq: ROP), a constituent of the Nasdaq 100, S&P 500, and Fortune 1000. Headquartered in Beaverton, Ore., DAT continues to set the standard for innovation in the trucking and logistics industry.
    Learn More
  • 10
    Perplexica

    Perplexica

    Perplexica is an AI-powered search engine

    Perplexica is an open-source AI-powered searching tool or an AI-powered search engine that goes deep into the internet to find answers. Inspired by Perplexity AI, it's an open-source option that not just searches the web but understands your questions. It uses advanced machine learning algorithms like similarity searching and embeddings to refine results and provides clear answers with sources cited. Using SearxNG to stay current and fully open source, Perplexica ensures you always get the most up-to-date information without compromising your privacy.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 11
    MCP Server Qdrant

    MCP Server Qdrant

    An official Qdrant Model Context Protocol (MCP) server implementation

    The Qdrant MCP Server is an official Model Context Protocol server that integrates with the Qdrant vector search engine. It acts as a semantic memory layer, allowing for the storage and retrieval of vector-based data, enhancing the capabilities of AI applications requiring semantic search functionalities. ​
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    MindSearch

    MindSearch

    An LLM-based Multi-agent Framework of Web Search Engine

    MindSearch is an AI-powered search engine based on large language models (LLMs) designed for deep semantic search and retrieval. It leverages InternLM's language model to understand complex queries and retrieve highly relevant answers from large datasets.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    Vald

    Vald

    Vald. A Highly Scalable Distributed Vector Search Engine

    Vald is a highly scalable distributed fast approximate nearest neighbor dense vector search engine. Vald is designed and implemented based on the Cloud-Native architecture. It uses the fastest ANN Algorithm NGT to search for neighbors. Vald has automatic vector indexing and index backup, and horizontal scaling which is made for searching from billions of feature vector data. Vald is easy to use, feature-rich and highly customizable as you needed.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    Cherche

    Cherche

    Neural Search

    Cherche allows the creation of efficient neural search pipelines using retrievers and pre-trained language models as rankers. Cherche's main strength is its ability to build diverse and end-to-end pipelines from lexical matching, semantic matching, and collaborative filtering-based models. Cherche provides modules dedicated to summarization and question answering. These modules are compatible with Hugging Face's pre-trained models and fully integrated into neural search pipelines. Search is...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    Elastiknn

    Elastiknn

    Elasticsearch plugin for nearest neighbor search

    Elasticsearch plugin for nearest neighbor search. Store vectors and run similarity searches using exact and approximate algorithms. Methods like word2vec and convolutional neural nets can convert many data modalities (text, images, users, items, etc.) into numerical vectors, such that pairwise distance computations on the vectors correspond to semantic similarity of the original data. Elasticsearch is a ubiquitous search solution, but its support for vectors is limited. This plugin fills the...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    Elasticsearch MCP Server

    Elasticsearch MCP Server

    A Model Context Protocol (MCP) server implementation

    This MCP server implementation provides interaction capabilities with Elasticsearch and OpenSearch, enabling functionalities such as document searching, index analysis, and cluster management through a set of tools. ​
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    gse

    gse

    Go efficient multilingual NLP and text segmentation

    Go efficient multilingual NLP and text segmentation; support English, Chinese, Japanese and others. Gse is implements jieba by golang, and try add NLP support and more feature. Support common, search engine, full mode, precise mode and HMM mode multiple word segmentation modes. Support user and embed dictionary, Part-of-speech/POS tagging, analyze segment info, stop and trim words. Support multilingual: English, Chinese, Japanese and others. Support Traditional Chinese. Support HMM cut text use Viterbi algorithm. Support NLP by TensorFlow (in work). ...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 18
    bloop

    bloop

    bloop is a fast code search engine written in Rust

    Bloop is an AI-powered code search tool designed to help developers quickly find relevant code snippets, documentation, and usage examples within large repositories. It provides natural language search capabilities and AI-enhanced recommendations for improving code discovery.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    PaperAI

    PaperAI

    Semantic search and workflows for medical/scientific papers

    PaperAI is an open-source framework for searching and analyzing scientific papers, particularly useful for researchers looking to extract insights from large-scale document collections.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    Paperless-ngx

    Paperless-ngx

    A community-supported supercharged version of paperless

    Paperless-ngx is a community-supported open-source document management system that transforms your physical documents into a searchable online archive so you can keep, well, less paper.
    Downloads: 20 This Week
    Last Update:
    See Project
  • 21
    abogen

    abogen

    Generate audiobooks from EPUBs, PDFs and text with captions

    abogen is a tool designed to generate audiobooks (or speech narrations) from textual sources such as EPUBs, PDFs, or plain text, with synchronized captions. In other words, it automates the pipeline of reading a digital book (or document), converting its text into speech via a TTS engine, and packaging the result into an audiobook format — likely along with timestamped captions or subtitles that align with the spoken audio. This can be very useful for accessibility, content consumption on the go, or for users who prefer audio over reading. The repository supports handling common ebook formats and generating outputs that combine audio plus caption metadata. ...
    Downloads: 9 This Week
    Last Update:
    See Project
  • 22
    vLLM

    vLLM

    A high-throughput and memory-efficient inference and serving engine

    vLLM is a fast and easy-to-use library for LLM inference and serving. High-throughput serving with various decoding algorithms, including parallel sampling, beam search, and more.
    Downloads: 34 This Week
    Last Update:
    See Project
  • 23
    VectorDB

    VectorDB

    A Python vector database you just need, no more, no less

    ...It's a testament to effective Pythonic design without over-engineering, making it a lean yet powerful solution for all your needs. vectordb capitalizes on the powerful retrieval prowess of DocArray and the scalability, reliability, and serving capabilities of Jina. Here's the magic: DocArray serves as the engine driving vector search logic, while Jina guarantees efficient and scalable index serving. This synergy culminates in a robust, yet user-friendly vector database experience, that's vectordb for you.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 24
    Krixik

    Krixik

    Documentation for the Krixik Python client

    Small/specialized AI models are an oft-necessary complement—or alternative—to "big AI" offerings. However, infrastructure for small AI tends to be underwhelming, so building with specialized AI can be difficult, time-consuming, and even expensive. Iterating with different models, and particularly with different combinations of these models, can thus be rendered unfeasible.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    huggingface_hub

    huggingface_hub

    The official Python client for the Huggingface Hub

    The huggingface_hub library allows you to interact with the Hugging Face Hub, a platform democratizing open-source Machine Learning for creators and collaborators. Discover pre-trained models and datasets for your projects or play with the thousands of machine-learning apps hosted on the Hub. You can also create and share your own models, datasets, and demos with the community. The huggingface_hub library provides a simple way to do all these things with Python.
    Downloads: 5 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • Next