Search Results for "pdf document text search engine"

Showing 78 open source projects for "pdf document text search engine"

  • 1
    PaperQA2

    High accuracy RAG for answering questions from scientific documents

    PaperQA2 is a package for doing high-accuracy retrieval augmented generation (RAG) on PDFs or text files, with a focus on the scientific literature. See our recent 2024 paper to see examples of PaperQA2's superhuman performance in scientific tasks like question answering, summarization, and contradiction detection. In this example we take a folder of research paper PDFs, magically get their metadata - including citation counts and a retraction check, then parse and cache PDFs into a full-text...
    Downloads: 31 This Week
  • 2
    Papermerge

    Open Source Document Management System for Digital Archives

    Papermerge is an open source document management system (DMS) primarily designed for archiving and retrieving your digital documents. Instead of having piles of paper documents all over your desk, office or drawers - you can quickly scan them and configure your scanner to directly upload to Papermerge DMS. Store, organize and index scanned documents in PDF, JPEG and TIFF formats. Instantly find relevant information using full text, tags and metadata-based search. Papermerge is free and open...
    Downloads: 12 This Week
  • 3
    ripgrep

    Regex pattern directory search tool that respects your .gitignore

    ... could be PDF text extraction, less supported decompression, decrypting, automatic encoding detection and so on. In other words, use ripgrep if you like speed, filtering by default, fewer bugs and Unicode support.
    Downloads: 15 This Week
  • 4
    Teedy

    Lightweight document management system

    ...-oriented document management system, the user interface is not cluttered with buttons and menus and works both on desktop and mobile. Document searching has never been easier thanks to the powerful full-text search engine in Teedy. You can search in images (embedded OCR), DOCX, ODT, TXT, PDF, and more. Verify or validate your documents with people of your organization using workflows.
    Downloads: 3 This Week
  • 5
    SILE

    The SILE Typesetter — Simon’s Improved Layout Engine

    SILE is a typesetting system; its job is to produce beautiful printed documents. Conceptually, SILE is similar to TeX—from which it borrows some concepts and even syntax and algorithms—but the similarities end there. Rather than being a derivative of the TeX family SILE is a new typesetting and layout engine written from the ground up using modern technologies and borrowing some ideas from graphical systems such as InDesign.
    Downloads: 6 This Week
  • 6
    Paperless-ngx

    A community-supported supercharged version of paperless

    Paperless-ngx is a community-supported open-source document management system that transforms your physical documents into a searchable online archive so you can keep, well, less paper.
    Downloads: 3 This Week
  • 7
    marqo

    Tensor search for humans

    A tensor-based search and analytics engine that seamlessly integrates with your applications, websites, and workflows. Marqo is a versatile and robust search and analytics engine that can be integrated into any website or application. Due to horizontal scalability, Marqo provides lightning-fast query times, even with millions of documents. Marqo helps you configure deep-learning models like CLIP to pull semantic meaning from images. It can seamlessly handle image-to-image, image-to-text...
    Downloads: 1 This Week
  • 8
    ChatGPT Academic

    ChatGPT extension for scientific research work

    ChatGPT extension for scientific research work, with a specially optimized academic paper polishing experience. It supports custom shortcut buttons, custom function plug-ins, markdown table display, double display of TeX formulas, and complete code display, and adds a local Python/C++/Go project tree analysis function, project source code self-translation, a PDF and Word document batch summary function, and a PDF paper full-text translation function. All buttons are dynamically...
    Downloads: 2 This Week
  • 9
    rqlite

    The lightweight, distributed relational database built on SQLite

    rqlite is an easy-to-use, lightweight, distributed relational database, which uses SQLite as its storage engine. rqlite is simple to deploy, operating it is very straightforward, and its clustering capabilities provide you with fault-tolerance and high availability. rqlite is available for Linux, macOS, and Microsoft Windows. rqlite gives you the functionality of a rock solid, fault-tolerant, replicated relational database, but with very easy installation, deployment, and operation...
    Downloads: 2 This Week
  • 10

    Laila.Pdf

    A .NET6 WPF Pdfium-based viewer control and printer object.

    A .NET6 Pdfium-based viewer featuring smooth scrolling, text selecting and copying, search and basic PDF forms support and a .NET6 PDF printer. Written using an extended version of PDFiumSharp to which I added PDF forms support. Installation 1) Get the package from NuGet. 2) Get the package PDFium.Windows (Windows 7 compatible) or PDFium.WindowsV2 (PDF forms support) from NuGet. 3) Place the control on your WPF form. 4) Set or bind the Document property to the bytes...
    Downloads: 0 This Week
  • 11
    AnyTXT Searcher

    A Powerful Desktop Full-Text Search Engine, Just Like Local Google.

    AnyTXT Searcher is a powerful file full-text search engine, a desktop search application for fast document retrieval. Just like a local disk Google search engine, much faster than Windows Search, it is your ideal desktop file content full-text search engine. It has a powerful document parsing engine built in, which extracts the text of commonly used file formats without installing any other software, and combines the built-in high-speed indexing system to store the metadata of the text...
    Downloads: 2,150 This Week
  • 12
    OmegaT - multiplatform CAT tool

    The free computer aided translation (CAT) tool for professionals

    OmegaT is a free and open source multiplatform Computer Assisted Translation tool with fuzzy matching, translation memory, keyword search, glossaries, and translation leveraging into updated projects.
    Downloads: 1,863 This Week
  • 13
    Bible SuperSearch

    Web-Based Bible Search Engine

    Advanced web-based Bible passage lookup and search script. Search entire Bible or only within the specified reference(s). Keeps users on your website and does not send them to our website. Install our API and run completely on your website! Full Boolean search with parentheses, proximity search. Look up multiple passages, parallel Bibles. Bible downloads also available: * PDF * Plain Text * MySQL dumps * JSON * SQLite https://www.biblesupersearch.com http...
    Downloads: 25 This Week
  • 14
    Kiwix

    Wikipedia offline & more

    Kiwix is an offline reader for Web content. It's especially intended to make Wikipedia available offline. With Kiwix, you can enjoy Wikipedia on a boat, in the middle of nowhere... or in jail. Kiwix manages to do that by reading ZIM files, a highly compressed open format with additional metadata.
    Downloads: 39 This Week
  • 15
    FastReport Open Source

    Free Open Source Reporting tool for .NET

    Free Open Source Reporting tool for .NET Core/.NET Framework that helps your application generate document-like reports.
    Downloads: 34 This Week
  • 16
    PdfgrepGui

    This is a simple GUI for the command line tool grep and pdfgrep

    This program is a GUI for the command-line tools grep and pdfgrep. Pdfgrep searches text in multiple PDF files, and grep can search text in multiple text files. You can use regular expressions for the search (https://en.wikipedia.org/wiki/Regular_expression). This GUI and the command-line tools work without indexing. The following options are used: -i (ignore case) and -F (fixed strings), -n (Print page number or output lines) and -H (Print the file name for each match) from the command line...
    Downloads: 9 This Week
  • 17
    BlueSpice free (Support archive)

    Our support forum has moved: community.bluespice.com

    This freely available open-source software turns Wikipedia’s popular software engine MediaWiki into a fully-fledged enterprise wiki solution. Companies can continue cherishing MediaWiki’s numerous advantages and automation capabilities; with BlueSpice, they can now work even more comfortably, safely and effectively. Compared with basic MediaWiki, BlueSpice provides, among others, the following enhancements: comfortable and sophisticated rights management capabilities, a visual editor...
    Downloads: 23 This Week
  • 18
    TextSeek

    Professional full-text desktop search tool

    TextSeek is a professional full-text desktop search tool. Unlike filename search tools such as Everything and Listary, TextSeek can search both filenames and file content easily and quickly. It supports PDF, Word, Excel, PowerPoint, RTF and other formats. The software runs directly; no extra packages need to be installed.
    Downloads: 3 This Week
  • 19
    DocSearcher
    DocSearcher is a search tool for indexing and searching files on a personal computer. It uses APIs to provide search functionality for common document formats, currently Word, Excel, PDF, Libre/Open/StarOffice, RTF, Text, and HTML.
    Downloads: 3 This Week
  • 20
    Common Resource Grep - crgrep

    Common Resource Grep

    CRGREP searches for matching text in databases, various document formats, archives and other difficult to access resources. A command line tool for name and content text matching in database tables, plain files, MS Office documents, PDF, archives, MP3 audio, image meta-data, scanned documents, maven dependencies and web resources. CRGREP will search resources within resources of any arbitrary combination or depth, so text within a document within a zip archive, and so on. Here you...
    Downloads: 2 This Week
  • 21
    Delphi : VRCalc++ OOSL (Script) and more

    Delphi : VRCalc++ OOSL & + (Paged List, TextEditor, VRAstroVision ...)

    Vincent Radio {Adrix.NT} Sources Library & Applications : Delphi C++ Java VRCalc++ C# VRCalc++ Object Oriented Scripting Language - Engine Source Pascal Code - Delphi Packages Build Prjs - VRCalc++ Scripted System Std RT Library - Guides & Docs (CHM, PDF, DOCX) - VCL & FMX (FireMonkey) Support - Script Test Code (Lang RTL VCL FMX) - Visual Stage Project : VCL & FMX Paged Lists & Iterators : Delphi C++ Java C# Multi-Dim Arrays & Direct Graph Classes : Delphi C++ Java VRCalc++ C...
    Downloads: 6 This Week
  • 22
    FUDforum
    FUDforum is a super fast and scalable discussion forum. It is highly customizable and supports unlimited members, forums, posts, topics, polls & attachments. It can import XML Feeds and sync with USENET groups and Mailing Lists (bi-directional).
    Downloads: 4 This Week
  • 23
    Delphi : VRCalc++ and more Binary Exec

    Delphi Java - VRCalc++ OOSL (Script) and + (Binary Exec Distro)

    Vincent Radio {Adrix.NT} Embarcadero : Delphi : Executable Binaries Delphi : VRCalc++ Object Oriented Scripting Language : Engine + Ext Libraries VRCalc++ OOSL Visual Stage Project : VCL & FMX (FireMonkey) VRCalc++ Script Executor: - VCL Console - Terminal Console - FMX Console + VRCalc++ OOSL : VR System Scripted Standard Runtime Library Delphi Applics - VR Multi Editor : Smart Text Editor - VR Lazy Code Editor : Smart RTF Multi Lang Code Text Editor - VR Astro Vision...
    Downloads: 2 This Week
  • 24

    uweb browser: unlimited power

    minimal suckless android web browser with unlimited power

    - Powerful: HTML5 enhancement; any URLs to host a website; JavaScript and shell scripting for general processing; and more with Termux. - Customizable: user-defined menus, (new) buttons and gestures for user agents, bookmarklets, URL services, shell commands, internal functionality links and text processing etc. - Convenient: any book (pdf/djvu)/dictionary (mdict)/txt/command line/app/webapp (web extensions) can be a search engine. - Tiny: less than 200k - Fast: runs fast, even with thousands...
    Downloads: 2 This Week
  • 25
    Paperless-ng

    A supercharged version of paperless, scan, index and archive docs

    Paperless is a simple Django application running in two parts, a Consumer (the thing that does the indexing) and a Web server (the part that lets you search & download already-indexed documents). Paper is a nightmare. Environmental issues aside, there’s no excuse for it in the 21st century. It takes up space, collects dust, doesn’t support any form of a search feature, indexing is tedious, it’s heavy and prone to damage & loss. I wrote this to make “going paperless” easier. I do not have...
    Downloads: 1 This Week