Showing 32 open source projects for "pdf indexing"

  • 1
    pdf-extractor

    Node.js module for rendering PDF pages to images, SVGs and HTML files

    ... is extracted to a text file for different usages (e.g. indexing the text). This library is, in its most basic form, a Node.js wrapper for pdf.js. It has default renderers to generate a default output, but is easily extended to incorporate custom logic or to generate different output. It uses a Node.js DOM and the node domstub from pdf.js to make PDF parsing available on Node.js without a browser.
    Downloads: 4 This Week
  • 2
    DB-GPT

    Revolutionizing Database Interactions with Private LLM Technology

    DB-GPT is an experimental open-source project that uses localized GPT large models to interact with your data and environment. With this solution, you can be assured that there is no risk of data leakage, and your data is 100% private and secure.
    Downloads: 2 This Week
  • 3
    AnyTXT Searcher

    A Powerful Desktop Full-Text Search Engine, Just Like Local Google.

    AnyTXT Searcher is a powerful file full-text search engine, a desktop search application for fast document retrieval. Just like a local disk Google search engine, much faster than Windows Search, it is your ideal desktop file content full-text search engine. It has a powerful document parsing engine built in, which extracts the text of commonly used file formats without installing any other software, and combines the built-in high-speed indexing system to store the metadata of the text...
    Downloads: 2,493 This Week
  • 4
    OpenKM Document Management - DMS

    Document Management System and Content Management System

    ... technological architecture design, OpenKM meets the document management needs of businesses of all sizes (from SMEs to big corporations). Thanks to its elegant and intuitive interface, OpenKM transforms complex operations into easy tasks. The most relevant function of OpenKM is the indexing of the most common file types: text, Office, Office 2007, OpenOffice, PDF, HTML, XML, MP3, JPEG, etc. For a complete feature list, take a look at http://goo.gl/au8cQy
    Downloads: 803 This Week
  • 5
    myFilterWheel ASCOM DIY

    Modify a manual filterwheel and add stepper motor and Arduino

    A project by Clive Stachon, Pete I, Paul P and Robert Brown in modifying a manual 5 slot filter wheel to automatic using an Arduino Nano and stepper motor. Windows application, ASCOM driver and Arduino firmware provided. Updated, reflecting new PDF and firmware and applications based on contributions from Pete. Project supports 4, 5, 7 and 9 slot filterwheels.
    Downloads: 29 This Week
  • 6
    Hypernomicon

    Hypertext-infused philosophy personal database software

    Hypernomicon is a personal productivity/database application for researchers that combines structured note-taking, mind-mapping, management of files (e.g., PDFs) and folders, and reference management into an integrated environment that organizes all of the above into semantic networks or hierarchies in terms of debates, positions, arguments, labels, terminology/concepts, and user-defined keywords by means of database relations and automatically generated hyperlinks (hence ‘Hyper’ in the...
    Downloads: 20 This Week
  • 7
    File System Crawler for Elasticsearch

    Elasticsearch File System Crawler (FS Crawler)

    This crawler helps to index binary documents such as PDF, Open Office, and MS Office files. It crawls a local file system (or a mounted drive), indexing new files, updating existing ones, and removing old ones. It can also crawl remote file systems over SSH/FTP, and it provides a REST interface to let you “upload” your binary documents to Elasticsearch.
    Downloads: 2 This Week
  • 8
    DocSearcher
    DocSearcher is a search tool for indexing and searching files on a personal computer. It uses APIs to provide search functionality for common document formats, currently: Word, Excel, PDF, Libre/Open/StarOffice, RTF, Text, and HTML.
    Downloads: 6 This Week
  • 9
    Paperless-ng

    A supercharged version of paperless, scan, index and archive docs

    Paperless is a simple Django application running in two parts, a Consumer (the thing that does the indexing) and a Web server (the part that lets you search & download already-indexed documents). Paper is a nightmare. Environmental issues aside, there’s no excuse for it in the 21st century. It takes up space, collects dust, doesn’t support any form of a search feature, indexing is tedious, it’s heavy and prone to damage & loss. I wrote this to make “going paperless” easier. I do not have...
    Downloads: 2 This Week
  • 10
    OpenSearchServer Search Engine

    An open source search engine with RESTful API and crawlers

    OpenSearchServer is a powerful, enterprise-class search engine program. Using the web user interface, the crawlers (web, file, database, etc.) and the client libraries (REST/API, Ruby, Rails, Node.js, PHP, Perl), you will be able to quickly and easily integrate advanced full-text search capabilities into your application: full-text with basic semantic, join queries, boolean queries, facet and filter, document (PDF, Office, etc.) indexing, web scraping, etc. OpenSearchServer runs on Windows...
    Downloads: 61 This Week
  • 11

    Object Oriented Streetmap

    C# class library for processing OpenStreetMap data

    This is a class library written in C# for processing OpenStreetMap XML file extracts into a SQLite database for routing with different vehicle types and restrictions. Before rating or contributing please see the README file for a more complete summary and a list of todos.
    Downloads: 0 This Week
  • 12
    Marcion

    The study environment of ancient languages (Coptic, Greek, Latin)

    Marcion is software that forms a study environment for ancient languages (esp. Coptic, Greek, Latin) and provides many tools and resources (dictionaries, grammars, texts). Although Marcion is focused on the study of Gnosticism and early Christianity, it is a universal library that works with various file formats and lets you collect, organize and back up texts of any kind. Overview of Gnostic sources in the Coptic language delivered with Marcion: Nag Hammadi Library; Berlin Codex; Codex Tchacos...
    Downloads: 31 This Week
  • 13
    IndexFile (IFile)

    IFile, PHP based framework for indexing and search in the documents

    ... (.ods); Adobe Portable Document Format (.pdf); Text file (.txt); Web page (.htm - .html)
    Downloads: 2 This Week
  • 14

    eLibrary

    Personalized Search Engine for Commonly Used Files

    eLibrary (electric library) is a Java program for searching files and folders in an OS file system. It differs from general OS file search engines in that it personalizes the indexing setup, so that users can choose which directories to index or remove from an existing index, and it can also suggest queries just like Google's "Did you mean" feature. The customization of indexing and query suggestion greatly improves search speed and makes the user experience more comfortable. eLibrary can also extract...
    Downloads: 0 This Week
  • 15

    Personalized Search Engine

    Personalized Search Engine for Your Files

    MySearchEngine (Personalized Search Engine) is a Java program for searching files and folders in an OS file system. It differs from general OS file search engines in that it personalizes the indexing setup, so that users can choose which directories to index or remove from an existing index, and it can also suggest queries just like Google's "Did you mean" feature. The customization of indexing and query suggestion greatly improves search speed and makes the user experience more comfortable. eLibrary can...
    Downloads: 0 This Week
  • 16
    Omega Base

    Web-based knowledge base template.

    A Knowledge Base and document management system (DMS). With strong user management, security, and file indexing for search.
    Downloads: 0 This Week
  • 17
    IDRA (InDexing and Retrieving Automatically) is a tool that supports indexing a wide range of text files (TXT, DOC, PDF) and image annotation files (XML), query-based searching, visualizing an index, saving it for reuse, evaluation, etc.
    Downloads: 0 This Week
  • 18
    Regain is a Java search engine based on Jakarta Lucene. It provides indexing and searching of files in plenty of formats (HTML, XML, doc(x), xls(x), ppt(x), oo, PDF, RTF, mp3, mp4, Java). A TagLibrary eases integrating search results into your JSP-based web page.
    Downloads: 4 This Week
  • 19
    HAWK - PDF Text Search Java Project

    No more support for this project - TAKE A LOOK AT FALCONSEARCH

    No more support for this project - TAKE A LOOK AT FALCONSEARCH "https://sourceforge.net/projects/falcontextsearch/"
    Downloads: 1 This Week
  • 20

    PatentX - EPOScan extra utilities

    EPOScan ext folder utilities

    This software performs some operations on the "ext" folder created by EPOScan (European Patent Office software for indexing and scanning patent document images) when the downloading option is selected. This folder is usually used by the ST33 software to convert the indexed images into the ST33 standard.
    Downloads: 0 This Week
  • 21
    Qercus
    Desktop free-form text database in which each record may contain an arbitrary collection of fields. Each field and record has its own style and colour. Efficient text searching - text is indexed as it is entered. Inspired by Blackwell Idealist.
    Downloads: 0 This Week
  • 22

    edocias

    Electronic Document Index And Search

    EDocIAS (Electronic Document Index And Search) is a PHP-based tool for indexing and searching files of various types. Third-party tools (tesseract, xpdf, etc.) can be configured to support any type of file.
    Downloads: 0 This Week
  • 23
    ANts P2P
    ANts P2P implements a third-generation P2P network. It protects your privacy while you are connected and makes you untraceable, hiding your identity (IP) and encrypting everything you send to or receive from others.
    Downloads: 13 This Week
  • 24
    Blaze - Appliance for Solr
    Indexing and Search Appliance Powered by Apache Solr. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document (e.g., Word, PDF) handling.
    Downloads: 1 This Week
  • 25
    The FSSearchIndex Framework project provides a framework that allows application developers to write their own content-based file search and indexing applications. It currently supports content extraction and indexing for Text, Word, Excel, and PDF files.
    Downloads: 0 This Week