Showing 34 open source projects for "indexing documents"

  • 1
    bleve

    A modern text indexing library for Go

    Import one package, build an index with three lines of code, query for documents with another three lines. Bleve includes general-purpose analyzers as well as pre-built text analyzers for the following languages: Danish, Dutch, English, Finnish, French, German, Hungarian, Italian, Norwegian, Persian, Portuguese, Romanian, Russian, Sorani, Spanish, Swedish, Thai, and Turkish. It supports aggregating facet information across search results. Supported facet types include Terms Facet, Numeric Range...
    Downloads: 2 This Week
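
The facet aggregation mentioned in the bleve entry can be pictured with a small sketch. This is plain Python written for illustration, not bleve's Go API; the sample hits, the field names `type` and `size_kb`, and both helper functions are assumptions made up here.

```python
from collections import Counter

# Hypothetical search hits; the field names are invented for this illustration.
hits = [
    {"id": "a", "type": "pdf", "size_kb": 120},
    {"id": "b", "type": "html", "size_kb": 15},
    {"id": "c", "type": "pdf", "size_kb": 900},
    {"id": "d", "type": "txt", "size_kb": 3},
]


def terms_facet(results, field):
    """Count how many results carry each distinct value of `field`."""
    return Counter(r[field] for r in results if field in r)


def numeric_range_facet(results, field, ranges):
    """Count results per named half-open range [low, high); None means unbounded."""
    counts = {name: 0 for name, _, _ in ranges}
    for r in results:
        value = r.get(field)
        if value is None:
            continue
        for name, low, high in ranges:
            if (low is None or value >= low) and (high is None or value < high):
                counts[name] += 1
    return counts


type_counts = terms_facet(hits, "type")
size_counts = numeric_range_facet(
    hits, "size_kb", [("small", None, 100), ("large", 100, None)]
)
```

A terms facet groups hits by the distinct values of a field, while a numeric range facet groups them into caller-defined buckets; both are computed over the result set rather than over the whole index.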
  • 2
    Morphia

    MongoDB object-document mapper in Java

    MongoDB Object Document Mapping for the JVM. Bidirectional mapping to and from the database. Transparently map your Java entities to MongoDB documents and back.
    Downloads: 0 This Week
  • 3
    txtai

    Build AI-powered semantic search applications

    ..., models can understand concepts in documents, audio, images and more. Machine-learning pipelines to run extractive question-answering, zero-shot labeling, transcription, translation, summarization and text extraction. Cloud-native architecture that scales out with container orchestration systems (e.g. Kubernetes). Applications range from similarity search to complex NLP-driven data extractions to generate structured databases. The following applications are powered by txtai.
    Downloads: 0 This Week
  • 4
    OpenKM Document Management - DMS

    Document Management System and Content Management System

    ... technological architecture design, OpenKM meets the document management needs of businesses of all sizes (from SMEs to big corporations). Thanks to its elegant and intuitive interface, OpenKM transforms complex operations into easy tasks. One of the most relevant functions of OpenKM is the indexing of the most common file types: text, Office, Office 2007, OpenOffice, PDF, HTML, XML, MP3, JPEG, etc. For a complete feature list, take a look at http://goo.gl/au8cQy
    Downloads: 663 This Week
  • 5
    PdfgrepGui

    A simple GUI for the command-line tools grep and pdfgrep

    This program is a GUI for the command-line tools grep and pdfgrep. pdfgrep searches text in multiple PDF files, and grep can search text in multiple text files. You can use regular expressions for the search (https://en.wikipedia.org/wiki/Regular_expression). This GUI and the command-line tools work without indexing. The following options are used: -i (ignore case), -F (fixed strings), -n (print page numbers or line numbers), and -H (print the file name for each match) from the command line...
    Downloads: 17 This Week
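
To make the option list in the PdfgrepGui entry concrete, here is a rough pure-Python imitation of what a search with -i, -F, -n and -H reports for text files. It is a sketch of the behaviour only, not PdfgrepGui's code, and it does not call grep or pdfgrep; the function name and the in-memory `files` mapping are assumptions for illustration.

```python
def fixed_string_search(files, needle, ignore_case=True):
    """Return 'name:lineno:line' entries for lines that contain `needle` literally.

    `files` maps a file name to its text. The needle is never treated as a
    regular expression (like -F); matching ignores case by default (like -i);
    each hit carries the file name (like -H) and the line number (like -n).
    """
    target = needle.casefold() if ignore_case else needle
    results = []
    for name, text in files.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            haystack = line.casefold() if ignore_case else line
            if target in haystack:
                results.append(f"{name}:{lineno}:{line}")
    return results


# Hypothetical file contents used only for this example.
files = {
    "notes.txt": "Index all files\nnothing here\nre-INDEX nightly",
    "todo.txt": "a.b matches only literally\naxb",
}

index_hits = fixed_string_search(files, "index")
dot_hits = fixed_string_search(files, "a.b")
```

Because the needle is a fixed string, `a.b` matches only the literal text and not `axb`, which a regular expression with an unescaped dot would also match.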
  • 6

    xsd2pgschema

    Relational database replication tool based on XML Schema

    xsd2pgschema is a Java application suite that converts XML Schema 1.1 (hierarchical data model) to PostgreSQL DDL (relational data model) and supports XML data migration into PostgreSQL based on the XML Schema without loss of information content. It also supports full-text indexing via either Apache Lucene or Sphinx Search utilizing the relational data model. File conversion from XML to CSV, TSV, or JSON is possible as well as mapping XML Schema to JSON Schema. Obtained PostgreSQL database...
    Downloads: 4 This Week
  • 7
    File System Crawler for Elasticsearch

    Elasticsearch File System Crawler (FS Crawler)

    This crawler helps to index binary documents such as PDF, Open Office, and MS Office files. It crawls a local file system (or a mounted drive), indexing new files, updating existing ones, and removing old ones. It can also crawl a remote file system over SSH/FTP, and it offers a REST interface that lets you “upload” your binary documents to Elasticsearch.
    Downloads: 0 This Week
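
The add, update and remove cycle described in the FS Crawler entry can be sketched as a comparison between what is indexed and what is on disk. The sketch assumes changes are detected by comparing modification times, which is an illustrative choice and not necessarily how FS Crawler itself works; `plan_sync` and the sample data are hypothetical.

```python
def plan_sync(indexed, on_disk):
    """Compare two {path: mtime} maps and return (to_add, to_update, to_remove)."""
    # Files seen on disk that the index does not know yet.
    to_add = sorted(p for p in on_disk if p not in indexed)
    # Files known to the index whose on-disk copy is newer.
    to_update = sorted(
        p for p in on_disk if p in indexed and on_disk[p] > indexed[p]
    )
    # Files still in the index that no longer exist on disk.
    to_remove = sorted(p for p in indexed if p not in on_disk)
    return to_add, to_update, to_remove


# Hypothetical state: modification times are plain integers for readability.
indexed = {"a.pdf": 100, "b.docx": 100, "old.odt": 50}
on_disk = {"a.pdf": 100, "b.docx": 170, "new.pdf": 180}

plan = plan_sync(indexed, on_disk)
```

Here `new.pdf` would be indexed, `b.docx` re-indexed, and `old.odt` dropped from the index, while the unchanged `a.pdf` is left alone.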
  • 8
    Paperless-ng

    A supercharged version of paperless: scan, index and archive docs

    Paperless is a simple Django application running in two parts, a Consumer (the thing that does the indexing) and a Web server (the part that lets you search & download already-indexed documents). Paper is a nightmare. Environmental issues aside, there’s no excuse for it in the 21st century. It takes up space, collects dust, doesn’t support any form of a search feature, indexing is tedious, it’s heavy and prone to damage & loss. I wrote this to make “going paperless” easier. I do not have...
    Downloads: 1 This Week
  • 9
    filofant is an archiving and indexing server for e-mails, attachments and other documents stored in various locations in your company. The indexed documents are accessible through a customizable web frontend, much like an internet search engine.
    Downloads: 0 This Week
  • 10
    Records Management System

    Save a digital copy of your personal or business records

    Your personal and business records are considered private documents. You should avoid using cloud providers, even Google Drive! Records Management System is a local data store using SQLite and integrates with any connected scanner using the TWAIN toolkit (a license may be needed?). RMS is based on the book Filing Systems and Records Management (College series), 3rd Edition, available on Amazon at https://www.amazon.com/dp/0070614717
    Downloads: 7 This Week
  • 11
    json-rust

    JSON implementation in Rust

    Parse and serialize JSON with ease. JSON is a very loose format where anything goes - arrays can hold mixed types, object keys can change types between API calls or not include some keys under some conditions. Mapping that to idiomatic Rust structs introduces friction.
    Downloads: 0 This Week
  • 12
    OpenSearchServer Search Engine

    An open source search engine with RESTful API and crawlers

    OpenSearchServer is a powerful, enterprise-class search engine program. Using the web user interface, the crawlers (web, file, database, etc.) and the client libraries (REST/API, Ruby, Rails, Node.js, PHP, Perl), you will be able to quickly and easily integrate advanced full-text search capabilities into your application: full-text search with basic semantics, join queries, boolean queries, facets and filters, document (PDF, Office, etc.) indexing, web scraping, etc. OpenSearchServer runs on...
    Downloads: 18 This Week
  • 13

    Indexmeister

    Automatic indexing for large LaTeX documents

    Indexmeister reads a variety of formats (.tex, .docx, .epub, and others) and suggests keywords for indexing. The included program Imbrowse provides a semi-automatic interface to rapidly add index tags to multi-file LaTeX documents.
    Downloads: 1 This Week
  • 14
    SummitDB

    In-memory NoSQL database with ACID transactions, Raft consensus, etc.

    SummitDB is an in-memory, NoSQL key/value database. It persists to disk, uses the Raft consensus algorithm, is ACID compliant, and is built on a transactional and strongly-consistent model. It supports custom indexes, geospatial data, JSON documents, and user-defined JS scripting. The easiest way to get SummitDB is to use one of the pre-built release binaries which are available for OSX, Linux, and Windows. SummitDB can be compiled and used on Linux, OSX, Windows, FreeBSD, ARM (Raspberry PI...
    Downloads: 0 This Week
  • 15
    IndexFile (IFile)

    IFile, a PHP-based framework for indexing and searching documents

    Index documents using the Lucene Search Engine or MySQL Full-Text. IFile supports many types of documents: Rich Text Format (.rtf); Moving Picture Expert Group-1/2 Audio Layer 3 (.mp3); Joint Photographic Experts Group (.jpg - .jpeg); Tagged Image File Format (.tiff); Microsoft Word 97-2000 (.doc); Microsoft Word 2003-2007 (.docx); Microsoft Excel 97-2000 (.xls); Microsoft Excel 2003-2007 (.xlsx); Microsoft PowerPoint 2003-2007 (.pptx); OpenOffice.org Writer (.odt); OpenOffice.org Calc...
    Downloads: 1 This Week
  • 16

    Arabic Desktop Search Engine

    Desktop search engine

    This is a desktop search engine that targets Arabic and can also work with other languages. The application uses Lucene.NET for indexing and searching HTML documents and was developed with Visual Studio 2013. http://www.mediafire.com/download/p3lcez1h93pcpd8/ArDesktopSearch_SourceCode.7z The application strips Arabic diacritics when indexing HTML files and can highlight matched text, with or without diacritics, using the EasyMark highlighter JavaScript plugin...
    Downloads: 0 This Week
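
Stripping diacritics before indexing, as the Arabic Desktop Search Engine entry describes, can be sketched in a few lines. This is an illustration in Python rather than the project's Lucene.NET code; it assumes that dropping every nonspacing combining mark (Unicode category Mn) is an acceptable approximation of removing Arabic diacritics, and the function name is hypothetical.

```python
import unicodedata


def strip_diacritics(text):
    """Drop nonspacing combining marks (category Mn), which include Arabic harakat.

    The text is deliberately not decomposed first, so letters such as alef
    with hamza keep their precomposed form.
    """
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


# A vocalized word written with escapes: meem, damma, hah, fatha, meem,
# shadda, fatha, dal.
vocalized = "\u0645\u064F\u062D\u064E\u0645\u0651\u064E\u062F"
plain = strip_diacritics(vocalized)
```

Applying the same function to both indexed text and user queries lets a query typed without diacritics match text stored with them, and the other way round.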
  • 17
    COAR-DMS

    DMS for Linux: C++ library, server, web UI, SOAP

    COAR-DMS is a document management system for 32/64-bit Linux. It acts as a library, server and tools. Library features: - storage management, free pages recycling - transaction log - indexing: full text, tags, metadata, document attributes - inverted index - versioning, collaboration - document trees, tree versioning - folders - plugins for auth (PAM, LDAP), db, file type plugins - tags - metadata (key-value pairs) - object-level security, folder and document ACLs - Unix-like security (rwx...
    Downloads: 0 This Week
  • 18
    The Arabic corpus has been developed as part of a research project named "A New Approach of Semi-Indexing of Text Documents". This corpus consists of more than 460 Arabic books. The Arabic corpus can be used for the development of language engineering applications, information retrieval and information extraction. The total corpus size is 137 MB. It contains 23,264,785 words and more than 128,584,458 letters.
    Downloads: 0 This Week
  • 19
    Constellio Enterprise Search engine

    Open source Search Engine and Enterprise Search

    Constellio is an enterprise search engine that allows companies to search all their organization's information through a single interface (Web, CRM, ERP, ECM, Mail etc.). Constellio is based on Apache Solr and Google Search Appliance's connector. Constellio has a powerful web crawler.
    Downloads: 0 This Week
  • 20

    Restad

    Relational storage for tagged documents

    Restad is an indexing and querying tool for tagged documents. It uses a relational database for storage and querying. See the latest news on the blog: https://sourceforge.net/p/restad/blog/ The first Ruby prototype can be found here: https://github.com/ymoreau/Restad
    Downloads: 0 This Week
  • 21
    Personal Document Manager DMS
    Personal Document Manager is a tool for storing, indexing and finding documents. In short: a simple, small Document Management System. The user guide is on the project webpage. Currently I am looking for people willing to join the project. Needed are: Java developers, testers, and help writers. Developers working on Windows are especially needed to synchronise some window management things which are different on Unix-based systems and Windows. And of course... working alone is boring ^^
    Downloads: 0 This Week
  • 22
    Linoratix Intranet Search (lIntraSearch) is a desktop-search-like application that indexes the content of many office and other files in a network, so you can search, download and modify all the documents in your network from your computer.
    Downloads: 0 This Week
  • 23

    edocias

    Electronic Document Index And Search

    EDocIAS (Electronic Document Index And Search) is a PHP-based tool for indexing and searching files of various types. Third-party tools (tesseract, xpdf, etc.) can be configured to support any type of file.
    Downloads: 0 This Week
  • 24
    Sukija is a program for indexing text documents written in Finnish. This project has been moved to GitHub: https://github.com/ahomansikka/sukija
    Downloads: 0 This Week
  • 25
    Malaga-fi is a Nutch plugin for indexing documents written in Finnish. It analyses words morphologically and indexes only the base forms (that you find in dictionaries) so that you find all inflections of a word by just searching for the base form.
    Downloads: 0 This Week
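
The base-form indexing described in the Malaga-fi entry can be illustrated with a toy inverted index. The small lemma table below is a hypothetical stand-in for real morphological analysis; it is not Malaga's grammar or Nutch's plugin API, and the function names are made up for this sketch.

```python
from collections import defaultdict

# Toy lemma table standing in for morphological analysis (hypothetical data).
LEMMAS = {
    "talo": "talo", "talon": "talo", "talossa": "talo", "talot": "talo",
    "kissa": "kissa", "kissan": "kissa", "kissat": "kissa",
}


def base_form(word):
    """Map an inflected word to its dictionary form; unknown words pass through."""
    word = word.lower()
    return LEMMAS.get(word, word)


def build_index(docs):
    """Build an inverted index keyed by base form: lemma -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.split():
            index[base_form(word)].add(doc_id)
    return index


docs = {
    1: "Kissa asuu talossa",
    2: "Talot ovat vanhoja",
    3: "Kissat nukkuvat",
}

index = build_index(docs)
matches = sorted(index[base_form("talo")])
```

Searching for the base form `talo` finds documents 1 and 2 even though they contain only the inflected forms `talossa` and `talot`, because only base forms are stored as index keys.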