Showing 28 open source projects for "indexing documents"

  • 1

    Morphia

    MongoDB object-document mapper in Java

    MongoDB Object Document Mapping for the JVM. Bidirectional mapping to and from the database. Transparently map your Java entities to MongoDB documents and back.
    Downloads: 1 This Week
  • 2

    bleve

    A modern text indexing library for Go

    Import one package, build an index with three lines of code, query for documents with another three lines. Bleve includes general-purpose analyzers as well as pre-built text analyzers for the following languages: Danish, Dutch, English, Finnish, French, German, Hungarian, Italian, Norwegian, Persian, Portuguese, Romanian, Russian, Sorani, Spanish, Swedish, Thai, and Turkish. It supports aggregating facet information across search results. Supported facet types include Terms Facet, Numeric Range...
    Downloads: 0 This Week
  • 3

    Haystack

    Haystack is an open source NLP framework to interact with your data

    Apply the latest NLP technology to your own data with the use of Haystack's pipeline architecture. Implement production-ready semantic search, question answering, summarization and document ranking for a wide range of NLP applications. Evaluate components and fine-tune models. Ask questions in natural language and find granular answers in your documents using the latest QA models with the help of Haystack pipelines. Perform semantic search and retrieve ranked documents according to meaning...
    Downloads: 0 This Week
  • 4

    txtai

    Build AI-powered semantic search applications

    ..., models can understand concepts in documents, audio, images and more. Machine-learning pipelines to run extractive question-answering, zero-shot labeling, transcription, translation, summarization and text extraction. Cloud-native architecture that scales out with container orchestration systems (e.g. Kubernetes). Applications range from similarity search to complex NLP-driven data extractions to generate structured databases. The following applications are powered by txtai.
    Downloads: 0 This Week
  • 5

    OpenKM Document Management - DMS

    Document Management System and Content Management System

    .... Due to its technological architecture design, OpenKM meets the document management needs of businesses of all sizes (from SMEs to big corporations). Thanks to its elegant and intuitive interface, OpenKM transforms complex operations into easy tasks. The most relevant function of OpenKM is the indexing of the most common types of files: text, Office, Office 2007, OpenOffice, PDF, HTML, XML, MP3, JPEG, etc. For a complete feature list, take a look at http://goo.gl/au8cQy
    Downloads: 998 This Week
  • 6

    xsd2pgschema

    Relational database replication tool based on XML Schema

    xsd2pgschema is a Java application suite that converts XML Schema 1.1 (hierarchical data model) to PostgreSQL DDL (relational data model) and supports XML data migration into PostgreSQL based on the XML Schema without loss of information content. It also supports full-text indexing via either Apache Lucene or Sphinx Search utilizing the relational data model. File conversion from XML to CSV, TSV, or JSON is possible, as is mapping XML Schema to JSON Schema. Obtained PostgreSQL...
    Downloads: 8 This Week
  • 7

    File System Crawler for Elasticsearch

    Elasticsearch File System Crawler (FS Crawler)

    This crawler helps to index binary documents such as PDF, Open Office, and MS Office files. It crawls a local file system (or a mounted drive), indexing new files, updating existing ones, and removing old ones. It can also crawl remote file systems over SSH/FTP, and it provides a REST interface to let you “upload” your binary documents to Elasticsearch.
    Downloads: 0 This Week
  • 8

    Paperless-ng

    A supercharged version of paperless: scan, index and archive docs

    Paperless is a simple Django application running in two parts, a Consumer (the thing that does the indexing) and a Web server (the part that lets you search & download already-indexed documents). Paper is a nightmare. Environmental issues aside, there’s no excuse for it in the 21st century. It takes up space, collects dust, doesn’t support any form of a search feature, indexing is tedious, it’s heavy and prone to damage & loss. I wrote this to make “going paperless” easier. I do not have...
    Downloads: 0 This Week
  • 9

    Records Management System

    Save a digital copy of your personal or business records

    Your personal and business records are considered private documents. You should avoid using cloud providers, even Google Drive! Records Management System is a local data store that uses SQLite and integrates with any connected scanner through the TWAIN toolkit (a license may be needed?). RMS is based on the book Filing Systems and Records Management (College series) 3rd Edition, available on Amazon at https://www.amazon.com/dp/0070614717
    Downloads: 3 This Week
  • 10

    json-rust

    JSON implementation in Rust

    Parse and serialize JSON with ease. JSON is a very loose format where anything goes - arrays can hold mixed types, object keys can change types between API calls or not include some keys under some conditions. Mapping that to idiomatic Rust structs introduces friction.
    Downloads: 0 This Week
  • 11

    OpenSearchServer Search Engine

    An open source search engine with RESTful API and crawlers

    OpenSearchServer is a powerful, enterprise-class search engine program. Using the web user interface, the crawlers (web, file, database, etc.) and the client libraries (REST/API, Ruby, Rails, Node.js, PHP, Perl), you can quickly and easily integrate advanced full-text search capabilities into your application: full-text with basic semantic, join queries, boolean queries, facet and filter, document (PDF, Office, etc.) indexing, web scraping, etc. OpenSearchServer runs on...
    Downloads: 9 This Week
  • 12

    Indexmeister

    Automatic indexing for large LaTeX documents

    Indexmeister reads a variety of formats (.tex, .docx, .epub, and others) and suggests keywords for indexing. The included program Imbrowse provides a semi-automatic interface to rapidly add index tags to multi-file LaTeX documents.
    Downloads: 3 This Week
  • 13

    SummitDB

    In-memory NoSQL database with ACID transactions, Raft consensus, etc.

    SummitDB is an in-memory, NoSQL key/value database. It persists to disk, uses the Raft consensus algorithm, is ACID compliant, and is built on a transactional and strongly-consistent model. It supports custom indexes, geospatial data, JSON documents, and user-defined JS scripting. The easiest way to get SummitDB is to use one of the pre-built release binaries which are available for OSX, Linux, and Windows. SummitDB can be compiled and used on Linux, OSX, Windows, FreeBSD, ARM (Raspberry PI...
    Downloads: 0 This Week
  • 14

    IndexFile (IFile)

    IFile, a PHP-based framework for indexing and searching documents

    Index documents using the Lucene Search Engine or MySQL Full-Text. IFile supports many types of documents: Rich Text Format (.rtf); Moving Picture Expert Group-1/2 Audio Layer 3 (.mp3); Joint Photographic Experts Group (.jpg - .jpeg); Tagged Image File Format (.tiff); Microsoft Word 97-2000 (.doc); Microsoft Word 2003-2007 (.docx); Microsoft Excel 97-2000 (.xls); Microsoft Excel 2003-2007 (.xlsx); Microsoft PowerPoint 2003-2007 (.pptx); OpenOffice.org Writer (.odt...
    Downloads: 0 This Week
  • 15

    Arabic Desktop Search Engine

    Desktop search engine

    A desktop search engine that targets Arabic but also works with other languages. The application uses Lucene.NET to index and search HTML documents and was developed with Visual Studio 2013. http://www.mediafire.com/download/p3lcez1h93pcpd8/ArDesktopSearch_SourceCode.7z The application strips Arabic diacritics when indexing HTML files and can highlight matched text, with or without diacritics, using the EasyMark highlighter JavaScript plugin...
    Downloads: 0 This Week
  • 16
    The Arabic corpus was developed as part of a research project named "A New Approach of Semi-Indexing of Text Documents". The corpus consists of more than 460 Arabic books. It can be used for the development of language engineering applications, information retrieval and information extraction. The total corpus size is 137 MB. It contains 23,264,785 words and more than 128,584,458 letters.
    Downloads: 0 This Week
  • 17

    Constellio Enterprise Search engine

    Open source Search Engine and Enterprise Search

    Constellio is an enterprise search engine that allows companies to search all their organization's information through a single interface (Web, CRM, ERP, ECM, Mail, etc.). Constellio is based on Apache Solr and Google Search Appliance's connector. Constellio has a powerful web crawler.
    Downloads: 0 This Week
  • 18

    Restad

    Relational storage for tagged documents

    Restad is an indexing and querying tool for tagged documents. It uses a relational database for storage and querying. See the latest news on the blog: https://sourceforge.net/p/restad/blog/ The first Ruby prototype can be found here: https://github.com/ymoreau/Restad
    Downloads: 0 This Week
  • 19
    Personal Document Manager DMS
    Personal Document Manager is a tool for storing, indexing and finding documents. In short: a simple, small Document Management System. The user guide is on the project webpage. Currently I am looking for people willing to join the project. Needed are Java developers, testers, and help writers. Developers working on Windows are especially needed to synchronise some window management things which are different on UX-based systems and Windows. And of course.... working alone...
    Downloads: 0 This Week
  • 20
    Linoratix Intranet Search (lIntraSearch) is a desktop-search-like application that indexes the content of many office and other files in a network, so you can search, download and modify all the documents in your network from your computer.
    Downloads: 0 This Week
  • 21

    edocias

    Electronic Document Index And Search

    EDocIAS (Electronic Document Index And Search) is a PHP-based tool for indexing and searching files of various types. Third-party tools (tesseract, xpdf, etc.) can be configured to support any type of file.
    Downloads: 0 This Week
  • 22
    ANts P2P
    ANts P2P implements a third-generation P2P network. It protects your privacy while you are connected and makes you untraceable, hiding your identity (IP) and encrypting everything you send to or receive from others.
    Downloads: 11 This Week
  • 23
    Chamonix is a CHM viewer for Mac OS X 10.4. It is an Objective-C app that uses CHM lib (http://sourceforge.net/projects/chmlib), Cocoa and WebKit. It supports ToC, Indexing, Search and Favourites and multiple CHM documents.
    Downloads: 0 This Week
  • 24
    Web site code (utilizing PHP/MySQL) for a fully automated online document library/repository with cataloging, indexing, and stats. Documents are expected to be in XML format. XSL transforms for particular XML DTDs are also available.
    Downloads: 0 This Week
  • 25
    Lightspoke Dbx is a fast XML database web service layer over Berkeley DB XML. Dbx offers XPath queries, native XML document management and web-based administration of containers, documents and indexes. XSLT, security, full-text indexing, and XUpdate are planned.
    Downloads: 0 This Week