Showing 45 open source projects for "document search engine"

  • 1

    Nokogiri

    Tool to work with XML and HTML from Ruby

    Nokogiri (鋸) makes it easy and painless to work with XML and HTML from Ruby. It provides a sensible, easy-to-understand API for reading, writing, modifying, and querying documents. It is fast and standards-compliant because it relies on native parsers such as libxml2 (C) and xerces (Java). It is secure by default, treating all documents as untrusted. It aims to be a thin-as-reasonable layer on top of the underlying parsers and does not attempt to fix behavioral differences between them. "Native...
    Downloads: 3 This Week
  • 2

    Floki

    Floki is a simple HTML parser that enables search for nodes using CSS

    Floki is a simple HTML parser that enables search for nodes using CSS selectors. Floki needs the :leex module in order to compile. Normally this module is installed with Erlang in a complete installation. By default, Floki uses a patched version of mochiweb_html for parsing fragments due to its ease of installation (it's written in Erlang and has no outside dependencies). fast_html is generally faster, according to the benchmarks conducted by its developers.
    Downloads: 0 This Week
  • 3

    XRender

    Easy-to-use middle and back-end "form/table/chart" solution

    Alibaba/Fliggy's out-of-the-box "forms/tables/graphs" solution for middle and back-end applications. Use XRender in company or personal projects and help promote it to partners. FormRender 1.0 is the next generation React.js form solution. The project has been rewritten from the kernel level to meet the requirements of increasingly complex form scenarios. Our goal is to cover 100% of form scenarios with strong scalability, while getting developers up to speed quickly, and...
    Downloads: 0 This Week
  • 4

    Create Index from PDF

    PDF Indexing Script: Searches PDF for words, records page numbers

    This Python script helps automate the process of creating an index for a PDF document. It reads a list of words from a text file, searches through each page of the PDF, and records the page numbers where each word appears. The script accounts for the first 24 pages of the PDF that use Roman numerals (i-xxiv) and adjusts the page numbers accordingly. It is designed to be case-insensitive, ensuring that variations in capitalization do not affect the search results.
    Downloads: 0 This Week
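The indexing approach described for this script can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes the page text has already been extracted into a list of strings (the listing does not say which PDF library the script uses), it assumes whole-word matching, and the names `to_roman`, `page_label`, and `build_index` are hypothetical.

```python
import re

# Assumption taken from the description: the first 24 pages are numbered i-xxiv.
FRONT_MATTER_PAGES = 24

ROMAN_PAIRS = [
    (1000, "m"), (900, "cm"), (500, "d"), (400, "cd"),
    (100, "c"), (90, "xc"), (50, "l"), (40, "xl"),
    (10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i"),
]


def to_roman(n):
    """Convert a positive integer to a lowercase Roman numeral."""
    out = []
    for value, symbol in ROMAN_PAIRS:
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)


def page_label(physical_page, front_matter=FRONT_MATTER_PAGES):
    """Map a 1-based physical page number to the label printed on the page."""
    if physical_page <= front_matter:
        return to_roman(physical_page)
    return str(physical_page - front_matter)


def build_index(pages, words, front_matter=FRONT_MATTER_PAGES):
    """Return {word: [page labels]} using case-insensitive whole-word search.

    `pages` is a list of already-extracted page texts, in document order.
    """
    index = {}
    for word in words:
        pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
        index[word] = [
            page_label(number, front_matter)
            for number, text in enumerate(pages, start=1)
            if pattern.search(text)
        ]
    return index
```

In this sketch the word list would come from the text file mentioned in the description, one word per line, and the resulting page labels could be joined with commas to form each index entry.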
  • 5

    gSOAP Toolkit

    Development toolkit for Web Services and XML data bindings for C & C++

    The gSOAP toolkit is an extensive suite of portable C and C++ software to develop XML Web services with powerful type-safe XML data bindings. Easy-to-use code-generator tools allow you to directly integrate XML data in C and C++. Serializes native application data in XML. Includes WSDL/XSD schema binding and auto-coding tools, stub/skeleton compiler, Web server integration with Apache module and IIS extension, high-performance XML processing with schema validation, fast MIME/MTOM streaming,...
    Downloads: 657 This Week
  • 6

    xsd2pgschema

    Relational database replication tool based on XML Schema

    xsd2pgschema is a Java application suite, which converts XML Schema 1.1 (hierarchical data model) to PostgreSQL DDL (relational data model) and supports XML data migration into PostgreSQL based on the XML Schema without defects on information content. It also supports full-text indexing via either Apache Lucene or Sphinx Search utilizing the relational data model. File conversion from XML to CSV, TSV, or JSON is possible as well as mapping XML Schema to JSON Schema. Obtained PostgreSQL...
    Downloads: 3 This Week
  • 7

    Plot

    A DSL for writing type-safe HTML, XML and RSS in Swift

    ...Plot added all of the necessary attributes to load the requested CSS stylesheet, along with additional metadata for the page’s title, improving page rendering, social media sharing, and search engine optimization. Attributes can also be applied in the exact same way as child elements are added, by simply adding another entry to an element’s comma-separated list of content.
    Downloads: 0 This Week
  • 8

    TeXnicCenter

    A feature-rich environment for writing LaTeX documents on Windows

    TeXnicCenter is a LaTeX editor on Windows. Navigating LaTeX documents is simple due to the automatically created document outline. Errors of the LaTeX compilation can be reviewed instantly. TXC features autocompletion and comes with LaTeX templates.
    Downloads: 248 This Week
  • 9

    QXmlEdit

    Simple XML editor and XSD viewer

    QXmlEdit is a simple XML editor written in Qt. Its main features are unusual data visualization modes, nice XML manipulation and presentation, and multi-platform support. It can split very big XML files into fragments, compare XML and XSD files, and has a graphical XSD viewer. Project site: http://qxmledit.org Source code hosted at GitHub (moved from Google Code): https://github.com/lbellonda/qxmledit Report issues at: https://github.com/lbellonda/qxmledit/issues Discussion...
    Downloads: 172 This Week
  • 10

    XML Tree Editor

    Basic cross-platform tree view XML editor

    XMLTreeEdit displays XML files as tree views and allows basic operations: adding, editing and deleting text nodes and their attributes. The main goal is to provide a simple tool to create and edit XML configuration files for users without knowledge of XML. Built in Free Pascal Lazarus, which allows easy compilation for different target platforms. Binary executables have been produced and tested on Windows (XP, 7) and Ubuntu Linux (GTK2). For developers: there are two units listed...
    Downloads: 24 This Week
  • 11

    CSSBox

    Pure Java HTML / CSS rendering engine

    CSSBox is an (X)HTML/CSS rendering engine written in pure Java. Its primary purpose is to provide complete information about the rendered page, suitable for further processing. However, it also allows displaying the rendered document.
    Downloads: 24 This Week
  • 12

    wkhtmltopdf

    Convert HTML to PDF using Webkit (QtWebKit)

    ...The rest of the headers directly expose the C++ Qt-dependent class used internally by wkhtmltopdf and wkhtmltoimage. wkhtmltopdf is able to put several objects into the output file; an object is either a single webpage, a cover webpage, or a table of contents. The objects are put into the output document in the order they are specified on the command line. Options can be specified on a per-object basis or in the global options area.
    Downloads: 72 This Week
  • 13

    iText®, a JAVA PDF library

    PDF Library for Developers

    ...With iText, you can create archivable and accessible PDFs, split and merge documents, fill and flatten forms, digitally sign documents, and more. iText add-ons enable additional functionality, such as PDF creation from HTML templates, secure redaction, OCR, and much more. The latest versions of iText build on the success of previous versions and feature an improved document engine, high and low-level programming capabilities, and a more efficient modular structure. iText represents the next level for developers looking to leverage PDF in document workflows. The main project page for iText is now on GitHub, and all the latest releases, code samples, open source add-ons and tools, etc. can be found at https://github.com/itext/.
    Downloads: 203 This Week
  • 14

    cquery

    C/C++ language server supporting multi-million line code base

    C/C++ language server supporting multi-million line code base, powered by libclang. Emacs, Vim, VSCode, and others with language server protocol support. Cross-references, completion, diagnostics, semantic highlighting, and more. cquery is a highly-scalable, low-latency language server for C/C++/Objective-C. It is tested and designed for large codebases like Chromium. cquery provides accurate and fast semantic analysis without interrupting workflow. cquery implements almost the entire...
    Downloads: 0 This Week
  • 15

    CSVboard

    CSV editor to open CSV files with minimum effort

    CSVboard is a tool for loading CSV files with minimum effort. Because I placed great importance on ease of use, I implemented a search and filtering engine that lets you efficiently find specific rows within a table. DON'T FORGET TO READ THE QUICK TUTORIAL!! Features: lightweight and portable; set and reset the title with Ctrl+q and Ctrl+w; auto-set column widths; auto-detection of delimiters; load files by drag and drop; XML export; powerful search and filtering engine. CSVboard was written in Java and JavaFX and is in fact the result of some recycled code collections. ...
    Downloads: 3 This Week
  • 16

    DinkToPdf

    C# .NET Core wrapper for wkhtmltopdf library that uses Webkit engine

    .NET Core P/Invoke wrapper for wkhtmltopdf library that uses Webkit engine to convert HTML pages to PDF. Copy the native library to root folder of your project. From there .NET Core loads the native library when the native method is called with P/Invoke. You can find the latest version of the native library. Select the appropriate library for your OS and platform (64 or 32-bit). The library was not tested with IIS. The library was tested in console applications and with Kestrel web server...
    Downloads: 0 This Week
  • 17

    eXtensible Text Framework (XTF)

    Framework for search and display of heterogeneous document collections.

    NOTICE: This code repository is deprecated. Please visit https://github.com/cdlib/xtf for the latest updates. Obsolete Description: The eXtensible Text Framework (XTF) is an architecture that supports searching across collections of heterogeneous textual data (XML, PDF, HTML, text, and more), and the presentation of results and documents in a highly configurable manner. Includes highly customized versions of the proven open-source components Lucene and Saxon.
    Downloads: 0 This Week
  • 18

    Nibbleblog

    Powerful engine for creating blogs; all you need is PHP.

    Easy, fast and free CMS Blog. Nibbleblog is a powerful engine for creating blogs; all you need is PHP. Very simple to install and configure (only 1 step).
    Downloads: 8 This Week
  • 19
    Regain is a Java search engine based on Jakarta Lucene. It indexes and searches files in many formats (HTML, XML, doc(x), xls(x), ppt(x), oo, PDF, RTF, mp3, mp4, Java). A TagLibrary eases integrating search results into your JSP-based web page.
    Downloads: 12 This Week
  • 20
    Syndicateme.net ... Ajax Atom 1.0 Syndication Engine Tell your story ... Especially if you are a business along Queen St. in Toronto Canada or King Street Waterloo Canada. Syndication can be from a pop mailbox, and can use XInclude.
    Downloads: 1 This Week
  • 21

    Researchers Ontology

    Researchers Ontology Search Engine

    This project is a search engine that gathers data from an ontology. We took an ontology of researchers as an example. The search uses the properties and works with partial queries and substrings. The user does not have to write DLQuery or Manchester OWL Syntax; the engine builds the query in these syntaxes.
    Downloads: 0 This Week
  • 22

    hcxselect

    A CSS selector engine for C++

    hcxselect is a small and fast CSS selector engine for C++. It parses CSS selector expressions and applies them to a set of document nodes (or a whole tree) parsed via htmlcxx, a simple non-validating HTML parser. Thus, it allows you to use CSS selectors in your C++ program without much bloat.
    Downloads: 0 This Week
  • 23

    knowceans

    Utility classes from maps to search engine to random samplers

    .... --- Highlights: --- org.knowceans.util: IndexQuickSort, TableList: apply order of one array/list to others +++ Vectors, ArrayUtils: array convenience +++ RandomSamplers, CokusRandom, ArmSampler, Densities: random sampling and distributions +++ Arguments: command line parser +++ StopWatch, Which, ExternalProcess: runtime stuff +++ ParallelFor: OpenMP workalike +++ PatternString, NamedGroupRegex: regex convenience --- org.knowceans.corpus: CorpusSearcher: full-text search engine +++ LabelNumCorpus: svmlight corpus storage and filtering +++ NIPS corpus with text, authors, labels and citations --- org.knowceans.map: InvertibleHashMultiMap, BijectiveHashMap: implement n:m and 1:1 relations. --- Other libs: knowceans-arms = port of the Adaptive Rejection Metropolis Sampler (ARMS) for arbitrary distributions +++ lda-j = port of lda-c, implementing Latent Dirichlet Allocation (LDA)
    Downloads: 0 This Week
  • 24
    XmlView
    GUI utility in pure Java for viewing and editing XML content; example of application built with Superficial http://superficial.sourceforge.net
    Downloads: 0 This Week
  • 25
    PHP-Index uses a plain text file as an index for efficient searches on data. The index is a simple ordered list, so a binary search can be performed. The current implementation supports an XML document as the database.
    Downloads: 0 This Week
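The idea behind PHP-Index, a binary search over an ordered plain-text index, can be illustrated with a small sketch. It is written in Python rather than PHP and is not the project's code: the one-entry-per-line `key<TAB>value` layout and the names `parse_index` and `lookup` are assumptions made for illustration, and the index is loaded into memory for simplicity.

```python
def parse_index(text):
    """Parse sorted index text into (key, value) pairs.

    Assumed (hypothetical) layout: one entry per line, key and value
    separated by a tab, lines already sorted by key.
    """
    entries = []
    for line in text.splitlines():
        if line:
            key, _, value = line.partition("\t")
            entries.append((key, value))
    return entries


def lookup(entries, key):
    """Binary-search the sorted entries and return the value for key, or None."""
    lo, hi = 0, len(entries) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        mid_key, value = entries[mid]
        if mid_key == key:
            return value
        if mid_key < key:
            lo = mid + 1   # key can only be in the upper half
        else:
            hi = mid - 1   # key can only be in the lower half
    return None
```

Because the list is ordered, each lookup inspects on the order of log2(n) entries instead of scanning all n lines, which is the efficiency the description refers to.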