Showing 35 open source projects for "pdf data mining"

  • 1
    WebHarvest - web data extraction tool
    Web data extraction (web data mining, web scraping) tool. It leverages well-proven XML and text processing technologies to easily extract useful data from arbitrary web pages.
    Downloads: 3 This Week
  • 2

    queXML

    XML Schema for questionnaires and PDF questionnaire generator

    queXML is a simple XML schema for designing questionnaires. Included are stylesheets to administer the questionnaire in PDF (paper), CASES and LimeSurvey. queXML is compatible with the DDI standard.
    Downloads: 0 This Week
  • 3
    Canorus

    Music score editor

    Canorus is a free cross-platform music score editor. It supports an unlimited number and length of staffs, polyphony, MIDI playback of notes, chord markings, lyrics, and import/export filters for formats such as MIDI, MusicXML, ABC Music, MusiXTeX and LilyPond.
    Downloads: 27 This Week
  • 4
    OpenEXI

    EXI implementations in Java and C#

    Open source .NET (C#) / Java implementation of the W3C Efficient XML Interchange (EXI) format specification. As a corollary to XML, EXI is an alternative, very efficient format that has all of the mechanics of XML, but is much more compact and is faster to exchange. - README (about Nagasena EXI implementation) https://www.dropbox.com/s/adh83u9z1x1czv6/README.txt?dl=0 - Nagasena EXI grammar interchange format (PDF) https://www.dropbox.com/s/etrpuchaddplq2s/EXIGram.pdf?dl=0 -...
    Downloads: 11 This Week
  • 5
    IdeoType is a book compiler that converts manuscript (XHTML) to book (PDF) on the fly.
    Downloads: 0 This Week
  • 6
    SBML2LaTeX

    A documentation and report generator for systems biological models

    SBML2LATEX is a tool to convert files in the Systems Biology Markup Language (SBML) format into LaTeX files. A convenient online version is available, which allows the user to directly generate a report from SBML in the form of PDF or TeX, which can be further processed to various file types including DVI, PS, EPS, GIF, JPG, or PNG. SBML2LATEX can also be downloaded and used locally in batch mode or interactively with its Graphical User Interface or several command line options. The purpose of...
    Downloads: 0 This Week
  • 7

    eXtensible Text Framework (XTF)

    Framework for search and display of heterogeneous document collections.

    ...Please visit https://github.com/cdlib/xtf for the latest updates. Obsolete Description: The eXtensible Text Framework (XTF) is an architecture that supports searching across collections of heterogeneous textual data (XML, PDF, HTML, text, and more), and the presentation of results and documents in a highly configurable manner. Includes highly customized versions of the proven open-source components Lucene and Saxon.
    Downloads: 1 This Week
  • 8
    XML2CSV-Generic-Converter

    Flatten XML into CSV to suit your mood

    ...It handles attributes, repeated elements, and so on, and produces results comparable to what spreadsheets generate when they import native XML (at least in its most extensive execution mode). Please refer to the documentation for further details (PDF doc, Open Office Writer doc, and API doc). This free software is released under the GNU General Public License Version 3.
    Downloads: 3 This Week
  • 9

    XmlDoclet

    A JavaDoc doclet that outputs source code structure in XML format.

    XmlDoclet is a JavaDoc doclet that outputs the source code structure of packages, classes, etc. in XML format. The XML data can then easily be processed by standard tools such as XSLT to produce HTML, PDF, dot graphs, etc. Technically, this is done by wrapping the classes and interfaces of the com.sun.javadoc packages into JAXB-annotated classes, which allows for easy serialization.
    Downloads: 0 This Week
  • 10

    mcf2pdf

    mcf2pdf converts files of the "My CEWE Photobook" software to PDF

    mcf2pdf converts .mcf files of the "My CEWE Photobook" software (see http://www.cewe-photobook.co.uk/ or http://www.cewe-fotobuch.de (German)) to PDF files, so you can better preview the results and even send them to others by e-mail. This project has moved to GitHub. Please visit https://github.com/albrechtf/mcf2pdf/releases to download the latest version.
    Downloads: 9 This Week
  • 11
    S1000D Transformation Toolkit
    The S1000D Transformation Toolkit provides a reference implementation supporting the transformation, packaging and viewing of S1000D data into a SCORM 2004 3rd Edition Content Package, Mobile Web Application and PDF.
    Downloads: 0 This Week
  • 12
    ComicMaster is a cbr/cbz reader for comic archives. It can open cbr and cbz archives and display their image contents, and it can also modify existing archives. Export to PDF format has been available since version 0062.
    Downloads: 0 This Week
  • 13

    Wordpress PDF Blog Export

    Java application that generates a PDF from a Wordpress XML file

    Small application developed in Java that converts an XML file generated from Wordpress into a PDF file. I wanted to make a book out of my blog, but the tools I found did not include the comments in the generated document. So, while learning to use the Java libraries iText and jSoup, I developed this utility as an executable jar file. To use it, you simply need to have Java installed on your PC. The generated PDF file can be used...
    Downloads: 0 This Week
  • 14
    eLML - eLesson Markup Language
    eLML (eLesson Markup Language) is an XML framework for creating structured eLessons based on a pedagogical model. eLML consists of an XMLSchema and XSLT files to create XHTML, PDF, LaTeX, IMS CP and SCORM versions, standards supported by most LMS.
    Downloads: 0 This Week
  • 15
    Calenco XML CMS
    Calenco is a collaborative Web platform that enables remote teams of writers, proofreaders, graphic designers, translators, etc. to jointly produce XML documents such as user guides and security procedures.
    Downloads: 0 This Week
  • 16
    Foxon is a FO emitter/indenter to be used with Saxon. It can indent and prettify XSL-FO output, making it suitable for human inspection and editing, without introducing artefacts that change the layout of the PDF file.
    Downloads: 0 This Week
  • 17
    xccdf2pdf renders XCCDF documents in PDF and other formats.
    Downloads: 0 This Week
  • 18
    Produced by HealthQuilt Lab, this project contains tools that transform data collected through a PDF form into CCR XML.
    Downloads: 0 This Week
  • 19

    RepEdit

    Project moved to https://sourceforge.net/projects/qsqlmon/

    Report library + visual editor for Qt based applications. Project moved to https://sourceforge.net/projects/qsqlmon/
    Downloads: 0 This Week
  • 20
    XTRACT4J V2 is a stand-alone, pure-Java program that creates XML files from dependent or independent SQL queries. It is designed as a drop-in replacement for Oracle Report to generate XML files. It also incorporates BI Publisher to create PDF reports.
    Downloads: 0 This Week
  • 21
    Shared Questionnaire System
    Shared Questionnaire System (SQS) is a fully functional Optical Mark Reader (OMR) form processing system implemented in Java Swing, XSL-FO and AJAX with straightforward GUIs. It aims to develop a social platform for sharing knowledge about questionnaires.
    Downloads: 0 This Week
  • 22
    The ProM Import Framework allows process enactment event logs to be extracted from a set of information systems. These can be exported in the MXML format, which is the standard event log data format for Process Mining analysis techniques.
    Downloads: 2 This Week
  • 23
    Converts plain-text RFC documents into open formats such as HTML and PDF. Features: index page links, document reference links, figure/table reference links, and customizable CSS.
    Downloads: 0 This Week
  • 24
    ERYX is a flexible, fast and easy-to-use XML content management system with a whole bunch of integration opportunities for XHTML, Flash, InDesign, PDF, and any web-based software that makes extensive use of XML files. Needs PHP 5.2+. http://eryx.imgb.org
    Downloads: 0 This Week
  • 25
    PHP-based Web Map Service (WMS) and WFS implementation according to OGC's specifications. Data is stored in WKT format according to the Simple Features Specification and delivered in SVG, PDF, Flash, PNG, and GIF formats, among others.
    Downloads: 2 This Week