Showing 35 open source projects for "pdf data mining"

  • 1
    Dompdf

    HTML to PDF converter for PHP

    dompdf is an HTML to PDF converter. At its heart, dompdf is (mostly) a CSS 2.1 compliant HTML layout and rendering engine written in PHP. It is a style-driven renderer: it will download and read external stylesheets, inline style tags, and the style attributes of individual HTML elements. It also supports most presentational HTML attributes. PDF rendering is currently provided either by PDFLib or by a bundled version of the R&OS CPDF class written by Wayne Munro. (Some important changes have...
    Downloads: 101 This Week
  • 2
    DeckTape

    PDF exporter for HTML presentations

    DeckTape is a high-quality PDF exporter for HTML presentation frameworks. DeckTape is built on top of Puppeteer which relies on Google Chrome for laying out and rendering Web pages and provides a headless Chrome instance scriptable with a JavaScript API. DeckTape currently supports the following presentation frameworks out of the box. DeckTape also provides a generic command that works by emulating the end-user interaction, allowing it to be used to convert presentations from virtually any...
    Downloads: 5 This Week
  • 3
    Crowbook

    Converts books written in Markdown to HTML, LaTeX/PDF and EPUB

    Crowbook's aim is to allow you to write a book in Markdown without worrying about formatting or typography and let the program generate HTML, PDF and EPUB output for you. Its focus is novels and fiction, and the default settings should (hopefully) generate readable books with correct typography without requiring you to worry about it. To see what Crowbook's output looks like, you can read the Crowbook guide rendered in HTML, PDF or EPUB. Crowbook will parse this file and generate HTML, EPUB,...
    Downloads: 0 This Week
  • 4
    pagedown

    Paginate the HTML Output of R Markdown with CSS for Print

    Paginate the HTML Output of R Markdown with CSS for Print. You only need a modern web browser (e.g., Google Chrome or Microsoft Edge) to generate PDF. No need to install LaTeX to get beautiful PDFs. This R package stands on the shoulders of two giants to support typesetting with CSS for R Markdown documents: Paged.js and ReLaXed (we only borrowed some CSS from the ReLaXed repo and didn't really use the Node package).
    Downloads: 0 This Week
  • 5
    Shower Presentation Template

    Shower HTML presentation engine

    Shower Presentation Template is an HTML presentation engine built on HTML, CSS and vanilla JavaScript that works in all modern browsers. Themes are separate from the engine, and presentations are fully keyboard accessible and printable to PDF. It includes the Ribbon and Material themes and a core with plugins. You’ll need Node.js installed on your computer. Latest stable versions of Chrome, Edge, Firefox, and Safari are supported.
    Downloads: 2 This Week
  • 6
    Ray Tracing in One Weekend Book Series

    The Ray Tracing in One Weekend series of books

    The Ray Tracing in One Weekend series of books are now available to the public for free online. They are now released under the CC0 license. This means that they are as close to public domain as we can get. (While that also frees you from the requirement of providing attribution, it would help the overall project if you could point back to this web site as a service to other users.) These books are formatted for printing directly from your browser, where you can also (on most browsers) save...
    Downloads: 5 This Week
  • 7
    Markdown Monster

    An extensible Markdown Editor, Viewer and Weblog Publisher for Windows

    Markdown Monster is a powerful, yet easy-to-use Markdown editor with syntax highlighting and sophisticated and fast edit features. A collapsible, synced, live preview lets you see your output as you type and scroll. Easily embed or paste images, links, tables and code using raw markup or our smart UI helpers to simplify many operations with a few keystrokes or a click or two. Paste images from the clipboard or drag and drop from Explorer or our built-in file browser. Inline spell-checking...
    Downloads: 3 This Week
  • 8
    WebHarvest - web data extraction tool
    Web data extraction (web data mining, web scraping) tool. It leverages well-proven XML and text processing technologies to easily extract useful data from arbitrary web pages.
    Downloads: 4 This Week
  • 9
    docconv

    Converts PDF, DOC, DOCX, XML, HTML, RTF, etc to plain text

    A Go wrapper library to convert PDF, DOC, DOCX, XML, HTML, RTF, ODT, Pages documents and images (see optional dependencies below) to plain text. See go help install for details on the installation location of the installed docd executable. Make sure that the full path to the executable is in your PATH environment variable. To add image support to the docconv library you first need to install and build gosseract. Now you can add -tags ocr to any go command when building/fetching/testing...
    Downloads: 2 This Week
  • 10
    html-pdf-chrome

    HTML to PDF or image (jpeg, png, webp) converter via Chrome/Chromium

    HTML to PDF or image (jpeg, png, webp) converter via Chrome/Chromium. This library is NOT meant to accept untrusted user input. Doing so may have serious security risks such as Server-Side Request Forgery (SSRF). If you run into CORS issues, try using the --disable-web-security Chrome flag, either when you start Chrome externally, or in options.chromeFlags. This option should only be used if you fully trust the code you are executing during a print job. It is strongly recommended that you...
    Downloads: 1 This Week
  • 11
    Universal Résumé Template

    Minimal and formal résumé (CV) website template for print and mobile

    Minimal and formal résumé (CV) website template for print, mobile, and desktop. The proportions are the same on the screen and paper. Built with amazing Tailwind CSS. I couldn’t find any formal or professional résumé (CV) website with good typography that is optimized for the Web, print, PDF, and mobile. Also, researching what recruiters want, my priorities were fast scanning time and all content to fit on one page. Replace every -letter with -a4, and uncomment specified code blocks....
    Downloads: 0 This Week
  • 12
    backslide

    CLI tool for making HTML presentations with Remark.js using Markdown

    CLI tool for making HTML presentations with Remark.js using Markdown. Use bs init to create a new presentation along with a template directory in the current directory. The template directory is needed for backslide to transform your Markdown files into HTML presentations. You can create as many markdown presentations as you want in the directory, they will all be based on the same template. Use bs serve to start a development server with live reload. A page will automatically open in your...
    Downloads: 0 This Week
  • 13
    node-html-pdf

    HTML to PDF converter that uses phantomjs

    HTML to PDF converter that uses phantomjs. html-pdf can read the header or footer either out of the footer and header config object or out of the HTML source. You can either set a default header & footer or overwrite that by appending a page number (1 based index) to the id="pageHeader" attribute of an HTML tag. You can use any combination of those tags. The library tries to find any element that contains the pageHeader or pageFooter id prefix. The full options object gets converted to...
    Downloads: 0 This Week
  • 14
    Myrtille

    A native HTML4 / HTML5 Remote Desktop Protocol and SSH client

    Myrtille provides simple and fast access to remote desktops, applications, and SSH servers through a web browser, without any plugin, extension or configuration. Technically, Myrtille is an HTTP(S) to RDP and SSH gateway. User input (keyboard, mouse, touchscreen) is forwarded from a web browser to an HTTP(S) gateway, then up to an RDP (or SSH) client which maintains a session with an RDP (or SSH) server. The display resulting (or not) from such actions is streamed back to the browser, from the...
    Downloads: 0 This Week
  • 15
    pdf2htmlEX

    Convert PDF to HTML without losing text or format

    pdf2htmlEX renders PDF files in HTML, utilizing modern Web technologies. It aims to provide an accurate rendering, while being optimized for Web display. Text, fonts and formats are natively preserved in HTML. Mathematical formulas, figures and images are also supported. pdf2htmlEX is also a publishing tool: almost 50 options make it flexible for many different use cases: PDF preview, book/magazine publishing, personal resume. pdf2htmlEX is optimized for modern web browsers such as Mozilla...
    Downloads: 37 This Week
  • 16
    wkhtmltopdf

    Convert HTML to PDF using Webkit (QtWebKit)

    wkhtmltopdf and wkhtmltoimage are open source (LGPLv3) command line tools to render HTML into PDF and various image formats using the Qt WebKit rendering engine. These run entirely "headless" and do not require a display or display service. There is also a C library, if you're into that kind of thing. The file pdf.h contains a fairly high level and stable pure C binding to wkhtmltopdf. These bindings are well documented and do not depend on Qt. Using this is the recommended way of interfacing...
    Downloads: 51 This Week
  • 17
    posterdown

    Use RMarkdown to generate PDF Conference Posters via HTML

    Welcome to Posterdown! This is my attempt to provide a semi-smooth workflow for those who wish to take their RMarkdown skills to the conference world. Many creature comforts from RMarkdown are available in this package such as Markdown section notation, figure captioning, and even citations like this one (Allaire, Xie, McPherson, et al. 2018). The rest of this example poster will show how you can insert typical conference poster features into your own document. Posterdown was created as a...
    Downloads: 0 This Week
  • 18
    IdeoType is a book compiler that converts manuscript (XHTML) to book (PDF) on the fly.
    Downloads: 0 This Week
  • 19
    DinkToPdf

    C# .NET Core wrapper for wkhtmltopdf library that uses Webkit engine

    .NET Core P/Invoke wrapper for wkhtmltopdf library that uses Webkit engine to convert HTML pages to PDF. Copy the native library to root folder of your project. From there .NET Core loads the native library when the native method is called with P/Invoke. You can find the latest version of the native library. Select the appropriate library for your OS and platform (64 or 32-bit). The library was not tested with IIS. The library was tested in console applications and with Kestrel web server...
    Downloads: 0 This Week
  • 20
    html2pdf
    HTML2PDF is a PHP class using FPDF for the PHP4 release, and TCPDF for the PHP5 release. It can convert valid HTML and XHTML to PDF. More details and examples are at http://html2pdf.fr/. HTML2PDF is now on GitHub: https://github.com/spipu/html2pdf/
    Downloads: 0 This Week
  • 21
    Shelk-test
    Open source program for creating tests, combining test authoring and test taking. It can be used by anyone who wants to quickly create a test and run it.
    Downloads: 0 This Week
  • 22

    RochaReport

    Configurable report generation in C++, with PDF and HTML output

    This project creates reports from a configurable template: given a data set, it fills in a predefined HTML file, which is then converted to PDF.
    Downloads: 0 This Week
  • 23
    Downloads: 0 This Week
  • 24
    xccdf2pdf renders XCCDF documents in PDF and other formats.
    Downloads: 0 This Week
  • 25
    APHID is an easy-to-install, easy-to-use DocBook environment. APHID transforms source documents (text or XML) into multiple output formats (HTML, PDF, HTML Help, etc.). APHID is a derivative work of eDE (http://www.e-novative.de).
    Downloads: 0 This Week