Showing 28 open source projects for "pdf data mining"

  • 1
    Crowbook

    Converts books written in Markdown to HTML, LaTeX/PDF and EPUB

    Crowbook's aim is to allow you to write a book in Markdown without worrying about formatting or typography and let the program generate HTML, PDF and EPUB output for you. Its focus is novels and fiction, and the default settings should (hopefully) generate readable books with correct typography without requiring you to worry about it. To see what Crowbook's output looks like, you can read the Crowbook guide rendered in HTML, PDF or EPUB. Crowbook will parse this file and generate HTML, EPUB,...
    Downloads: 0 This Week
  • 2
    pagedown

    Paginate the HTML Output of R Markdown with CSS for Print

    Paginate the HTML Output of R Markdown with CSS for Print. You only need a modern web browser (e.g., Google Chrome or Microsoft Edge) to generate PDF. No need to install LaTeX to get beautiful PDFs. This R package stands on the shoulders of two giants to support typesetting with CSS for R Markdown documents: Paged.js and ReLaXed (we only borrowed some CSS from the ReLaXed repo and didn't really use the Node package).
    Downloads: 1 This Week
  • 3
    Shower Presentation Template

    Shower HTML presentation engine

    Shower is an HTML presentation engine built on HTML, CSS, and vanilla JavaScript that works in all modern browsers. Themes are separate from the engine, and presentations are fully keyboard accessible. Presentations are printable to PDF. The package includes the Ribbon and Material themes and a core with plugins. You’ll need Node.js installed on your computer. The latest stable versions of Chrome, Edge, Firefox, and Safari are supported.
    Downloads: 2 This Week
  • 4
    Ray Tracing in One Weekend Book Series

    The Ray Tracing in One Weekend series of books

    The Ray Tracing in One Weekend series of books is now available to the public for free online. The books are released under the CC0 license, which means that they are as close to public domain as we can get. (While that also frees you from the requirement of providing attribution, it would help the overall project if you could point back to this web site as a service to other users.) These books are formatted for printing directly from your browser, where you can also (on most browsers) save...
    Downloads: 7 This Week
  • 5
    WebHarvest - web data extraction tool
    Web data extraction (web data mining, web scraping) tool. It leverages well-proven XML and text processing technologies to easily extract useful data from arbitrary web pages.
    Downloads: 3 This Week
  • 6
    html-pdf-chrome

    HTML to PDF or image (jpeg, png, webp) converter via Chrome/Chromium

    HTML to PDF or image (jpeg, png, webp) converter via Chrome/Chromium. This library is NOT meant to accept untrusted user input. Doing so may have serious security risks such as Server-Side Request Forgery (SSRF). If you run into CORS issues, try using the --disable-web-security Chrome flag, either when you start Chrome externally, or in options.chromeFlags. This option should only be used if you fully trust the code you are executing during a print job. It is strongly recommended that you...
    Downloads: 2 This Week
  • 7
    Universal Résumé Template

    Minimal and formal résumé (CV) website template for print and mobile

    Minimal and formal résumé (CV) website template for print, mobile, and desktop. The proportions are the same on the screen and paper. Built with amazing Tailwind CSS. I couldn’t find any formal or professional résumé (CV) website with good typography that is optimized for the Web, print, PDF, and mobile. Also, researching what recruiters want, my priorities were fast scanning time and all content to fit on one page. Replace every -letter with -a4, and uncomment specified code blocks....
    Downloads: 0 This Week
  • 8
    backslide

    CLI tool for making HTML presentations with Remark.js using Markdown

    CLI tool for making HTML presentations with Remark.js using Markdown. Use bs init to create a new presentation along with a template directory in the current directory. The template directory is needed for backslide to transform your Markdown files into HTML presentations. You can create as many Markdown presentations as you want in the directory; they will all be based on the same template. Use bs serve to start a development server with live reload. A page will automatically open in your...
    Downloads: 0 This Week
  • 9
    Myrtille

    A native HTML4 / HTML5 Remote Desktop Protocol and SSH client

    Myrtille provides simple and fast access to remote desktops, applications, and SSH servers through a web browser, without any plugin, extension or configuration. Technically, Myrtille is an HTTP(S) to RDP and SSH gateway. User input (keyboard, mouse, touchscreen) is forwarded from a web browser to an HTTP(S) gateway, then up to an RDP (or SSH) client which maintains a session with an RDP (or SSH) server. The display resulting (or not) of such actions is streamed back to the browser, from the...
    Downloads: 0 This Week
  • 10
    pdf2htmlEX

    Convert PDF to HTML without losing text or format

    pdf2htmlEX renders PDF files in HTML, utilizing modern Web technologies. It aims to provide an accurate rendering, while being optimized for Web display. Text, fonts and formats are natively preserved in HTML. Mathematical formulas, figures and images are also supported. pdf2htmlEX is also a publishing tool: almost 50 options make it flexible for many different use cases: PDF preview, book/magazine publishing, personal resume. pdf2htmlEX is optimized for modern web browsers such as Mozilla...
    Downloads: 29 This Week
  • 11
    wkhtmltopdf

    Convert HTML to PDF using Webkit (QtWebKit)

    wkhtmltopdf and wkhtmltoimage are open source (LGPLv3) command line tools to render HTML into PDF and various image formats using the Qt WebKit rendering engine. These run entirely "headless" and do not require a display or display service. There is also a C library, if you're into that kind of thing. The file pdf.h contains a fairly high level and stable pure C binding to wkhtmltopdf. These bindings are well documented and do not depend on Qt. Using this is the recommended way of interfacing...
    Downloads: 53 This Week
  • 12
    posterdown

    Use RMarkdown to generate PDF Conference Posters via HTML

    Welcome to Posterdown! This is my attempt to provide a semi-smooth workflow for those who wish to take their RMarkdown skills to the conference world. Many creature comforts from RMarkdown are available in this package such as Markdown section notation, figure captioning, and even citations like this one (Allaire, Xie, McPherson, et al. 2018). The rest of this example poster will show how you can insert typical conference poster features into your own document. Posterdown was created as a...
    Downloads: 0 This Week
  • 13
    IdeoType is a book compiler that converts manuscript (XHTML) to book (PDF) on the fly.
    Downloads: 0 This Week
  • 14
    DinkToPdf

    C# .NET Core wrapper for wkhtmltopdf library that uses Webkit engine

    .NET Core P/Invoke wrapper for the wkhtmltopdf library, which uses the WebKit engine to convert HTML pages to PDF. Copy the native library to the root folder of your project. From there .NET Core loads the native library when the native method is called with P/Invoke. You can find the latest version of the native library. Select the appropriate library for your OS and platform (64 or 32-bit). The library was not tested with IIS. The library was tested in console applications and with Kestrel web server...
    Downloads: 0 This Week
  • 15
    Shelk-test
    Open source program for creating tests that combines test authoring and test taking. It can be used by anyone who wants to quickly create a test and run it.
    Downloads: 0 This Week
  • 16
    Downloads: 0 This Week
  • 17
    xccdf2pdf renders XCCDF documents in PDF and other formats.
    Downloads: 0 This Week
  • 18
    APHID is an easy-to-install, easy-to-use DocBook environment. APHID transforms source documents (text or XML) into multiple output formats (HTML, PDF, HTML Help, etc.). APHID is a derivative work of eDE (http://www.e-novative.de).
    Downloads: 0 This Week
  • 19
    Converter from FB2 to PDF format. Useful for ebook readers with bad or missing FB2 support.
    Downloads: 3 This Week
  • 20
    openRiverbed - the PHP5 framework. Ajax, TinyMCE, plugins, XML-based configuration, template-based, XML2PDF PDF generation, multi-language support for application and content, encrypted sessions, test-driven, object-oriented development... Hardened by real projects.
    Downloads: 0 This Week
  • 21
    BTL is a template language that combines the power of JSTL and XSLT to produce documents in XML, HTML, XHTML, XSL-FO, PDF or other formats, based on JavaBean input.
    Downloads: 0 This Week
  • 22
    Visual xsltproc is a tool that helps you write XSLT files and debug them to find errors. It edits XML and generates XML, with XML syntax highlighting and line numbers. If the result is XSL-FO, it generates the PDF using Apache FOP (Java). Built on Qt 4.2.
    Downloads: 0 This Week
  • 23
    dompdf - the PHP 5 HTML to PDF converter. dompdf is a (mostly) CSS compliant HTML rendering engine written in PHP. It supports external stylesheets, inline style tags, and the style attributes of individual HTML elements. Requires PHP 5.
    Downloads: 3 This Week
  • 24
    The Nheengatu Project is a Java library that provides an HTML markup abstraction, allowing you to reuse the same markup to generate PDF files, OpenOffice documents, image files, etc. The goal of this project is to maximize the use of HTML markup procedures.
    Downloads: 0 This Week
  • 25
    Connla is a Java library for creating data collections which can be exported to TXT, CSV, HTML, XHTML, XML, PDF and XLS formats.
    Downloads: 0 This Week