14 projects for "pdf data mining" with 2 filters applied:

  • 1
    WebHarvest - web data extraction tool
    Web data extraction (web data mining, web scraping) tool. It leverages well-proven XML and text processing technologies to easily extract useful data from arbitrary web pages.
    Downloads: 3 This Week
  • 2
    IdeoType is a book compiler that converts manuscript (XHTML) to book (PDF) on the fly.
    Downloads: 0 This Week
  • 3
    Shelk-test
    Open source program for creating and running tests. It can be used by anyone who wants to quickly create a test and administer it.
    Downloads: 0 This Week
  • 4
    xccdf2pdf renders XCCDF documents in PDF and other formats.
    Downloads: 0 This Week
  • 5
    APHID is an easy-to-install, easy-to-use DocBook environment. APHID transforms source documents (text or XML) into multiple output formats (HTML, PDF, HTML Help, etc.). APHID is a derivative work of eDE (http://www.e-novative.de).
    Downloads: 0 This Week
  • 6
    Converter from FB2 to PDF format. Useful for ebook readers with bad or missing FB2 support.
    Downloads: 3 This Week
  • 7
    openRiverbed - the PHP5 framework. Ajax, TinyMCE, Plugins, XML based configuration, template based, XML2PDF pdf generation, multi-language support for application and content, encrypted sessions, test-driven, oo developed... Hardened by real projects.
    Downloads: 0 This Week
  • 8
    BTL is a template language that combines the power of JSTL and XSLT to produce documents in XML, HTML, XHTML, XSL-FO, PDF, or other formats from JavaBean input.
    Downloads: 0 This Week
  • 9
    dompdf - the PHP 5 HTML to PDF converter. dompdf is a (mostly) CSS compliant HTML rendering engine written in PHP. It supports external stylesheets, inline style tags, and the style attributes of individual HTML elements. Requires PHP 5.
    Downloads: 3 This Week
  • 10
    The Nheengatu Project is a Java library that provides an HTML markup abstraction, allowing you to reuse HTML markup to generate PDF files, OpenOffice documents, image files, etc. The goal of this project is to maximize the reuse of HTML markup procedures.
    Downloads: 0 This Week
  • 11
    Connla is a Java library for creating data collections which can be exported to TXT, CSV, HTML, XHTML, XML, PDF and XLS formats.
    Downloads: 0 This Week
  • 12
    Java-based tool to convert HTML/DHTML to PDF documents.
    Downloads: 0 This Week
  • 13
    This is a tool to convert PDF files to HTML/text files and extract images.
    Downloads: 0 This Week
  • 14
    PDML2 is a fork (continuation) of the PDML project - it is an informal markup language written in PHP that is similar to HTML. It allows for the creation of complex PDF documents for use in command line or web applications.
    Downloads: 0 This Week