Showing 22 open source projects for "pdf data mining"

  • 1

    Dompdf

    HTML to PDF converter for PHP

    dompdf is an HTML to PDF converter. At its heart, dompdf is (mostly) a CSS 2.1 compliant HTML layout and rendering engine written in PHP. It is a style-driven renderer: it will download and read external stylesheets, inline style tags, and the style attributes of individual HTML elements. It also supports most presentational HTML attributes. PDF rendering is currently provided either by PDFLib or by a bundled version of the R&OS CPDF class written by Wayne Munro. (Some important changes have...
    Downloads: 115 This Week
  • 2

    iLovePDF Api

    iLovePDF REST API - PHP Library

    Develop and automate PDF processing tasks such as compressing, merging, and splitting PDFs, converting Office documents to PDF, PDF to JPG, and images to PDF, adding page numbers, rotating and unlocking PDFs, stamping a watermark, and repairing PDFs. Each task has several settings to get your desired results. Strong infrastructure offers dedicated processing power. You might know us from ilovepdf.com, where we process millions of PDFs daily. We offer a simple and concise API Reference and Guide as well as...
    Downloads: 14 This Week
  • 3

    QuestPDF

    A library that can help you with generating PDF documents

    Quickly design and generate PDF documents with an open-source, modern, and battle-tested C# library. Forget about limitations, feel confident, enjoy your task and efficiently deliver professional products. QuestPDF is a progressive library that can help you with generating PDF documents in your .NET application by offering a friendly, discoverable and predictable C# fluent API. Do you believe that creating a complete invoice document can take less than 200 lines of code? We have prepared for...
    Downloads: 3 This Week
  • 4

    borb

    borb is a library for reading, creating and manipulating PDF files

    borb is a pure Python library to read, write, and manipulate PDF documents. It represents a PDF document as a JSON-like data structure of nested lists, dictionaries, and primitives (numbers, strings, booleans, etc.). This is currently a one-man project, so the focus will always be on supporting the more common use cases over the rare ones.
    Downloads: 8 This Week
  • 5

    MinerU

    A high-quality tool for converting PDF to Markdown and JSON

    MinerU is an open-source, high-quality document extraction toolkit focused on converting PDFs (and other document formats) into structured Markdown and JSON. It leverages OCR and layout analysis to preserve semantic structure and metadata, ideal for research and data science workflows.
    Downloads: 10 This Week
  • 6

    Gerber2PDF

    Gerber to PDF converter

    Gerber2PDF is a command-line tool to convert Gerber files to PDF for proofing and hobbyist printing purposes. It converts multiple Gerber files at once, placing each resulting layer on its own page within the PDF. Each layer has a PDF bookmark for easy reference. Layers can optionally be combined onto a single page and rendered with custom colours and transparency. There is a Drill to Gerber converter available from the downloads page.
    Downloads: 19 This Week
  • 7

    toPDF

    Online service for PDF conversion (to PDF)

    A simple online service for PDF conversion. This project is a simple library and also a web application. It offers a REST service and a simple upload service for synchronous conversion. This library/application doesn't contain conversion libraries because it's a wrapper for existing tools. toPDF currently supports the open source tool PDF Creator (http://www.pdfforge.org) and the commercial solution, easy PDF, from BCL (http://www.pdfonline.com/easypdf/sdk/).
    Downloads: 1 This Week
  • 8

    workerPdf

    WorkerPDF is a GUI for Ghostscript, created for PDF conversion

    WorkerPDF uses Ghostscript (https://www.ghostscript.com/) and was created for PDF conversion. Program features: compress PDF documents; combine PDFs; move PDF pages; rotate PDF pages; create PDFs from images; convert PDFs to images; encrypt and decrypt PDFs.
    Downloads: 6 This Week
  • 9

    pdf combiner merger converter splitter

    PDF Combiner is a user-friendly, GUI-based tool built in

    PDF Combiner is a user-friendly, free, open source, GUI-based tool for combining, splitting, and annotating PDF files, converting PDF to Excel, PDF to Word, and images to PDF, and zipping and unzipping files. It is easy to use, supports inserting, deleting, and processing multiple files, and lets you adjust the order of files before combining.
    Downloads: 3 This Week
  • 10

    pdf2txt

    Script to convert PDF files to TXT files

    Two scripts that use calibre and poppler to convert PDF files to TXT (plain text) files. Only use PDF files without spaces in their names.
    Downloads: 0 This Week
  • 11

    node-html-pdf

    HTML to PDF converter that uses phantomjs

    HTML to PDF converter that uses phantomjs. html-pdf can read the header or footer either from the header and footer config object or from the HTML source. You can either set a default header and footer or override it by appending a page number (1-based index) to the id="pageHeader" attribute of an HTML tag. You can use any combination of those tags. The library tries to find any element that contains the pageHeader or pageFooter id prefix. The full options object gets converted to...
    Downloads: 0 This Week
  • 12

    PDF2EpubMaker

    Convert PDF to EPUB using OCR

    Qt application that converts PDF to EPUB format in several steps: convert PDF to PNG with libpoppler; convert PNG to text with libtesseract; remove hyphenation; check spelling.
    Downloads: 0 This Week
  • 13

    Djvu-Spec Pdf 2 Djvu Converter

    Convert PDF to DjVu using profiles with all options of pdf2djvu

    DjVu is a good format for distributing documents and books. DjVu needs no fonts and supports a text layer and an outline. With Djvu-Spec Pdf2Djvu Converter you can easily convert PDF to DjVu. A portable version is available under the "Files" link.
    Downloads: 38 This Week
  • 14
    Downloads: 0 This Week
  • 15

    PDF Version Converter

    Convert PDF versions, so old software can still be used.

    This is a GUI for calling Ghostscript to change PDF versions. If you have older software that needs PDF files in, say, 1.4 format, but your PDF file is 1.6, this is your answer. Select your file, choose a version, and convert it. Requires GNOME 2.22.3 or better*, GTK 2.0, and of course Ghostscript. * I haven't tested on newer versions of GNOME.
    Downloads: 0 This Week
  • 16
    An images-to-PDF converter.
    Downloads: 0 This Week
  • 17
    A simple PDF splitter that uses PDFSharp. You can split, merge, create, or convert PDF files to text. Passing metadata to newly created chunks is possible. Naming options, such as adding a date or an index number, are available.
    Downloads: 0 This Week
  • 18
    Converter from FB2 to PDF format. Useful for ebook readers with bad or missing FB2 support.
    Downloads: 3 This Week
  • 19
    HTML/XML/CSS to PDF converter using the reportlab toolkit and pyPDF. Many features, easy to handle and to extend.
    Downloads: 1 This Week
  • 20
    dompdf - the PHP 5 HTML to PDF converter. dompdf is a (mostly) CSS compliant HTML rendering engine written in PHP. It supports external stylesheets, inline style tags, and the style attributes of individual HTML elements. Requires PHP 5.
    Downloads: 3 This Week
  • 21
    Java-based tool to convert HTML/DHTML to PDF documents.
    Downloads: 0 This Week
  • 22
    This is a tool to convert PDF files to HTML/text files and extract images.
    Downloads: 0 This Week