Showing 77 open source projects for "pdf data mining"

  • 1

    PoDoFo

    A PDF parsing, modification and creation library.

    The PoDoFo library is a free, portable C++ library. It can parse and modify existing PDF files and create new ones from scratch. It also includes several tools to work with PDF files. It features a unique approach that provides access to PDF documents via an object tree, so PDFs can be created or manipulated using a simple tree structure. Development of PoDoFo has been moved to GitHub: https://github.com/podofo/podofo Please raise new issues in the GitHub project.
    Downloads: 115 This Week
  • 2

    Tile Pattern Exporter

    Tile large format PNG patterns into print-at-home PDF pages

    You can tile large format PNG patterns into print-at-home PDF pages. Created for LearnMYOG. This set of scripts automates the tiling of large format PNG files into letter (A4), tabloid (A3), and A0 sized PDF pages, adding print margins, alignment and cut guides, page numbers, and a copyright stamp to each page. For best results, input an exported PNG whose size is a multiple of 7.5 inches wide and 10 inches tall at 300 dpi.
    Downloads: 0 This Week
  • 3

    JosePythonApps

    Here are the Python scripts I have written so far

    Here are my Python scripts. They are humble but easy to use, and maybe you'll find them useful.
    Downloads: 0 This Week
  • 4

    MagicalPdfEditor

    This is a small PDF editor based on OpenPdf core and AndroidPdfViewer

    This is a small PDF editor based on OpenPDF Core and AndroidPdfViewer. As there are not many easy-to-use open-source PDF editors and PDF wizards, I decided to create a simple directory to resolve my issues. Here I have worked on two separate cores, added some functionality to them, and combined them to achieve my goal. I am working on this repo, and any help will be appreciated. Just clone the project and trace the source code; it's really easy and clear. All functionality in...
    Downloads: 0 This Week
  • 5
    Free editor for PDF documents. Complete editing of PDF documents is possible with PDFedit. You can change raw PDF objects (for advanced users) or use the many GUI functions. Functionality can be easily extended using a scripting language (ECMAScript).
    Downloads: 138 This Week
  • 6
    PHP Pdf creation - R&OS
    MOVED TO GITHUB https://github.com/ole1986/pdf-php
    Downloads: 3 This Week
  • 7

    Prawn

    Fast, Nimble PDF Writer for Ruby

    Prawn is a pure Ruby PDF generation library that provides a lot of great functionality while trying to remain simple and reasonably performant. Extensive text rendering support, including flowing text and limited inline formatting options. Comprehensive internationalization features, including full support for UTF-8 based fonts, right-to-left text rendering, fallback font support, and extension points for customizable text wrapping. Support for PDF outlines for document navigation. Low level...
    Downloads: 2 This Week
  • 8

    Encrypt PDF

    Add password to PDF file

    Adds a password to a PDF file. Requires .NET 3.5 or above. Limitation: one PDF at a time.
    Downloads: 0 This Week
  • 9

    PyPDFConvert

    PyPDFConverter is a program that converts Word files and images to PDF

    Downloads: 1 This Week
  • 10

    PDF Guru

    Merge images and PDFs to a single PDF

    PDF Guru is a simple-to-use program for merging multiple images and PDF files into a single compact PDF file. It can select specific PDF pages or page ranges, which gives you more control over the output file. It produces compact, smaller files on any operating system. Its features make it a great, must-have tool for everyone.
    Downloads: 11 This Week
  • 11
    PdfJumbler
    A simple tool to rearrange, merge, and delete pages from PDF files. The modular backend system uses either JPedal or JPod to display PDFs and iText or Apache PDFBox to save them. Development of this project has moved to GitHub. Please check https://github.com/mgropp/pdfjumbler for current releases!
    Downloads: 12 This Week
  • 12

    wkhtmltopdf

    Convert HTML to PDF using Webkit (QtWebKit)

    wkhtmltopdf and wkhtmltoimage are open source (LGPLv3) command line tools to render HTML into PDF and various image formats using the Qt WebKit rendering engine. These run entirely "headless" and do not require a display or display service. There is also a C library, if you're into that kind of thing. The file pdf.h contains a fairly high level and stable pure C binding to wkhtmltopdf. These bindings are well documented and do not depend on Qt. Using this is the recommended way of interfacing...
    Downloads: 51 This Week
  • 13

    pdforganiser

    Manage your collection of PDF files

    PDF Organiser gathers PDF files distributed throughout your system into one place where you can organise them into folders of your choosing. Its only dependencies are GTK and json-glib, so it is very lightweight and fast, with a binary size of less than 45 KB.
    Downloads: 1 This Week
  • 14

    PDF-Unlock

    Small tool that utilizes GhostScript to unlock a protected PDF

    Small tool that utilizes Ghostscript to unlock a protected PDF. Source code: https://github.com/Go2Engle/PDF-Unlock For this application to function, you will need to install the 64-bit version of Ghostscript. You can download and install Ghostscript from the link below. https://www.ghostscript.com/download/gsdnld.html
    Downloads: 1 This Week
  • 15

    pyPRN2PDF

    Convert files with output for dot-matrix printer into PDF

    Converts output from old DOS programs, intended to be printed on an ESC/P2 dot-matrix printer, into PDF. It emulates all the print codes I needed for document conversion, with some nice additional features.
    Downloads: 0 This Week
  • 16

    Merge PDF Files

    It is a Windows library that merges standard PDFs into a final PDF

    The library is intended for developers, for inclusion in desktop applications or server services. There are lots of SDKs on the market for creating (merging) PDFs, and almost all of them have limitations. Our Windows library (MergePDFByNMI.dll) only merges standard PDF files (there are several PDF formats). You can send the input PDFs (by file name or as a byte array) and receive the final PDF (saved to a file or returned as a byte array). The library calls can be synchronous...
    Downloads: 0 This Week
  • 17

    jpeg2pdf

    Create PDF from JPEG scans and photos

    Cross-platform command-line tool for creation of PDF documents from scans/photos of pages in JPEG (.jpg) format and the lightest weight ANSI C library to put multiple JPEG files into one PDF file. You can add handwritten comments to PDF scans (over original images) with xournal: http://xournal.sourceforge.net/ It supports graphics tablets and saves comments to PDFs as vector data.
    Downloads: 29 This Week
  • 18

    TCPDF - PHP class for PDF

    PHP class for PDF

    TCPDF is a PHP class for generating PDF documents without requiring external extensions. TCPDF supports UTF-8, Unicode, RTL languages, XHTML, JavaScript, digital signatures, barcodes and much more. IMPORTANT: This version will soon be marked as deprecated and replaced by a new version currently under development: https://github.com/tecnickcom/tc-lib-pdf
    Downloads: 149 This Week
  • 19

    PDF Merge and Edit

    Python script to merge and edit sensitive PDF files

    Python script to merge and edit sensitive PDF files you don't want to upload to random sites you find on Google. Merge PDFs by appending one to another. Update a single page in a PDF (good for adding a signed page to a form). Insert a page into an existing PDF. Delete a page. Click one of the buttons and a new window will pop up depending on the function. Pick your files and enter the data.
    Downloads: 1 This Week
  • 20

    PDF API HTML5 Web Apps

    Mini SDK JavaScript API library PDF web apps

    A compact library designed for modern web applications that quickly exports your HTML content to PDF, thanks to the well-known JavaScript library jsPDF. Special thanks also go to the canvg and html2canvas projects. Project documentation: http://ulmdevice.altervista.org/pdfapihtml5/#documentation A service for Angular 7+ is also available: http://ulmdevice.altervista.org/pdfjsapi/ Mobile applications: http://bit.ly/1MrlgKk Opera add-on: http://bit.ly/1kkMhTa
    Downloads: 1 This Week
  • 21

    PDFManager

    Application to modify pdf files.

    Java application to modify PDF files. It may be used to merge two or more PDF files, delete pages, change page order, etc. A Windows installer and the source code are available for download. Version 2.0: you may create PDF files from *.txt files; you may create PDF files from pictures; you may choose the order of the pages when merging files; new interface; available for Linux.
    Downloads: 0 This Week
  • 22

    TEAPDF2Text

    PDF-to-text class for parsing law referentials

    PDF-to-text class for parsing law referentials, created for two-column law texts (ticket #1 in the jsph.fr project).
    Downloads: 0 This Week
  • 23

    pdf-bot

    A Node queue API for generating PDFs using headless Chrome

    pdf-bot is a Node.js microservice designed to automate the generation of PDF documents from web pages using headless Chrome. The project provides a queue-based API that allows developers to submit URLs for PDF generation, which are then processed asynchronously by the service. Once a document is generated, the system can notify external applications through webhooks, enabling integration with other backend systems or automation pipelines. The service is particularly useful for generating...
    Downloads: 0 This Week
  • 24
    PDF-Shuffler
    PDF-Shuffler is a small python-gtk application that helps the user merge or split PDF documents and rotate, crop, and rearrange their pages using an interactive and intuitive graphical interface. It is a frontend for python-pyPdf.
    Downloads: 34 This Week
  • 25

    iText®, a JAVA PDF library

    PDF Library for Developers

    iText is an open-source PDF library available for Java and .NET (C#). iText allows you to effortlessly generate and manipulate standards-compliant PDF documents with a powerful and feature-rich SDK. With iText, you can create archivable and accessible PDFs, split and merge documents, fill and flatten forms, digitally sign documents, and more. iText add-ons enable additional functionality, such as PDF creation from HTML templates, secure redaction, OCR, and much more. The latest...
    Downloads: 114 This Week