Showing 74 open source projects for "pdf data mining"

  • 1
    Stirling-PDF

    Web application that allows you to perform operations on PDF files

    Stirling PDF is a powerful, locally hosted web-based PDF manipulation tool offering a wide range of editing, conversion, and utility features. It allows users to merge, split, compress, convert, OCR, and perform other operations on PDF files directly from a browser without uploading data to third-party servers. The tool is privacy-conscious, self-hostable via Docker, and built with modularity in mind to allow future expansion and integration.
    Downloads: 28 This Week
  • 2
    PDF.js

    A PDF Reader in JavaScript

    PDF.js is a web standards-based platform for parsing and rendering Portable Document Format (PDF) files. Open source and built with HTML5, this PDF viewer is supported by a great community and Mozilla Labs. PDF.js can be used on both modern and older browsers, and is built into version 19+ of Firefox.
    Downloads: 81 This Week
  • 3
    Dompdf

    HTML to PDF converter for PHP

    dompdf is an HTML to PDF converter. At its heart, dompdf is (mostly) a CSS 2.1 compliant HTML layout and rendering engine written in PHP. It is a style-driven renderer: it will download and read external stylesheets, inline style tags, and the style attributes of individual HTML elements. It also supports most presentational HTML attributes. PDF rendering is currently provided either by PDFLib or by a bundled version of the R&OS CPDF class written by Wayne Munro. (Some important changes have...
    Downloads: 119 This Week
  • 4
    iLovePDF Api

    iLovePDF Rest Api - PHP Library

    Develop and automate PDF processing tasks such as Compress PDF, Merge PDF, Split PDF, Office to PDF, PDF to JPG, Images to PDF, Page Numbers, Rotate PDF, Unlock PDF, Watermark, and Repair PDF, each with several settings to get your desired results. Strong infrastructure offers dedicated processing power. You might know us from ilovepdf.com where we process millions of PDFs daily. We offer a simple and concise API Reference and Guide as well as...
    Downloads: 14 This Week
  • 5
    TeXworks

    A simple interface for working with TeX documents

    TeXworks is a free and simple working environment for authoring TeX (LaTeX, ConTeXt and XeTeX) documents. Inspired by Dick Koch's award-winning TeXShop program for Mac OS X, it makes entry into the TeX world easier for those using desktop operating systems other than OS X. It provides an integrated, easy-to-use environment for users on other platforms, particularly GNU/Linux and Windows, and features a clean, simple interface accessible to casual and non-technical users.
    Downloads: 77 This Week
  • 6
    PDF4QT

    Open source PDF editor

    PDF4QT is an open source PDF editor based on the Qt framework. It contains a C++ library, applications for viewing/editing PDF documents, and a command line tool. It runs on Windows and Linux and is a modern solution for viewing/editing/rendering PDF documents, for users and developers alike. For developers, there is a C++ library and a command line tool for use in scripts. For users, there are four applications offering many features. The project is hosted on GitHub and...
    Downloads: 33 This Week
  • 7
    QuestPDF

    A library that can help you with generating PDF documents

    Quickly design and generate PDF documents with an open-source, modern, and battle-tested C# library. Forget about limitations, feel confident, enjoy your task and efficiently deliver professional products. QuestPDF is a progressive library that can help you with generating PDF documents in your .NET application by offering a friendly, discoverable and predictable C# fluent API. Do you believe that creating a complete invoice document can take less than 200 lines of code? We have prepared for...
    Downloads: 4 This Week
  • 8
    borb

    borb is a library for reading, creating and manipulating PDF files

    borb is a pure Python library for reading, creating, and manipulating PDF documents. It represents a PDF document as a JSON-like data structure of nested lists, dictionaries, and primitives (numbers, strings, booleans, etc.). This is currently a one-man project, so the focus will always be on supporting the more common use cases over the rare ones.
    Downloads: 8 This Week
  • 9
    Vanilla.PDF

    Cross-platform SDK for creating and modifying PDF documents

    Vanilla.PDF is a modern, high-performance, open-source C++17 SDK designed for creating, editing, signing, and analyzing PDF documents across multiple platforms. It requires no external runtime dependencies, making it lightweight and ideal for embedding into desktop applications, servers, or automation pipelines. The SDK offers full cross-platform support including Windows, Linux, macOS, and Android, with builds available for major compilers and architectures. Vanilla.PDF supports advanced...
    Downloads: 3 This Week
  • 10
    PDFPatcher

    A versatile toolkit for PDF manipulation

    PDFPatcher (aka “PDF补丁丁”) is a versatile toolkit for PDF manipulation—editing document metadata, bookmarks, page layout, content restrictions, rotation, compression, merging/splitting, image extraction, and more, all within an intuitive interface. Merge/split PDFs or images, preserve or add bookmarks, and set page dimensions. Batch style/color/target changes, regex/XPath search/replace, mid‑page positioning. Modify PDF metadata, page numbers, links, initial view mode, and remove open actions.
    Downloads: 36 This Week
  • 11
    MinerU

    A high-quality tool for converting PDF to Markdown and JSON

    MinerU is an open-source, high-quality document extraction toolkit focused on converting PDFs (and other document formats) into structured Markdown and JSON. It leverages OCR and layout analysis to preserve semantic structure and metadata, ideal for research and data science workflows.
    Downloads: 9 This Week
  • 12
    DeckTape

    PDF exporter for HTML presentations

    DeckTape is a high-quality PDF exporter for HTML presentation frameworks. It is built on top of Puppeteer, which relies on Google Chrome for laying out and rendering Web pages and provides a headless Chrome instance scriptable with a JavaScript API. DeckTape supports several presentation frameworks out of the box. It also provides a generic command that works by emulating the end-user interaction, allowing it to be used to convert presentations from virtually any...
    Downloads: 9 This Week
  • 13
    Career-Ops

    AI-powered job search system built on Claude Code

    Career Ops is an open-source platform designed to help individuals manage their job search process with a structured, operations-style approach that treats career development like a pipeline. It provides a system for organizing job applications, tracking progress across different stages, and maintaining visibility into opportunities, much like a lightweight CRM tailored for job seekers. The project emphasizes clarity and accountability, enabling users to monitor applications, follow-ups, and...
    Downloads: 4 This Week
  • 14
    PDF Split and Merge

    Split and merge PDF files on any platform

    Split and merge PDF files with PDFsam, an easy-to-use desktop tool with graphical, command line, and web interfaces.
    Downloads: 295 This Week
  • 15
    Sprint PDF Editor (Smarter PDF Solution)

    Edit, Convert, Extract, Export, Secure, and PDF Imposition.

    Sprint PDF Editor®: a productive, modern, innovative, clean and colourful GUI with 50+ functions for faster, smarter, seamless workflows. A complete PDF editor and reader: supercharge your workflows with imposition, extract, compress, watermark, protect and secure, split and merge, crop pages, printing, stamp and more. Your privacy, our priority: protect your data with complete confidence.
    Downloads: 13 This Week
  • 16
    Skim

    A PDF Reader and Note-taker for OS X

    Skim is a PDF reader and note-taker for OS X. It is designed to help you read and annotate scientific papers in PDF, but is also great for viewing any PDF file. Skim requires OS X 10.10 or higher.
    Downloads: 10,991 This Week
  • 17
    Tarjamento de Dados Pessoais e Sigilosos

    Tool for redacting personal and confidential data

    TarjaPDF v2.0 Beta: a tool for redacting personal and confidential data. Protect sensitive data in PDFs with irreversible security. Modern interface with dark mode, manual marking (text, line, and free-form area), and automatic detection of CPF and RG numbers, e-mail addresses, phone numbers, proper names, and addresses. Intelligent scanning with predictive analysis highlights personal data for review before redaction. Name detection via heuristics and an official database, with a customizable dictionary. Report of...
    Downloads: 28 This Week
  • 18
    Super PDF Editor (a Batch PDF Processor)

    Create, Edit, Delete, Organize, Convert, Export, Secure & Sign PDF.

    Super PDF Editor - Powerful, superfast, lightweight PDF processor. All-in-one PDF solution, PDF editing with 80+ tools and functions. The easy-to-use software is complete with editing tools for modifying PDF files your way. Most comprehensive, powerful, process-based and lightning-fast batch processor software. OCR PDF. PDF Imposition, Reverse Pages, Resize Page, Scale Page, Booklet, N-up Pages, Merge, Split by page, Extract Page, Rotate Page. ...
    Downloads: 7 This Week
  • 19
    fillable-pdf-forms
    A simple, practical tool for creating and working with fillable PDF forms—making it easy to generate, edit, and manage form fields without relying on proprietary software.
    Downloads: 1 This Week
  • 20
    PdfBooklet
    PdfBooklet is a Python Gtk application that lets you make books or booklets from existing PDF files. It can also adjust margins, rotate, scale, merge files, or extract pages.
    Downloads: 215 This Week
  • 21
    PDFTK Builder Enhanced

    Enhanced version of the PDFTK Builder GUI for PDF Toolkit on Windows

    Free and open source GUI application for manipulating PDF files using the Windows version of PDF Toolkit (PDFtk) - split, merge, stamp, number pages, rotate, metadata, bookmarks, attachments, etc. This project is a fork of PDFTK Builder by Angus Johnson that enhances the user interface, adds functions, and enables use of later versions of PDFtk. OS: Windows. Author: David King. License: GPLv3.
    Downloads: 168 This Week
  • 22
    Gerber2PDF

    Gerber to PDF converter

    Gerber2PDF is a command-line tool to convert Gerber files to PDF for proofing and hobbyist printing purposes. It converts multiple Gerber files at once, placing each resulting layer on its own page within the PDF. Each layer has a PDF bookmark for easy reference. Layers can optionally be combined onto a single page and rendered with custom colours and transparency. There is a Drill to Gerber converter available from the downloads page.
    Downloads: 14 This Week
  • 23
    toPDF

    Online service for PDF conversion (to PDF)

    A simple online service for PDF conversion. This project is a simple library and also a web application. It offers a REST service and a simple upload service for synchronous conversion. This library/application doesn't contain conversion libraries because it's a wrapper for existing tools. toPDF currently supports the open source tool PDF Creator (http://www.pdfforge.org) and the commercial solution, easy PDF, from BCL (http://www.pdfonline.com/easypdf/sdk/).
    Downloads: 1 This Week
  • 24
    workerPdf

    WorkerPDF is a GUI for Ghostscript, created for PDF conversion

    WorkerPDF uses Ghostscript (https://www.ghostscript.com/) and was created for PDF conversion. Program features: compress PDF documents; combine PDFs; move PDF pages; rotate PDF pages; create PDFs from images; convert PDFs to images; encrypt and decrypt PDFs.
    Downloads: 4 This Week
  • 25
    PII-Blackout

    100% offline, AI-powered PDF redaction

    ...PII Blackout automatically scans, detects, and blacks out sensitive data points across your documents in one click. Absolute, Irreversible Security (Image-Level Blackout): Unlike standard PDF editors that merely place a black shape over editable text (which can easily be copied or uncovered), PII Blackout flattens and bakes the redaction directly into the image surface of the document. The covered data is permanently destroyed and mathematically impossible to recover. 100% Offline & Local Processing
    Downloads: 0 This Week