Showing 511 open source projects for "pdf tool"

  • 1
    Stirling-PDF

    Web application that allows you to perform operations on PDF files

    Stirling PDF is a powerful, locally hosted web-based PDF manipulation tool offering a wide range of editing, conversion, and utility features. It allows users to merge, split, compress, convert, OCR, and perform other operations on PDF files directly from a browser without uploading data to third-party servers. The tool is privacy-conscious, self-hostable via Docker, and built with modularity in mind to allow future expansion and integration.
    Downloads: 29 This Week
  • 2
    Markdown to PDF

    Hackable CLI tool for converting Markdown files to PDF using Node.js

    A simple and hackable CLI tool for converting Markdown to PDF. It uses Marked to convert Markdown to HTML and Puppeteer (headless Chromium) to further convert the HTML to PDF. It also uses highlight.js for code highlighting. The whole source code of this tool is only ~500 lines of TypeScript and ~100 lines of CSS, so it is easy to clone and customize.
    Downloads: 1 This Week
  • 3
    OpenDataLoader PDF

    PDF Parser for AI-ready data. Automate PDF accessibility

    OpenDataLoader PDF is an open-source document processing system designed to convert complex PDF files into structured, AI-ready formats such as Markdown, JSON, and HTML while preserving layout, hierarchy, and semantic meaning. It focuses on enabling downstream use cases like retrieval-augmented generation (RAG), knowledge extraction, and document intelligence pipelines by maintaining accurate reading order and spatial metadata through bounding boxes.
    Downloads: 7 This Week
  • 4
    Malicious PDF Generator

    Generate a bunch of malicious pdf files with phone-home functionality

    Generate ten different malicious PDF files with phone-home functionality. Can be used with Burp Collaborator or Interact.sh. Used for penetration testing and/or red-teaming etc. I created this tool because I needed a third-party tool to generate a bunch of PDF files with various links.
    Downloads: 3 This Week
  • 5
    Zotero PDF Translate

    Translate PDF, EPub, webpage, metadata, annotations, notes

    Zotero PDF Translate is a plugin for Zotero that enhances the research workflow by enabling in-app translation of PDFs, EPUBs, webpages, and associated metadata directly within the Zotero interface. It integrates seamlessly with Zotero’s document reader, allowing users to select text and instantly receive translations in a pop-up or side panel without leaving the application.
    Downloads: 18 This Week
  • 6
    py-pdf-parser

    A Python tool to help extract information from structured PDFs

    py-pdf-parser is a Python tool designed to help extract information from structured PDFs. It provides a simple interface to define parsing rules and extract data from PDF documents.
    Downloads: 0 This Week
  • 7
    PDF4QT

    Open source PDF editor

    PDF4QT is an open-source PDF editor for Windows and Linux, based on the Qt framework. It is a modern solution for viewing, editing, and rendering PDF documents, for users and developers alike. For developers, there is a C++ library and a command-line tool for use in scripts.
    Downloads: 80 This Week
  • 8
    PDFsam

    PDFsam, a desktop application to split, merge, mix, rotate PDF files

    PDFsam Basic is our free and open-source desktop application to split, merge, extract pages, rotate and mix PDF files. PDFsam Visual is a powerful tool to visually compose PDF files, reorder pages, delete pages, split, merge, rotate, encrypt, decrypt, extract text, convert to grayscale, and crop PDF files. PDFsam Basic is written using JavaFX. Since version 4 it is released as a self-contained application that bundles a jlinked JDK, while version 3 requires a Java Runtime Environment 8 with JavaFX installed in order to run.
    Downloads: 195 This Week
  • 9
    MinerU

    A high-quality tool for converting PDF to Markdown and JSON

    MinerU is an open-source, high-quality document extraction toolkit focused on converting PDFs (and other document formats) into structured Markdown and JSON. It leverages OCR and layout analysis to preserve semantic structure and metadata, ideal for research and data science workflows.
    Downloads: 9 This Week
  • 10
    pdfly

    CLI tool to extract (meta)data from PDF and manipulate PDF files

    A Python library designed for manipulating PDF files with functionalities for extraction, transformation, and document generation.
    Downloads: 3 This Week
  • 11
    xhtml2pdf

    A library for converting HTML into PDFs using ReportLab

    xhtml2pdf enables users to generate PDF documents from HTML content easily and with automated flow control such as pagination and keeping text together. The Python module can be used in any Python environment, including Django. The Command line tool is a stand-alone program that can be executed from the command line.
    Downloads: 2 This Week
  • 12
    Pandoc

    The universal markup converter

    Pandoc is a universal document converter that converts files from one markup format into another. With Pandoc, you have a Swiss-army knife of a converter, able to convert practically any markup format into any other. Pandoc contains a Haskell library for conversions as well as a command-line tool that uses this library. It can convert to and from just about anything: lightweight markup formats, HTML formats, documentation formats, ebooks, TeX formats, word processor formats...
    Downloads: 251 This Week
  • 13
    Blackbird

    OSINT tool for finding accounts across 600+ sites by username or email

    ...The tool operates primarily through a command line interface, allowing users to run automated searches and gather results from many platforms in a single process. Blackbird also includes an optional AI-powered profiling feature that analyzes discovered sites to generate behavioral and technical insights about a user’s online presence. Results from searches can be exported in formats such as PDF, CSV, or JSON for documentation or reporting purposes.
    Downloads: 29 This Week
  • 14
    Gotenberg

    A Docker-powered stateless API for PDF files

    Gotenberg provides a developer-friendly API to interact with powerful tools like Chromium and LibreOffice for converting numerous document formats (HTML, Markdown, Word, Excel, etc.) into PDF files, and more! Thanks to Docker, you don't have to install each tool in your environments; drop the Docker image in your stack, and you're good to go! The webhook feature allows you to upload the output file to the destination of your choice. There are many options to fit your requirements, from the custom HTTP headers sent to your webhook to the HTTP method used to call it. ...
    Downloads: 4 This Week
  • 15
    MarkPDFDown

    A high-quality PDF-to-Markdown tool based on large language models

    MarkPDFdown is an open-source document processing tool designed to convert PDF files into structured Markdown output that can be easily used for documentation, content pipelines, and AI processing workflows. The project focuses on extracting text, formatting, and structural information from complex PDF documents and transforming that information into clean Markdown that preserves the original hierarchy of headings, paragraphs, tables, and lists.
    Downloads: 0 This Week
  • 16
    GROBID

    Machine learning software for extracting information

    GROBID is a machine learning library for extracting, parsing, and re-structuring raw documents such as PDF into structured XML/TEI encoded documents, with a particular focus on technical and scientific publications. Development started in 2008 as a hobby. In 2011 the tool was made available as open source. Work on GROBID has been steady as a side project since the beginning and is expected to continue as such. Header extraction and parsing from articles in PDF format. ...
    Downloads: 1 This Week
  • 17
    Toonily Downloader

    A Python tool for downloading manga from Toonily

    ...It uses concurrent downloading techniques to significantly speed up the process and includes robust error handling to recover from interruptions or failed downloads. Additionally, the tool allows users to convert downloaded chapters into high-quality PDF files without re-encoding images, ensuring fidelity to the original content.
    Downloads: 14 This Week
  • 18
    DocuSeal

    Open source DocuSign alternative

    Open source document filling and signing. DocuSeal is an open-source platform that provides secure and efficient digital document signing and processing. Create PDF forms to have them filled and signed online on any device with an easy-to-use, mobile-optimized web tool. Use embeddable code snippets to seamlessly implement the document signing workflows directly on your website or app. Build fillable document forms using our pixel-perfect HTML API, reducing the time for creating personalized documents. ...
    Downloads: 11 This Week
  • 19
    Unredact

    A simple tool for reading in poorly redacted documents

    Unredact is a specialized tool that attempts to reconstruct redacted or obscured text in images, PDFs, or screenshots using a combination of image processing and generative AI inference to suggest plausible completions of blurred, black-boxed, or jumbled content. Unlike traditional optical character recognition (OCR), which only reads visible text, Unredact focuses on inferring missing content where redaction has been applied by analyzing surrounding context, font characteristics, and...
    Downloads: 11 This Week
  • 20
    Stirling-PDF

    #1 Locally hosted web application that allows you to work on PDFs

    This is a robust, locally hosted web-based PDF manipulation tool using Docker. It enables you to carry out various operations on PDF files, including splitting, merging, converting, reorganizing, adding images, rotating, compressing, and more. This locally hosted web application has evolved to encompass a comprehensive set of features, addressing all your PDF requirements.
    Downloads: 64 This Week
  • 21
    PDF-utility

    PDF Utility is a tool designed to efficiently manipulate PDF files

    Digna PDF Utility is a tool designed to efficiently manipulate PDF documents. It offers a range of functions, including adding page numbers, deleting unwanted pages, merging multiple PDFs into a single file, converting PDF to DOCX and vice versa, protecting a PDF file with a password, and displaying PDF content.
    Downloads: 7 This Week
  • 22
    Umi-OCR

    OCR software, free and offline

    Umi-OCR is a free and open-source optical character recognition (OCR) tool designed to provide fast, offline text extraction from images, screenshots, PDFs, and more without requiring a network connection. It includes a highly efficient offline OCR engine with built-in multilingual recognition libraries, so users can extract text across multiple languages with high accuracy directly on their machines. The software supports flexible usage patterns including screenshot capture OCR, batch processing of large sets of images or documents, PDF parsing, QR code detection, and layout-aware paragraph output. ...
    Downloads: 54 This Week
  • 23
    WeebCentral Downloader

    A powerful manga downloader for WeebCentral with both GUI and CLI

    ...The software includes a visually distinctive GUI built with PyQt6, featuring a modern design system and interactive components for managing downloads and viewing manga information. Users can select specific chapters, adjust download speed, and configure output formats such as PDF or CBZ, making it adaptable to different reading preferences. The tool also incorporates progress tracking and background worker threads to ensure a responsive experience during large downloads. Its modular structure separates scraping logic, interface components, and configuration management, making it maintainable and extensible.
    Downloads: 32 This Week
  • 24
    MetaScreener

    AI-powered tool for efficient abstract and PDF screening

    ...The platform can analyze both abstracts and full PDF documents, enabling automated filtering based on research criteria defined by the user. By incorporating natural language processing techniques, the system can identify potentially relevant studies and reduce the workload associated with manual screening.
    Downloads: 0 This Week
  • 25
    Everything cURL

    The book documenting the curl project, the curl tool, libcurl

    Everything curl is an extensive, continuously maintained book that documents the entire curl ecosystem: the curl command-line tool, the libcurl library, the project’s history and development practices, and practical guidance for using and contributing to curl. The project is written as an open source book (CC-BY-4.0) and is available in multiple formats and locations, including an online website, PDF, and ePub so readers can pick the format that suits them. Content ranges from beginner-friendly tutorials and usage examples to deep dives into internals, protocols, bindings, build instructions, and advanced deployment scenarios, making the book useful for both casual users and experienced developers. ...
    Downloads: 4 This Week