Showing 13 open source projects for "python pdf scaper"

View related business solutions
  • 99.99% Uptime for MySQL and PostgreSQL on Google Cloud Icon
    99.99% Uptime for MySQL and PostgreSQL on Google Cloud

    Enterprise Plus edition delivers sub-second maintenance downtime and 2x read/write performance. Built for critical apps.

    Cloud SQL Enterprise Plus gives you a 99.99% availability SLA with near-zero downtime maintenance—typically under 10 seconds. Get 2x better read/write performance, intelligent data caching, and 35 days of point-in-time recovery. Supports MySQL, PostgreSQL, and SQL Server with built-in vector search for gen AI apps. New customers get $300 in free credit.
    Try Cloud SQL Free
  • Cut Cloud Costs with Google Compute Engine Icon
    Cut Cloud Costs with Google Compute Engine

    Save up to 91% with Spot VMs and get automatic sustained-use discounts. One free VM per month, plus $300 in credits.

    Save on compute costs with Compute Engine. Reduce your batch jobs and workload bill 60-91% with Spot VMs. Compute Engine's committed use offers customers up to 70% savings through sustained use discounts. Plus, you get one free e2-micro VM monthly and $300 credit to start.
    Try Compute Engine
  • 1
    py-pdf-parser

    py-pdf-parser

    A Python tool to help extracting information from structured PDFs

    py-pdf-parser is a Python tool designed to help extract information from structured PDFs. It provides a simple interface to define parsing rules and extract data from PDF documents. ​
    Downloads: 3 This Week
    Last Update:
    See Project
  • 2
    Nano PDF Editor

    Nano PDF Editor

    Edit PDF files with Nano Banana

    Nano PDF Editor is a minimalist, portable PDF viewer and toolkit that focuses on simplicity, speed, and ease of integration for applications that need basic PDF rendering without heavy dependencies. It provides core functionality such as page navigation, zooming, text selection, and rendering directly to native graphics surfaces, making it suitable for lightweight PDF viewing scenarios on desktop or embedded platforms. Designed to be easily embedded into larger software projects, Nano-PDF...
    Downloads: 15 This Week
    Last Update:
    See Project
  • 3
    pdfly

    pdfly

    CLI tool to extract (meta)data from PDF and manipulate PDF files

    A Python library designed for manipulating PDF files with functionalities for extraction, transformation, and document generation.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 4
    OCRmyPDF

    OCRmyPDF

    OCRmyPDF adds an OCR text layer to scanned PDF files

    OCRmyPDF adds an optical character recognition (OCR) text layer to scanned PDF files, allowing them to be searched. PDF is the best format for storing and exchanging scanned documents. Unfortunately, PDFs can be difficult to modify. OCRmyPDF makes it easy to apply image processing and OCR (recognized, searchable text) to existing PDFs.
    Downloads: 122 This Week
    Last Update:
    See Project
  • Cut Data Warehouse Costs up to 54% with BigQuery Icon
    Cut Data Warehouse Costs up to 54% with BigQuery

    Migrate from Snowflake, Databricks, or Redshift with free migration tools. Exabyte scale without the Exabyte price.

    BigQuery delivers up to 54% lower TCO than cloud alternatives. Migrate from legacy or competing warehouses using free BigQuery Migration Service with automated SQL translation. Get serverless scale with no infrastructure to manage, compressed storage, and flexible pricing—pay per query or commit for deeper discounts. New customers get $300 in free credit.
    Try BigQuery Free
  • 5
    Unredact

    Unredact

    A simple tool for reading in poorly redacted documents

    Unredact is a specialized tool that attempts to reconstruct redacted or obscured text in images, PDFs, or screenshots using a combination of image processing and generative AI inference to suggest plausible completions of blurred, black-boxed, or jumbled content. Unlike traditional optical character recognition (OCR), which only reads visible text, Unredact focuses on inferring missing content where redaction has been applied by analyzing surrounding context, font characteristics, and...
    Downloads: 38 This Week
    Last Update:
    See Project
  • 6
    PdfBooklet
    PdfBooklet is a Python Gtk application which allows to make books or booklets from existing pdf files. It can also adjust margins, rotate, scale, merge files or extract pages.
    Leader badge
    Downloads: 199 This Week
    Last Update:
    See Project
  • 7

    realwatermark

    A Python application to add watermarks (text or image) to PDF files

    A Python application to add watermarks (text or image) to PDF files, converts them into image and back to PDF with options for OCR and compression.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    HornPenguin Booklet

    HornPenguin Booklet

    Booklet, Signature generator, Imposition

    HornPenguin Booklet is a simple software that generates booklet and signature for bookbinding from your pdf files. You can print your own book signatures and simple pamplet with your home printer. Support diffence signature size from 4 to 32. Change page size during generating signature. Left riffling direction is supported for old asian bookbinding. Imposition routines for rearranged manuscripts
    Downloads: 32 This Week
    Last Update:
    See Project
  • 9
    pdf-editor

    pdf-editor

    Edit your PDFs without needing a subscription or creating accounts

    Edit your PDFs without needing a subscription or creating accounts. Add a GUI/Turn it into a web application. Add a parser for the command line to do multiple commands at once e.g. merge (cut pdf1) pdf2. Tested working with Python 3.8.5. Install venv (py -3.8 -m pip install virtualenv). PDF and Word documents are binary files, which makes them much more complex than plaintext files. In addition to text, they store lots of font, color, and layout information. If you want your programs to read or write to PDFs or Word documents, you’ll need to do more than simply pass their filenames to open().
    Downloads: 0 This Week
    Last Update:
    See Project
  • Deploy Apps in Seconds with Cloud Run Icon
    Deploy Apps in Seconds with Cloud Run

    Host and run your applications without the need to manage infrastructure. Scales up from and down to zero automatically.

    Cloud Run is the fastest way to deploy containerized apps. Push your code in Go, Python, Node.js, Java, or any language and Cloud Run builds and deploys it automatically. Get fast autoscaling, pay only when your code runs, and skip the infrastructure headaches. Two million requests free per month. And new customers get $300 in free credit.
    Try Cloud Run Free
  • 10
    Tile Pattern Exporter

    Tile Pattern Exporter

    Tile large format PNG patterns into print-at-home PDF pages

    You can tile large format PNG patterns into print-at-home PDF pages. Created for LearnMYOG. This set of scripts automates the tiling of large format PNG files into letter(A4), tabloid(A3), and A0 sized PDF pages with print margins, alignment and cut guides, page numbers, and a copyright stamp to each page. For best results, input an exported PNG with size in multiples of 7.5 inches wide and 10 inches tall @ 300dpi.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11

    JosePythonApps

    Here are my python scripts written until now

    Here are my python scripts. They are humble but easy to use and, may be you'll find them useful.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    PDF Merge and Edit

    PDF Merge and Edit

    Python script to merge and edit sensitive PDF files

    Python script to merge and edit sensitive PDF files you don't want to upload to random sites you find on Google. Merge PDFs by adding one to another. Update a single page in a PDF (good for adding a signed page to a form) Insert a page into an existing PDF. Delete a page. Click on one of the buttons and a new window will pop up depending on the function.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    PDF-Shuffler
    PDF-Shuffler is a small python-gtk application, which helps the user to merge or split pdf documents and rotate, crop and rearrange their pages using an interactive and intuitive graphical interface. It is a frontend for python-pyPdf.
    Downloads: 57 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
MongoDB Logo MongoDB
Gen AI apps are built with MongoDB Atlas
Atlas offers built-in vector search and global availability across 125+ regions. Start building AI apps faster, all in one place.
Try Free →