19 projects for "python pdf scaper" with 2 filters applied:

  • 1
    PyPDF

    A pure-python PDF library capable of splitting, merging, cropping

    pypdf is a pure Python library for working with PDF files, allowing developers to split, merge, rotate, encrypt, and extract content from PDFs. It’s an actively maintained fork of PyPDF2, improving performance, compatibility, and support for modern PDF standards. Suitable for both automation scripts and full-featured applications, pypdf handles PDFs without requiring external dependencies.
    Downloads: 4 This Week
  • 2
    zpdf

    Zero-copy PDF text extraction library written in Zig

    zpdf is a high-performance PDF text extraction library written in Zig that focuses on speed, low overhead, and modern parsing techniques. It leans heavily on memory-mapped file reading and zero-copy patterns where possible, so it can scan large PDFs without repeatedly copying data around in memory. The library supports streaming extraction using efficient arena allocation, making it well suited for workloads that need to process big documents quickly or in batches. It implements multiple PDF...
    Downloads: 2 This Week
  • 3
    Pysheeet

    Python Cheat Sheet

    Pysheeet is a community-driven collection of Python code snippets covering common patterns and tasks like sockets, file I/O, data structures, and more. Each snippet is concise and battle-tested, designed to save coding time and reduce boilerplate. With documentation hosted on Read the Docs and an active GitHub repo, it’s a go-to resource for Python developers.
    Downloads: 1 This Week
  • 4
    PageIndex

    Document Index for Vectorless, Reasoning-based RAG

    PageIndex is an innovative open-source framework that reimagines retrieval-augmented generation (RAG) by eliminating conventional vector similarity search and instead building hierarchical semantic indexes that mirror a document’s natural structure. Rather than chunking text and embedding it into a vector database, PageIndex constructs a tree-structured index — similar to a detailed, AI-enhanced table of contents — that a large language model can traverse to locate the most relevant sections...
    Downloads: 1 This Week
  • 5
    Jupyter Notebook Tools for Sphinx

    Sphinx source parser for Jupyter notebooks

    nbsphinx is a Sphinx extension that provides a source parser for *.ipynb files. Custom Sphinx directives are used to show Jupyter Notebook code cells (and of course their results) in both HTML and LaTeX output. Un-evaluated notebooks – i.e. notebooks without stored output cells – will be automatically executed during the Sphinx build process.
    Downloads: 0 This Week
  • 6
    Libros de Programación en Español

    List of programming books in Spanish for free

    Libros de Programación en Español is a curated list of free programming books in Spanish, organized by topic and technology so learners can find high-quality materials without cost. The README is structured as an index with general programming books, followed by sections for specific languages such as JavaScript, TypeScript, Python, Ruby, Rust, PHP, Haskell, Go, Kotlin, Java, and R. Each entry includes the book title, author, and a link to the official or legal free version (PDF, HTML, eBook, etc.), focusing on resources that are legitimately available. Beyond languages, the list also covers frameworks and libraries (like React and Qwik), tools (such as Git), and databases (SQL), grouping them in separate sections for easier browsing. ...
    Downloads: 1 This Week
  • 7
    Small Python library with assorted utilities such as configuration file parsing (in Python syntax) and HTML and PDF parsing. Used in other projects of mine.
    Downloads: 0 This Week
  • 8
    DRAKON Editor

    A free cross-platform editor for the DRAKON visual language.

    DRAKON is a diagram language developed within the Russian space program. Its primary objective is to present complex software systems in a way that is easy for humans to understand. DRAKON's motto: took a glance, understood at once. DRAKON Editor helps software architects, quality specialists and developers. Architects and quality assurance specialists can express a high-level view of how their product works and use DRAKON to explain the dynamics of a software system. Software engineers can use...
    Downloads: 26 This Week
  • 9
    TensorFlow-ZH

    Chinese version of the official document of TensorFlow

    The tensorflow-zh repository is a Chinese translation of the official TensorFlow documentation, organized to make the core guides, tutorials, and reference material accessible to Chinese speakers. It was initiated shortly after TensorFlow’s open-sourcing, with translation and proofreading contributions from a community of volunteers who aimed to bridge the language barrier for learners in China and other Mandarin communities. The repo mirrors the structure of the original English docs:...
    Downloads: 0 This Week
  • 10
    Python library with utility classes for: - accessing MySQL via - Nevow / form - managing forms and new form fields - building PDF reports with ReportLab
    Downloads: 0 This Week
  • 11
    Python module and command line utility that analyzes XML output from the program pdftohtml in order to extract tables from PDF files. Outputs CSV.
    Downloads: 0 This Week
  • 12
    Epydoc is a tool for generating API documentation for Python modules, based on their docstrings. Epydoc supports two output formats (HTML and PDF), and four markup languages for docstrings (Epytext, Javadoc, ReStructuredText, and plaintext).
    Downloads: 7 This Week
  • 13
    JLink lets users author flowcharts based on ISO 5807 and IBM standards. Developers can use JLink to add flowcharts to applications, serve a flowchart over the web in PDF or PNG, or dynamically create a flowchart with JavaScript, Python or Ruby scripts.
    Downloads: 0 This Week
  • 14
    Python module and program to extract documentation strings from Python functions, classes, and methods and transform them into LaTeX, PDF, HTML, or HTB (wxWidgets help viewer) documents. All file distributions contain compiled help files in PDF and HTB.
    Downloads: 0 This Week
  • 15
    Library for generating PDF files containing payment slips (boletos) for Brazilian online banking.
    Downloads: 0 This Week
  • 16
    reppy is a PDF report generator for databases (MySQL, Postgres, CSV) written in Python. The report definition is based on an XML template, which can be edited with the included program XTRed. It requires the Python library ReportLab for PDF creation.
    Downloads: 0 This Week
  • 17
    Python library and command line tool to generate maps in PDF format and place objects on them.
    Downloads: 0 This Week
  • 18
    OOpyREP is a Python code-generating filter and library. It reads an OpenOffice.org file and creates a Python representation of the document's structure and contents. The generated code uses the ReportLab PDF library to render the document.
    Downloads: 0 This Week
  • 19
    POST (Python Obviously Simple Text) provides support for simple, flexible dynamic document generation in multiple output formats. It supports input in text or XML and output in HTML, PDF, RTF, LaTeX source, nroff source, PostScript, and plain text.
    Downloads: 0 This Week