23 projects for "python pdf scaper" with 2 filters applied:

  • $300 in Free Credit for Your Google Cloud Projects Icon
    $300 in Free Credit for Your Google Cloud Projects

    Build, test, and explore on Google Cloud with $300 in free credit. No hidden charges. No surprise bills.

    Launch your next project with $300 in free Google Cloud credit—no hidden charges. Test, build, and deploy without risk. Use your credit across the Google Cloud platform to find what works best for your needs. After your credits are used, continue building with free monthly usage products. Only pay when you're ready to scale. Sign up in minutes and start exploring.
    Start Free Trial
  • Ship AI Apps Faster with Vertex AI Icon
    Ship AI Apps Faster with Vertex AI

    Go from idea to deployed AI app without managing infrastructure. Vertex AI offers one platform for the entire AI development lifecycle.

    Ship AI apps and features faster with Vertex AI—your end-to-end AI platform. Access Gemini 3 and 200+ foundation models, fine-tune for your needs, and deploy with enterprise-grade MLOps. Build chatbots, agents, or custom models. New customers get $300 in free credit.
    Try Vertex AI Free
  • 1
    PyPDF

    PyPDF

    A pure-python PDF library capable of splitting, merging, cropping

    pypdf is a pure Python library for working with PDF files, allowing developers to split, merge, rotate, encrypt, and extract content from PDFs. It’s an actively maintained fork of PyPDF2, improving performance, compatibility, and support for modern PDF standards. Suitable for both automation scripts and full-featured applications, pypdf handles PDFs without requiring external dependencies.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 2
    zpdf

    zpdf

    Zero-copy PDF text extraction library written in Zig

    zpdf is a high-performance PDF text extraction library written in Zig that focuses on speed, low overhead, and modern parsing techniques. It leans heavily on memory-mapped file reading and zero-copy patterns where possible, so it can scan large PDFs without repeatedly copying data around in memory. The library supports streaming extraction using efficient arena allocation, making it well suited for workloads that need to process big documents quickly or in batches. It implements multiple PDF...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 3
    Pysheeet

    Pysheeet

    Python Cheat Sheet

    Pysheeet is a community-driven collection of Python code snippets covering common patterns and tasks like sockets, file I/O, data structures, and more. Each snippet is concise and battle-tested, designed to save coding time and reduce boilerplate. With documentation hosted on Read the Docs and an active GitHub repo, it’s a go-to resource for Python developers.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 4
    PageIndex

    PageIndex

    Document Index for Vectorless, Reasoning-based RAG

    PageIndex is an innovative open-source framework that reimagines retrieval-augmented generation (RAG) by eliminating conventional vector similarity search and instead building hierarchical semantic indexes that mirror a document’s natural structure. Rather than chunking text and embedding it into a vector database, PageIndex constructs a tree-structured index — similar to a detailed, AI-enhanced table of contents — that a large language model can traverse to locate the most relevant sections...
    Downloads: 1 This Week
    Last Update:
    See Project
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • 5
    Jupyter Notebook Tools for Sphinx

    Jupyter Notebook Tools for Sphinx

    Sphinx source parser for Jupyter notebooks

    nbsphinx is a Sphinx extension that provides a source parser for *.ipynb files. Custom Sphinx directives are used to show Jupyter Notebook code cells (and of course their results) in both HTML and LaTeX output. Un-evaluated notebooks – i.e. notebooks without stored output cells – will be automatically executed during the Sphinx build process.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    Libros de Programación en Español

    Libros de Programación en Español

    List of programming books in Spanish for free

    Libros de Programación en Español is a curated list of free programming books in Spanish, organized by topic and technology so learners can find high-quality materials without cost. The README is structured as an index with general programming books, followed by sections for specific languages such as JavaScript, TypeScript, Python, Ruby, Rust, PHP, Haskell, Go, Kotlin, Java, and R.Each entry includes the book title, author, and a link to the official or legal free version (PDF, HTML, eBook, etc.), focusing on resources that are legitimately available. Beyond languages, the list also covers frameworks and libraries (like React and Qwik), tools (such as Git), and databases (SQL), grouping them in separate sections for easier browsing. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 7
    Small Python library with various things such as Configuration file parsing (in Python syntax), HTML and PDF parsing. Used in others of my projects.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    DRAKON Editor

    DRAKON Editor

    A free cross-platform editor for the DRAKON visual language.

    DRAKON is a diagram language developed within the Russian space program. Its primary objective is presenting complex software systems in a way which is easy to understand by humans. DRAKON's motto: took a glance - understood at once. DRAKON Editor helps software architects, quality specialists and developers. Architects and quality assurers can express a high-level view of how their product works. DRAKON serves them to explain the dynamics of a software system. Software engineers can use...
    Downloads: 25 This Week
    Last Update:
    See Project
  • 9
    TensorFlow-ZH

    TensorFlow-ZH

    Chinese version of the official document of TensorFlow

    The tensorflow-zh repository is a Chinese translation of the official TensorFlow documentation, organized to make the core guides, tutorials, and reference material accessible to Chinese speakers. It was initiated shortly after TensorFlow’s open-sourcing, with translation and proofreading contributions from a community of volunteers who aimed to bridge the language barrier for learners in China and other Mandarin communities. The repo mirrors the structure of the original English docs:...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Cut Cloud Costs with Google Compute Engine Icon
    Cut Cloud Costs with Google Compute Engine

    Save up to 91% with Spot VMs and get automatic sustained-use discounts. One free VM per month, plus $300 in credits.

    Save on compute costs with Compute Engine. Reduce your batch jobs and workload bill 60-91% with Spot VMs. Compute Engine's committed use offers customers up to 70% savings through sustained use discounts. Plus, you get one free e2-micro VM monthly and $300 credit to start.
    Try Compute Engine
  • 10
    matplotlib
    Matplotlib is a python library for making publication quality plots using a syntax familiar to MATLAB users. Matplotlib uses numpy for numerics. Output formats include PDF, Postscript, SVG, and PNG, as well as screen display. As of matplotlib version 1.5, we are no longer making file releases available on SourceForge. Please visit http://matplotlib.org/users/installing.html for help obtaining matplotlib.
    Leader badge
    Downloads: 50 This Week
    Last Update:
    See Project
  • 11
    venom - shellcode generator

    venom - shellcode generator

    msfvenom shellcode generator/compiler/listenner

    The script will use msfvenom (metasploit) to generate shellcode in diferent formats ( c | python | ruby | dll | msi | hta-psh ), injects the shellcode generated into one funtion (example: python) "the python funtion will execute the shellcode in ram" and uses compilers like: gcc (gnu cross compiler) or mingw32 or pyinstaller to build the executable file, also starts a multi-handler to recibe the remote connection (reverse shell or meterpreter session). -- 'shellcode generator' tool...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12

    PyLogAnalyser

    A Python multiplatform tool to filter, colorise and analyse logs

    PyLogAnalyzer is a tool that receives an input log in black and white, a configuration INI file, which contains the list of rules to process the input, and an output file where to save the results. These rules permit to detect an input line according to a regular expression (regex) or line number range, filter it, pass it, colorise in foreground and background, columnise the groups of the regex and enable or disable the rule. The final goal of this tool is to ameliorate reading long and...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    python library with utility classes for: - access mysql via - nevow / form - mangaing form and new field for form - building pdf report with reportlab
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    Reporting engine library written in C. Create one XML file and generate PDF, HTML, TXT, and CSV reports based on queries. Has support for MySQL, PostgreSQL, ODBC. Bindings for PHP, Java, Python.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    Python module and command line utility that analyzes XML output from the program pdftohtml in order to extract tables from PDF files. Outputs CSV.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    Epydoc is a tool for generating API documentation for Python modules, based on their docstrings. Epydoc supports two output formats (HTML and PDF), and four markup languages for docstrings (Epytext, Javadoc, ReStructuredText, and plaintext).
    Downloads: 10 This Week
    Last Update:
    See Project
  • 17
    JLink lets users author flow charts based on ISO 5807 and IBM standards. Developers can use JLink to add flowcharts to applications, serve a flow chart over the web in PDF or PNG, or dynamically create a flowchart with Javascript, Python or Ruby scripts
    Leader badge
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    Python module and program to extract documentation strings from python functions, classes, and methods and transform them into LaTeX, PDF, HTML, or HTB (wxWidgets help viewer) documents. All file distributions contain compiled help files in PDF and HTB.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    Library to generate pdf archive contend billet for the net bank Brazilian.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    reppy is a PDF-Report Generator for databases (MySQL, Postgres, CSV) written in Python. The report definition is based on an XML-template, which can be edited with the included program XTRed. It needs the python library reportlab for pdf-creation.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    Python library and command line tool to generate maps in PDF format an place objects on them.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    OOpyREP is a python code generating filter and library. It reads a OpenOffice.org file and creates a python representation of the document structure as well as contents. The generated code uses the reportlab PDF library to render the document.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    POST (Python Obviously Simple Text) provides support for simple, flexible dynamic document generation in multiple output formats. Supports inputs in text or XML, outputs in HTML, PDF, RTF, LaTeX source, nroff source, postscript, and plain text.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
MongoDB Logo MongoDB
Gen AI apps are built with MongoDB Atlas
Atlas offers built-in vector search and global availability across 125+ regions. Start building AI apps faster, all in one place.
Try Free →