Search Results for "python pdf scaper" - Page 13

Showing 334 open source projects for "python pdf scaper"

  • 1
    This product lets you use a server instance of OpenOffice.org2 and gives you a simple way to build macros and manage your documents. It has been used to build complex PDF documents from many MSWord documents with table of contents, using templ
    Downloads: 0 This Week
  • 2
    Python module and program to extract documentation strings from python functions, classes, and methods and transform them into LaTeX, PDF, HTML, or HTB (wxWidgets help viewer) documents. All file distributions contain compiled help files in PDF and HTB.
    Downloads: 0 This Week
  • 3
    Library to generate PDF files containing billets (payment slips) for Brazilian internet banking.
    Downloads: 0 This Week
  • 4
    A simple GUI frontend for scanning documents into PDF format. Uses the scanimage, ps2pdf, pnmflip, and pnmtops commands. Automatically detects scanners available on the system. Developed on Linux but might work on other platforms with some tweaking.
    Downloads: 0 This Week
  • 5
    PDFgroup is a Glade-2 and Python user interface that selects several PDF files and merges them into a single PDF file using pdftk. It should work on GNU/Linux, and possibly on other operating systems where pdftk works.
    Downloads: 0 This Week
  • 6
    A web-based search interface tailored to the New Zealand Gazette PDF archive for the NZ library community. A generic Python-based Swish-e search interface.
    Downloads: 0 This Week
  • 7
    reppy is a PDF report generator for databases (MySQL, Postgres, CSV) written in Python. The report definition is based on an XML template, which can be edited with the included program XTRed. It requires the Python library reportlab for PDF creation.
    Downloads: 0 This Week
  • 8
    savepex is backup software written in Python. It lets you select multiple files and directories (drag and drop is also supported), save your preferences to a file, and recall them later. savepex supports command-line use. All operations are logged to a file and to a PDF file.
    Downloads: 0 This Week
  • 9
    pdfplayground is a collection of tools written in python to read/write PDF files.
    Downloads: 0 This Week
  • 10
    Process OpenOffice.org Writer files and transform them to PDF without installing OpenOffice.org. What is PyOpenOffice? * It is a class library, written in the Python language. * It is a platform-independent command-line utility (many abilitie
    Downloads: 0 This Week
  • 11
    pydocrawl automatically downloads PDF, PS, and DOC files from web sites. An initial URL and a wordlist must be given. It is a multithreaded information mining (harvesting) tool written entirely in Python. Version 0.1 runs successfully on Linux and Cygwin.
    Downloads: 0 This Week
  • 12
    Python library and command-line tool to generate maps in PDF format and place objects on them.
    Downloads: 0 This Week
  • 13
    PyPDFSlide is a Python program that generates PDF presentations from XML descriptions. Springs, boxes, ... make placement easier. It supports PDF, bitmap, and TeX inclusions.
    Downloads: 0 This Week
  • 14
    xtopdf: Tools to convert other formats (x) to PDF; x as in math. - solve for x :-) Currently x == {.txt, .DBF}. Others to follow. Benefits: all those of PDF (better cross-platform viewing/printing, read-only, etc.)
    Downloads: 0 This Week
  • 15
    pyCatalog is a Python, MySQL, wxPython, and Reportlab application intended for libraries and information centers. It produces book catalogs and card catalogs in PDF format, rendered using reportlab. The program takes a MARC file as its source data.
    Downloads: 0 This Week
  • 16
    Search and index text in an archive of pdf files.
    Downloads: 0 This Week
  • 17
    SomReport is an XML to Image/PDF library written in python. It allows creation of reports with images, charts, graphs, mail stickers, etc. This library is based on an excellent PDF python package from reportlab.com. It can be accessed as cgi script, syst
    Downloads: 0 This Week
  • 18
    MightyCal is a web-based calendaring system that aims to be the "MS Access {tm}" of the calendaring world. Built on Zope and Apache Cocoon, MightyCal provides the ability to develop custom calendars, with output in HTML, PDF, RTF & many other formats
    Downloads: 0 This Week
  • 19
    OOpyREP is a Python code-generating filter and library. It reads an OpenOffice.org file and creates a Python representation of the document's structure and contents. The generated code uses the reportlab PDF library to render the document.
    Downloads: 0 This Week
  • 20
    MakePDF provides MS Windows users with a GUI and command-line environment for converting Postscript files to PDF. Using the standard MS Windows environment simplifies this task for the average Windows user.
    Downloads: 0 This Week
  • 21
    POST (Python Obviously Simple Text) provides support for simple, flexible dynamic document generation in multiple output formats. Supports inputs in text or XML, outputs in HTML, PDF, RTF, LaTeX source, nroff source, postscript, and plain text.
    Downloads: 0 This Week
  • 22
    PyGhostscript is a python extension module which makes the rasterized output of ghostscript accessible to python. It enables python programs to display postscript and PDF documents.
    Downloads: 0 This Week
  • 23
    IDEA is a package for getting data into and out of a database. Beginning as a web application, IDEA generates your HTML forms for input and returns HTML or PDF output. Everything IDEA does is driven by one XML file per form.
    Downloads: 0 This Week
  • 24
    A python module providing a cross-platform and cross-media drawing toolkit. It handles lines, polygons, curves, figures, and text--powerful enough for professional plotting--including support for wxWindows, Tk, Quickdraw, PDF, Postscript, and PIL.
    Downloads: 5 This Week
  • 25
    Fully OO python PDF generation library.
    Downloads: 0 This Week