Showing 23 open source projects for "python pdf scaper"

View related business solutions
  • Cut Cloud Costs with Google Compute Engine Icon
    Cut Cloud Costs with Google Compute Engine

    Save up to 91% with Spot VMs and get automatic sustained-use discounts. One free VM per month, plus $300 in credits.

    Save on compute costs with Compute Engine. Reduce your batch jobs and workload bill 60-91% with Spot VMs. Compute Engine's committed use offers customers up to 70% savings through sustained use discounts. Plus, you get one free e2-micro VM monthly and $300 credit to start.
    Try Compute Engine
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • 1
    Zerox OCR

    Zerox OCR

    PDF to Markdown with vision models

    A dead simple way of OCR-ing a document for AI ingestion. Documents are meant to be a visual representation after all. With weird layouts, tables, charts, etc. The vision models just make sense. ZeroX is an open-source machine learning framework designed for fast experimentation and production deployment, optimized for speed and ease of use.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 2
    Pix2Text

    Pix2Text

    Open-Source Python3 tool for recognizing layouts, tables, and math

    ...A free alternative to Mathpix, empowering seamless conversion of visual content into text-based representations. 80+ languages are supported. Pix2Text (P2T) aims to be a free and open-source Python alternative to Mathpix, and it can already accomplish Mathpix's core functionality. Pix2Text (P2T) can recognize layouts, tables, images, text, and mathematical formulas, and integrate all of these contents into Markdown format. P2T can also convert an entire PDF file (which can contain scanned images or any other format) into Markdown format.
    Downloads: 14 This Week
    Last Update:
    See Project
  • 3
    Remarkable for Linux

    Remarkable for Linux

    The Markdown Editor for Linux

    With Live Preview you can see your changes as you make them. There is no need to export first to check your syntax. This is accompanied by synchronized scrolling. Remarkable has Github Flavoured Markdown. This has a simple, easy-to-learn syntax with features like checklists, highlighting, links, images and more. Remarkable allows you to export your files to PDF and HTML from within the app. The HTML code is even prettified and PDFs have a TOC. You can style your markdown documents however...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    bookdown

    bookdown

    Authoring Books and Technical Documents with R Markdown

    ...A markup language easier to learn than LaTeX, and to write elements such as section headers, lists, quotes, figures, tables, and citations. Multiple choices of output formats: PDF, LaTeX, HTML, EPUB, and Word. Possibility of including dynamic graphics and interactive applications (HTML widgets and Shiny apps) Support for languages other than R, including C/C++, Python, and SQL, etc. LaTeX equations, theorems, and proofs work for all output formats. Can be published to GitHub, bookdown.org, and any web servers. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Cut Data Warehouse Costs up to 54% with BigQuery Icon
    Cut Data Warehouse Costs up to 54% with BigQuery

    Migrate from Snowflake, Databricks, or Redshift with free migration tools. Exabyte scale without the Exabyte price.

    BigQuery delivers up to 54% lower TCO than cloud alternatives. Migrate from legacy or competing warehouses using free BigQuery Migration Service with automated SQL translation. Get serverless scale with no infrastructure to manage, compressed storage, and flexible pricing—pay per query or commit for deeper discounts. New customers get $300 in free credit.
    Try BigQuery Free
  • 5
    Scribus

    Scribus

    Powerful desktop publishing software

    Scribus is an Open Source program that brings professional page layout to Linux, BSD UNIX, Solaris, OpenIndiana, GNU/Hurd, Mac OS X, OS/2 Warp 4, eComStation, and Windows desktops with a combination of press-ready output and new approaches to page design. Underneath a modern and user-friendly interface, Scribus supports professional publishing features, such as color separations, CMYK and spot colors, ICC color management, and versatile PDF creation.
    Leader badge
    Downloads: 14,739 This Week
    Last Update:
    See Project
  • 6
    Apache OpenOffice

    Apache OpenOffice

    The free and Open Source productivity suite

    Free alternative for Office productivity tools: Apache OpenOffice - formerly known as OpenOffice.org - is an open-source office productivity software suite containing word processor, spreadsheet, presentation, graphics, formula editor, and database management applications. OpenOffice is available in many languages, works on all common computers, stores data in ODF - the international open standard format - and is able to read and write files in other formats, included the format used by the...
    Leader badge
    Downloads: 298,066 This Week
    Last Update:
    See Project
  • 7
    MediaWiki To LaTeX converts MediaWiki markup to LaTeX and generates a PDF. So it provides an export from MediaWiki to LaTeX. It works with any project running MediaWiki, especially Wikipedia and Wikibooks.
    Leader badge
    Downloads: 4 This Week
    Last Update:
    See Project
  • 8
    Eugraphios

    Eugraphios

    Free, portable desktop Computer-Assisted Translation (CAT) tool.

    Eugraphios is a free, portable desktop Computer-Assisted Translation (CAT) tool designed for freelancers. Whether you're translating documents, websites, or software, Eugraphios is designed to meet your needs and exceed your expectations. With a focus on intuitive design and user-friendly interfaces, Eugraphios aims to eliminate the complexity that often hinders professionals and beginners in the translation field. By providing a seamless and enjoyable experience, this tool empowers users...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 9

    SimpleTextFormatter

    STF automatically generates documentation

    STF is a system of automatically generating documentation under control of a program or a script. It is frequently used to automatically generate test reports. STF is also used to clean up the output of a process and turn it into a nice looking report.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Managed MySQL, PostgreSQL, and SQL Databases on Google Cloud Icon
    Managed MySQL, PostgreSQL, and SQL Databases on Google Cloud

    Get back to your application and leave the database to us. Cloud SQL automatically handles backups, replication, and scaling.

    Cloud SQL is a fully managed relational database for MySQL, PostgreSQL, and SQL Server. We handle patching, backups, replication, encryption, and failover—so you can focus on your app. Migrate from on-prem or other clouds with free Database Migration Service. IDC found customers achieved 246% ROI. New customers get $300 in credits plus a 30-day free trial.
    Try Cloud SQL Free
  • 10
    Text-ly

    Text-ly

    Text.ly - An alternative for Notepad.

    LOOKING FOR Text Editor? You've Come At The Right Place! Editing Your text for your simplicity A Text editor for Editing Text....! Just download and install and use as an alternative for typical Notepad. This application is compiled from the Pyinstaller library so don't mind there is a vulnerability or something the antivirus program might show it as malware or trojan this happens with most of the apps compiled from the Pyinstaller library. So No worries There is not any malware or virus...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    DocBook to LaTeX Publishing transforms your SGML/XML DocBook documents to DVI, PostScript or PDF by translating them in pure LaTeX as a first process. MathML 2.0 markups are supported too. It started as a clone of DB2LaTeX.
    Leader badge
    Downloads: 98 This Week
    Last Update:
    See Project
  • 12
    Canorus

    Canorus

    Music score editor

    Canorus is a free cross-platform music score editor. It supports an unlimited number and length of staffs, polyphony, a MIDI playback of notes, chord markings, lyrics, import/export filters to formats like MIDI, MusicXML, ABC Music, MusiXTeX and LilyPond
    Leader badge
    Downloads: 18 This Week
    Last Update:
    See Project
  • 13
    PDF-Shuffler
    PDF-Shuffler is a small python-gtk application, which helps the user to merge or split pdf documents and rotate, crop and rearrange their pages using an interactive and intuitive graphical interface. It is a frontend for python-pyPdf.
    Downloads: 60 This Week
    Last Update:
    See Project
  • 14

    Dvipdfm tool for SCons

    SCons tool to cooperate with dvipdfm program

    SCons is a make replacement providing a range of enhanced features such as automated dependency generation and built in compilation cache support. SCons rule sets are Python scripts so as well as the features it provides itself SCons allows you to use the full power of Python to control compilation. This is a SCons extension (tool) which enables usage of the dvipdfm program to convert dvi files to pdf.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    Cute Klava
    Cute Klava is now part of LibreEngineering suite: http://sourceforge.net/projects/libreeng/
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    xml2txt is a text formatter for XMl in the same way the FO is a PDF formatter. It uses python to convert an XML document to well-formatted text, wtih borders, indents, and tables.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 17
    The aim of this project is to develop a Portable Document Format (PDF) importer for OpenOffice.org Writer based on XPDF. This project was inspired by the PDF importer within KWord.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 18
    Wyneken is a content-oriented text processor that makes your life as a student easier by allowing you to create and manage digital notebooks. Wyneken also allows you to create PDF presentations, letters, articles, and reports. In 2015, Wyneken may or may not work with the latest Linux distributions, but you can use it for building pdfs by pulling our docker repo indicated in the homepage field of this site. The docker site has the information on how to invoke the container, which is...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 19
    WikiPDF is a mediawiki extension based on Wiki2PDF that adds PDF/LaTeX features to mediawiki. Wiki2PDF is a python script to convert multiple articles of a mediawiki based wiki (pre-configured to use with www.wikipedia.org) to a single LaTeX or PDF file.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    Typewriter converts plain text files to PDF. It allows users to create underlined text and superscript. The PDF created looks exactly as the text, with the same line and page breaks, allowing users the simplicity and control of a typewriter.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    pdfplayground is a collection of tools written in python to read/write PDF files.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    Process OpenOffice.org Writer Files and transform them to PDF without installing OpenOffice.org What is PyOpenOffice? * It is a class library, written in the Python Language. * It is a platform-independent command-line utility (many abilitie
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    xtopdf: Tools to convert other formats (x) to PDF; x as in math. - solve for x :-) Currently x == {.txt, .DBF}. Others to follow. Benefits: all those of PDF (better cross-platform viewing/printing, read-only, etc.)
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
MongoDB Logo MongoDB
Gen AI apps are built with MongoDB Atlas
Atlas offers built-in vector search and global availability across 125+ regions. Start building AI apps faster, all in one place.