Showing 626 open source projects for "pdf text"

  • 1
    pdf-extractor

    Node.js module for rendering PDF pages to images, SVGs and HTML files

    pdf-extractor is a wrapper around pdf.js that generates images, SVGs, HTML files, text files and JSON files from a PDF on Node.js. A DOM Canvas is used to render and export the graphical layer of the PDF. Canvas exports .png by default but can be extended to export other file types such as .jpg. PDF objects are converted to SVG using the SVGGraphics parser of pdf.js. PDF text is converted to HTML, which can be used as a (transparent) layer over the image to enable text selection. PDF text...
    Downloads: 1 This Week
  • 2
    Markdown PDF

    Markdown converter for Visual Studio Code

    This extension converts Markdown files to PDF, HTML, PNG or JPEG files.
    Downloads: 3 This Week
  • 3
    Asciidoctor PDF

    Asciidoctor PDF: A native PDF converter for AsciiDoc

    A fast text processor & publishing toolchain for converting AsciiDoc to HTML5, DocBook & more. Asciidoctor is a fast, open source, Ruby-based text processor for parsing AsciiDoc® into a document model and converting it to output formats such as HTML 5, DocBook 5, manual pages, PDF, EPUB 3, and other formats. Asciidoctor also has an ecosystem of extensions, converters, build plugins, and tools to help you author and publish content written in AsciiDoc.
    Downloads: 0 This Week
  • 4
    Super-PDF-Editor

    World's most comprehensive, powerful, process-based PDF editor

    World's most comprehensive, powerful, process-based and lightning-fast PDF reader, editor and batch processor. PDF editing with 60+ feature-rich tools and functions, such as OCR of PDFs and images with output as searchable PDF, text, hOCR, box, or UNLV. Also offers image enhancement before OCR for better OCR performance, PDF imposition, etc. Super PDF Editor is best for bulk PDF processing, especially for the printing industry. Easy PDF imposition, booklet, n-up pages, and more. OCR...
    Downloads: 49 This Week
  • 5
    Super-PDF-Editor-Lite

    World's most comprehensive, powerful, process-based PDF editor

    World's most comprehensive, powerful, process-based and lightning-fast PDF reader, editor and batch processor. Features include creating PDFs from images, HTML, and text files; creating a processing log file; and extracting, splitting, rotating, merging, duplicating, moving, printing, and compressing pages. Also offers image enhancement before OCR for better OCR performance, PDF imposition, etc. Super PDF Editor is best for bulk PDF processing, especially for the printing industry...
    Downloads: 16 This Week
  • 6
    Downloads: 1 This Week
  • 7

    Tesseract OCR

    Open Source OCR Engine

    ... various output formats, including plain text, HTML, PDF and more. It also has unicode (UTF-8) support.
    Downloads: 1,391 This Week
  • 8
    TeXworks

    A simple interface for working with TeX documents

    TeXworks is a free and simple working environment for authoring TeX (LaTeX, ConTeXt and XeTeX) documents. Inspired by Dick Koch's award-winning TeXShop program for Mac OS X, it makes entry into the TeX world easier for those using desktop operating systems other than OS X. It provides an integrated, easy-to-use environment for users on other platforms particularly GNU/Linux and Windows and features a clean, simple interface accessible to casual and non-technical users.
    Downloads: 194 This Week
  • 9
    PDF4QT

    Open source PDF editor

    ... the license LGPLv3. The applications are primarily used by target users to view, edit, manipulate or compare PDF documents. Users can preview these applications in the screenshots section of this webpage. Basic browsing and lots of other functionalities, such as encryption, reading a document, verification of digital signatures, editing of annotations, searching for text using regular expressions, drawing pages into an image, and much more. Several plug-ins are available.
    Downloads: 67 This Week
  • 10
    OCRmyPDF

    OCRmyPDF adds an OCR text layer to scanned PDF files

    OCRmyPDF adds an optical character recognition (OCR) text layer to scanned PDF files, allowing them to be searched. PDF is the best format for storing and exchanging scanned documents. Unfortunately, PDFs can be difficult to modify. OCRmyPDF makes it easy to apply image processing and OCR (recognized, searchable text) to existing PDFs.
    Downloads: 25 This Week
  • 11
    ChatGPT Desktop Application

    🔮 ChatGPT Desktop Application (Mac, Windows and Linux)

    ChatGPT Desktop Application (Mac, Windows and Linux)
    Downloads: 90 This Week
  • 12
    PDFsam

    PDFsam, a desktop application to split, merge, mix, rotate PDF files

    PDFsam Basic is our free and open-source desktop application to split, merge, extract pages, rotate and mix PDF files. PDFsam Visual is a powerful tool to visually compose PDF files, reorder pages, delete pages, split, merge, rotate, encrypt, decrypt, extract text, convert to grayscale, crop PDF files. PDFsam Basic is written using JavaFX. Since version 4 it is released as a self-contained application and bundles a jlinked JDK while version 3 requires a Java Runtime Environment 8 with JavaFx...
    Downloads: 39 This Week
  • 13
    KOReader

    An ebook reader application supporting PDF, DjVu, EPUB, FB2, etc.

    KOReader is a document viewer for E Ink devices. Supported file formats include EPUB, PDF, DjVu, XPS, CBT, CBZ, FB2, PDB, TXT, HTML, RTF, CHM, DOC, MOBI and ZIP files. It’s available for Kindle, Kobo, PocketBook, Android and desktop Linux. Runs on embedded devices (Cervantes, Kindle, Kobo, PocketBook, reMarkable), Android and Linux computers. Developers can run a KOReader emulator in Linux and macOS. Multi-lingual user interface with a highly customizable reader view and many typesetting options...
    Downloads: 75 This Week
  • 14
    JupyterLab

    JupyterLab computational environment

    ... new workflows for interactive computing. JupyterLab also offers a unified model for viewing and handling data formats. JupyterLab understands many file formats (images, CSV, JSON, Markdown, PDF, Vega, Vega-Lite, etc.) and can also display rich kernel output in these formats. See File and Output Formats for more information. To navigate the user interface, JupyterLab offers customizable keyboard shortcuts and the ability to use key maps from vim, emacs, and Sublime Text in the text editor.
    Downloads: 64 This Week
  • 15
    Papermerge

    Open Source Document Management System for Digital Archives

    Papermerge is an open source document management system (DMS) primarily designed for archiving and retrieving your digital documents. Instead of having piles of paper documents all over your desk, office or drawers - you can quickly scan them and configure your scanner to directly upload to Papermerge DMS. Store, organize and index scanned documents in PDF, JPEG and TIFF formats. Instantly find relevant information using full text, tags and metadata-based search. Papermerge is free and open...
    Downloads: 17 This Week
  • 16
    Frescobaldi

    LilyPond sheet music text editor

    Frescobaldi is a free and open source LilyPond sheet music text editor. Designed to be powerful yet lightweight and easy to use, Frescobaldi offers great functionality and a host of useful features such as a music view with advanced two-way Point & Click, MIDI capturing to enter music, a Snippet Manager and many more. Frescobaldi is named after Girolamo Frescobaldi (1583-1643), an Italian composer of keyboard music in the late Renaissance and early Baroque period.
    Downloads: 18 This Week
  • 17
    ripgrep

    Regex pattern directory search tool that respects your .gitignore

    ... could be PDF text extraction, less supported decompression, decrypting, automatic encoding detection and so on. In other words, use ripgrep if you like speed, filtering by default, fewer bugs and Unicode support.
    Downloads: 12 This Week
  • 18
    pdfcook

    Prepress preparation tool and PDF editor

    Preprinting preparation tool for PDF ebooks. On Windows, create a build/ folder beside the src/ directory. PDF v1.7 support. Decrypt encrypted PDFs. Join or split PDFs. Scale to any paper size, with a specified margin. Write page numbers. Write text and transform pages (rotate, flip, move). Arrange pages in booklet format. 2 or 4 pages per page (2-up, 4-up). More readable output syntax for easy debugging.
    Downloads: 2 This Week
  • 19
    PdfPig

    Read and extract text and other content from PDFs in C#

    This project allows users to read and extract text and other content from PDF files. In addition the library can be used to create simple PDF documents containing text and geometrical shapes.
    Downloads: 1 This Week
  • 20
    Sphinx

    Main repository for the Sphinx documentation builder

    Sphinx is a tool that makes it easy to create intelligent and beautiful documentation, written by Georg Brandl and licensed under the BSD license. It was originally created for the Python documentation, and it has excellent facilities for the documentation of software projects in a range of languages. Of course, this site is also created from reStructuredText sources using Sphinx! HTML (including Windows HTML Help), LaTeX (for printable PDF versions), ePub, Texinfo, manual pages, plain text...
    Downloads: 9 This Week
  • 21
    Eisvogel

    A pandoc LaTeX template to convert markdown files to PDF or LaTeX

    A clean pandoc LaTeX template to convert your markdown files to PDF or LaTeX. It is designed for lecture notes and exercises with a focus on computer science. The template is compatible with Pandoc 3. Alternatively, if you don't want to install LaTeX, you can use the Docker image named pandoc/extra. The image contains pandoc, LaTeX, and a curated selection of components such as the eisvogel template, pandoc filters, and open source fonts.
    Downloads: 5 This Week
  • 22
    Image Toolbox

    Image Toolbox is a powerful picture editor, which can crop

    Image Toolbox is a powerful picture editor, which can crop, apply filters, add some drawings, erase background, edit EXIF, or even create a PDF file.
    Downloads: 6 This Week
  • 23
    Logseq

    A privacy-first, open-source platform for knowledge management

    Logseq is a privacy-first, open-source knowledge base that works on top of local plain-text Markdown and Org-mode files. Use it to write, organize and share your thoughts, keep your to-do list, and build your own digital garden. Logseq is a platform for knowledge management and collaboration. It focuses on privacy, longevity, and user control. The server will never store or analyze your private notes. Your data are plain text files and we currently support both Markdown and Emacs Org-mode (more...
    Downloads: 8 This Week
  • 24
    pdfmake

    Client/server side PDF printing in pure JavaScript

    ... (client-side) and in Node.js (server-side). The PDF name can be defined only by using the metadata title property. Add-ons used in browsers can affect the functionality of pdfmake (especially for open() and print()). If pdfmake is not working, try disabling add-ons in the browser. The supported browsers are Internet Explorer 10+, Edge 12+, Firefox, Chrome, Opera and Safari.
    Downloads: 4 This Week
  • 25
    PaperQA2

    High accuracy RAG for answering questions from scientific documents

    PaperQA2 is a package for doing high-accuracy retrieval augmented generation (RAG) on PDFs or text files, with a focus on the scientific literature. See our recent 2024 paper to see examples of PaperQA2's superhuman performance in scientific tasks like question answering, summarization, and contradiction detection. In this example we take a folder of research paper PDFs, magically get their metadata - including citation counts and a retraction check, then parse and cache PDFs into a full-text...
    Downloads: 6 This Week