Showing 186 open source projects for "apache pdf"

View related business solutions
  • Our Free Plans just got better! | Auth0 by Okta Icon
    Our Free Plans just got better! | Auth0 by Okta

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your secuirty. Auth0 now, thank yourself later.
    Try free now
  • Bright Data - All in One Platform for Proxies and Web Scraping Icon
    Bright Data - All in One Platform for Proxies and Web Scraping

    Say goodbye to blocks, restrictions, and CAPTCHAs

    Bright Data offers the highest quality proxies with automated session management, IP rotation, and advanced web unlocking technology. Enjoy reliable, fast performance with easy integration, a user-friendly dashboard, and enterprise-grade scaling. Powered by ethically-sourced residential IPs for seamless web scraping.
    Get Started
  • 1
    PDF.js

    PDF.js

    A PDF Reader in JavaScript

    PDF.js is a web standards-based platform for parsing and rendering Portable Document Formats (PDFs). Open source and built with HTML5, this PDF viewer is supported by a great community and Mozilla Labs. PDF.js can be used on both modern and older browsers, and is built into version 19+ of Firefox.
    Downloads: 188 This Week
    Last Update:
    See Project
  • 2
    Apache OpenOffice

    Apache OpenOffice

    The free and Open Source productivity suite

    Free alternative for Office productivity tools: Apache OpenOffice - formerly known as OpenOffice.org - is an open-source office productivity software suite containing word processor, spreadsheet, presentation, graphics, formula editor, and database management applications. OpenOffice is available in many languages, works on all common computers, stores data in ODF - the international open standard format - and is able to read and write files in other formats, included the format used...
    Leader badge
    Downloads: 230,202 This Week
    Last Update:
    See Project
  • 3

    Tesseract OCR

    Open Source OCR Engine

    ... various output formats, including plain text, HTML, PDF and more. It also has unicode (UTF-8) support.
    Downloads: 1,577 This Week
    Last Update:
    See Project
  • 4
    ZXing

    ZXing

    Barcode scanning library for Java, Android

    ZXing or “Zebra Crossing” is an open source multi-format 1D/2D barcode image processing library that’s been implemented in Java, and also comes with ports to other languages. It currently supports the following formats: UPC-A and UPC-E EAN-8 and EAN-13 Code 39 Code 93 Code 128 ITF Codabar RSS-14 (all variants) RSS Expanded (most variants) QR Code Data Matrix Aztec ('beta' quality) PDF 417 ('alpha' quality) MaxiCode ZXing is made up of several modules, including a core image...
    Downloads: 137 This Week
    Last Update:
    See Project
  • Empower decisions with IBM SPSS Statistics. Icon
    Empower decisions with IBM SPSS Statistics.

    For companies that need a powerful data platform

    IBM SPSS Statistics software is used by a variety of customers to solve industry-specific business issues to drive quality decision-making. Advanced statistical procedures and visualization can provide a robust, user friendly and an integrated platform to understand your data and solve complex business and research problems
    Learn More
  • 5
    BingGPT

    BingGPT

    Desktop application of Bing's AI-powered chat (Windows, Mac, Linux)

    Desktop application of new Bing's AI-powered chat 1. Get access to the early preview of new Bing 2. Sign in to your Microsoft account 3. Start chatting
    Downloads: 16 This Week
    Last Update:
    See Project
  • 6
    Papermerge

    Papermerge

    Open Source Document Management System for Digital Archives

    Papermerge is an open source document management system (DMS) primarily designed for archiving and retrieving your digital documents. Instead of having piles of paper documents all over your desk, office or drawers - you can quickly scan them and configure your scanner to directly upload to Papermerge DMS. Store, organize and index scanned documents in PDF, JPEG and TIFF formats. Instantly find relevant information using full text, tags and metadata-based search. Papermerge is free and open...
    Downloads: 17 This Week
    Last Update:
    See Project
  • 7
    Image Toolbox

    Image Toolbox

    Image Toolbox is an powerful picture editor, which can crop

    Image Toolbox is a powerful picture editor, which can crop, apply filters, add some drawings, erase background, edit EXIF, or even create a PDF file.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 8
    PaperQA2

    PaperQA2

    High accuracy RAG for answering questions from scientific documents

    PaperQA2 is a package for doing high-accuracy retrieval augmented generation (RAG) on PDFs or text files, with a focus on the scientific literature. See our recent 2024 paper to see examples of PaperQA2's superhuman performance in scientific tasks like question answering, summarization, and contradiction detection. In this example we take a folder of research paper PDFs, magically get their metadata - including citation counts and a retraction check, then parse and cache PDFs into a...
    Downloads: 12 This Week
    Last Update:
    See Project
  • 9
    pdfcpu

    pdfcpu

    A PDF processor written in Go

    pdfcpu is a PDF processing library written in Go supporting encryption. It provides both an API and a CLI. Supported are all versions up to PDF 1.7 (ISO-32000). This is an effort to build a comprehensive PDF processing library from the ground up written in Go. Over time pdfcpu aims to support the standard range of PDF processing features and also any interesting use cases that may present themselves along the way. The main focus lies on strong support for batch processing and scripting via...
    Downloads: 3 This Week
    Last Update:
    See Project
  • Manage and protect your mobile workforce with AI-driven unified endpoint management (UEM) Icon
    Manage and protect your mobile workforce with AI-driven unified endpoint management (UEM)

    For IT security teams in need of an AI-driven unified endpoint management platform

    IBM® MaaS360® is uniquely equipped to help IT professionals manage a wide variety of endpoints, apps, and data, and protect them efficiently and productively.
    Learn More
  • 10
    PdfPig

    PdfPig

    Read and extract text and other content from PDFs in C#

    This project allows users to read and extract text and other content from PDF files. In addition the library can be used to create simple PDF documents containing text and geometrical shapes.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 11
    PRDownloader

    PRDownloader

    A file downloader library for Android with pause and resume support

    A file downloader library for Android with pause and resume support. PRDownloader can be used to download any type of files like image, video, pdf, apk and etc. This file downloader library supports pause and resume while downloading a file. Supports large file download. This downloader library has a simple interface to make download request. We can check if the status of downloading with the given download Id. PRDownloader gives callbacks for everything like onProgress, onCancel, onStart...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 12
    dvisvgm

    dvisvgm

    A fast DVI, EPS, and PDF to SVG converter

    The command-line utility dvisvgm is a tool for TEX/LATEX users. It converts DVI, EPS, and PDF files to the XML-based vector graphics format SVG. In contrast to bitmap graphics, vector graphics are arbitrarily scalable without loss of quality. All modern web browsers support a large amount of the current SVG standard 1.1. Furthermore, SVG files can also be displayed with the Java-based Squiggle SVG browser which is part of the Apache Batik project, and the free vector graphics editor Inkscape.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 13
    Apache OpenOffice Extensions

    Apache OpenOffice Extensions

    Hundreds of ready to use Apache OpenOffice extensions

    The official catalog of Apache OpenOffice extensions. You'll find extensions ranging from dictionaries to tools to import PDF files and to connect with external databases. Extensions can improve your productivity, and are easy to use.
    Leader badge
    Downloads: 4,549 This Week
    Last Update:
    See Project
  • 14
    Asciidoc Editor based on JavaFX 20

    Asciidoc Editor based on JavaFX 20

    Asciidoc Editor and Toolchain written with JavaFX 19

    Asciidoc FX is a WYSIWYG editor for the Asciidoc markup language. You can build PDF, Epub, and HTML books, documents, and slides. Supported Operating Systems and Builds shows the list of available builds with links for reference. If you are looking for the very latest version, visit the link in the note above to be guaranteed of downloading the latest and greatest version of AsciidocFX. AsciidocFX converts documents via the AsciidoctorJ library.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 15
    Outline

    Outline

    Fastest wiki and knowledge base for growing teams

    ..., and checklists. Give new team members a leg up getting to know your product, best practices, and culture. Don't lock away your company handbook in a PDF document hidden on a shared drive. Make it accessible, searchable and easily updatable so everyone can find the information they need. Whether your team are seasoned remote workers or new to working from home. Outline is a great place to keep your team’s shared knowledge accessible, searchable, and coordinated.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 16
    changedetection.io

    changedetection.io

    The best free open source website change detection and restock service

    Loved by smart shoppers, data journalists, research engineers, data scientists, security researchers, and more. From simply monitoring website pages that have a change (such as watching prices, and restocking notifications), to deep inspection such as PDF text support, JSON and XML monitoring, and extensive text triggers. Monitor out-of-stock products and get alerts when those products are back in stock, get restock alerts via Discord, Slack, email, and many other platforms. Using the browser...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 17
    xhtml2pdf

    xhtml2pdf

    A library for converting HTML into PDFs using ReportLab

    xhtml2pdf enables users to generate PDF documents from HTML content easily and with automated flow control such as pagination and keeping text together. The Python module can be used in any Python environment, including Django. The Command line tool is a stand-alone program that can be executed from the command line.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    GROBID

    GROBID

    A machine learning software for extracting information

    GROBID is a machine learning library for extracting, parsing, and re-structuring raw documents such as PDF into structured XML/TEI encoded documents with a particular focus on technical and scientific publications. First developments started in 2008 as a hobby. In 2011 the tool has been made available in open source. Work on GROBID has been steady as a side project since the beginning and is expected to continue as such. Header extraction and parsing from article in PDF format. The extraction...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    picocli

    picocli

    Framework for building GraalVM-enabled command line apps

    ... subcommands. Picocli-based applications can be ahead-of-time compiled to a GraalVM native image, with extremely fast startup time and lower memory requirements, which can be distributed as a single executable file. Picocli generates beautiful documentation for your application (HTML, PDF and Unix man pages). Another distinguishing feature of picocli is how it aims to let users run picocli-based applications without requiring picocli as an external dependency.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 20
    MathTranslate

    MathTranslate

    translate scientific papers in latex, especially arxiv papers

    This is a project to translate LaTeX documents, especially scientific papers, from any language to any language. LaTeX expressions like math expressions are perfectly kept unchanged. LaTeX documents can finally be compiled into PDF files. Especially it can be directly applied to translate arXiv papers since it provides the LaTeX source code of most of the papers.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    Visual Regression Tracker

    Visual Regression Tracker

    Backend and Frontend application for tracking differences via image

    Open source, self-hosted solution for visual testing and managing results of visual testing. Service receives images, performs pixel-by-pixel comparisons with its previously accepted baseline, and provides immediate results in order to catch unexpected changes. Use implemented libraries to integrate with existing automated suites by adding assertions based on image comparison. We provide native integration with automation libraries, core SDK and Rest API interfaces that allow the system to...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    docker-maven-plugin

    docker-maven-plugin

    Maven plugin for running and creating Docker images

    This is a Maven plugin for building Docker images and managing containers for integration tests. It works with Maven 3.0.5 and Docker 1.6.0 or later.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    LangChain Apps on Production with Jina

    LangChain Apps on Production with Jina

    Langchain Apps on Production with Jina & FastAPI

    Jina is an open-source framework for building scalable multi-modal AI apps on Production. LangChain is another open-source framework for building applications powered by LLMs. long-chain-serve helps you deploy your LangChain apps on Jina AI Cloud in a matter of seconds. You can benefit from the scalability and serverless architecture of the cloud without sacrificing the ease and convenience of local development. And if you prefer, you can also deploy your LangChain apps on your own...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    Jina

    Jina

    Build cross-modal and multimodal applications on the cloud

    Jina is a framework that empowers anyone to build cross-modal and multi-modal applications on the cloud. It uplifts a PoC into a production-ready service. Jina handles the infrastructure complexity, making advanced solution engineering and cloud-native technologies accessible to every developer. Build applications that deliver fresh insights from multiple data types such as text, image, audio, video, 3D mesh, PDF with Jina AI’s DocArray. Polyglot gateway that supports gRPC, Websockets, HTTP...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    word_cloud

    word_cloud

    A little word cloud generator in Python

    ... system being used. The wordcloud_cli tool can be used to generate word clouds directly from the command-line. If you're dealing with PDF files, then pdftotext, included by default with many Linux distribution, comes in handy. Use wordcloud_cli --help so see all available options. The wordcloud library is MIT licenced, but contains DroidSansMono.ttf, a true type font by Google, that is apache licensed.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • 5
  • Next