Showing 38 open source projects for "apache pdf"

  • 1
    PDF.js

    A PDF Reader in JavaScript

    PDF.js is a web standards-based platform for parsing and rendering Portable Document Format (PDF) files. Open source and built with HTML5, this PDF viewer is supported by a great community and Mozilla Labs. PDF.js can be used on both modern and older browsers, and is built into version 19+ of Firefox.
    Downloads: 68 This Week
  • 2
    OpenDataLoader PDF

    PDF Parser for AI-ready data. Automate PDF accessibility

    OpenDataLoader PDF is an open-source document processing system designed to convert complex PDF files into structured, AI-ready formats such as Markdown, JSON, and HTML while preserving layout, hierarchy, and semantic meaning. It focuses on enabling downstream use cases like retrieval-augmented generation (RAG), knowledge extraction, and document intelligence pipelines by maintaining accurate reading order and spatial metadata through bounding boxes. The tool combines deterministic parsing...
    Downloads: 1 This Week
  • 3
    pdfcpu

    A PDF processor written in Go

    pdfcpu is a PDF processing library written in Go that supports encryption. It provides both an API and a CLI. All versions up to PDF 1.7 (ISO-32000) are supported. This is an effort to build a comprehensive PDF processing library from the ground up, written in Go. Over time pdfcpu aims to support the standard range of PDF processing features and also any interesting use cases that may present themselves along the way. The main focus lies on strong support for batch processing and scripting via a...
    Downloads: 10 This Week
  • 4
    Vanilla.PDF

    Cross-platform SDK for creating and modifying PDF documents

    Vanilla.PDF is a modern, high-performance, open-source C++17 SDK designed for creating, editing, signing, and analyzing PDF documents across multiple platforms. It requires no external runtime dependencies, making it lightweight and ideal for embedding into desktop applications, servers, or automation pipelines. The SDK offers full cross-platform support including Windows, Linux, macOS, and Android, with builds available for major compilers and architectures. Vanilla.PDF supports advanced...
    Downloads: 3 This Week
  • 5
    dvisvgm

    A fast DVI, EPS, and PDF to SVG converter

    The command-line utility dvisvgm is a tool for TeX/LaTeX users. It converts DVI, EPS, and PDF files to the XML-based vector graphics format SVG. In contrast to bitmap graphics, vector graphics are arbitrarily scalable without loss of quality. All modern web browsers support a large part of the current SVG 1.1 standard. Furthermore, SVG files can also be displayed with the Java-based Squiggle SVG browser, which is part of the Apache Batik project, and with the free vector graphics editor Inkscape.
    Downloads: 0 This Week
  • 6
    QPDF

    PDF transformation/manipulation program + library

    QPDF is a C++ library and set of programs that inspect and manipulate the structure of PDF files. It can encrypt and linearize files, expose the internals of a PDF file, and do many other operations useful to end users and PDF developers.
    Downloads: 1,087 This Week
  • 7
    ant4docbook

    ANT4DOCBOOK is an Ant task for DocBook

    ANT4DOCBOOK is an Ant task for DocBook, a semantic markup language for technical documentation.
    Downloads: 0 This Week
  • 8
    TextExtractor

    Extracts plain text from a variety of different file types

    TextExtractor extracts plain text from hundreds of different file types, storing the extracted text in suitably named text files. TextExtractor 1.10 works in six different modes. Instant Mode: select any file and extract the text from it. Batch Mode: select a group of files and extract the text from all of them in one go. Polling Mode: watch a folder location, processing new files as they appear there. Hierarchical Mode: extract text from files in a directory...
    Downloads: 9 This Week
  • 9
    MathTranslate

    Translate scientific papers in LaTeX, especially arXiv papers

    This is a project to translate LaTeX documents, especially scientific papers, from any language to any language. LaTeX expressions such as math are left unchanged, and the translated LaTeX documents can then be compiled into PDF files. It can be applied directly to arXiv papers, since arXiv provides the LaTeX source code of most of its papers.
    Downloads: 0 This Week
  • 10
    PdfDecrypt

    .NET CLI tool for decrypting PDF files (PDF password remover).
    Downloads: 2 This Week
  • 11
    Swagger2Markup

    Swagger to AsciiDoc or Markdown converter

    The primary goal of this project is to simplify the generation of up-to-date RESTful API documentation by combining documentation that’s been hand-written with auto-generated API documentation produced by Swagger. The result is intended to be an up-to-date, easy-to-read, on- and offline user guide, comparable to GitHub’s API documentation. The output of Swagger2Markup can be used as an alternative to swagger-UI and can be served as static content. Swagger2Markup converts a Swagger JSON or...
    Downloads: 1 This Week
  • 12
    Myrtille

    A native HTML4 / HTML5 Remote Desktop Protocol and SSH client

    Myrtille provides simple and fast access to remote desktops, applications, and SSH servers through a web browser, without any plugin, extension or configuration. Technically, Myrtille is an HTTP(S) to RDP and SSH gateway. User input (keyboard, mouse, touchscreen) is forwarded from a web browser to an HTTP(S) gateway, then on to an RDP (or SSH) client which maintains a session with an RDP (or SSH) server. The display resulting (or not) from such actions is streamed back to the browser, from the...
    Downloads: 4 This Week
  • 13
    PdfJumbler
    A simple tool to rearrange/merge/delete pages from PDF files. The modular backend system uses either JPedal or JPod to display PDFs and iText or Apache PDFBox to save them. Development of this project has moved to GitHub. Please check https://github.com/mgropp/pdfjumbler for current releases!
    Downloads: 9 This Week
  • 14
    OpenEXI

    EXI implementations in Java and C#

    Open source .NET (C#) / Java implementation of the W3C Efficient XML Interchange (EXI) format specification. As a corollary to XML, EXI is an alternative, very efficient format that has all of the mechanics of XML, but is much more compact and is faster to exchange. - README (about Nagasena EXI implementation) https://www.dropbox.com/s/adh83u9z1x1czv6/README.txt?dl=0 - Nagasena EXI grammar interchange format (PDF) https://www.dropbox.com/s/etrpuchaddplq2s/EXIGram.pdf?dl=0 -...
    Downloads: 7 This Week
  • 15

    eXtensible Text Framework (XTF)

    Framework for search and display of heterogeneous document collections.

    NOTICE: This code repository is deprecated. Please visit https://github.com/cdlib/xtf for the latest updates. Obsolete Description: The eXtensible Text Framework (XTF) is an architecture that supports searching across collections of heterogeneous textual data (XML, PDF, HTML, text, and more), and the presentation of results and documents in a highly configurable manner. Includes highly customized versions of the proven open-source components Lucene and Saxon.
    Downloads: 1 This Week
  • 16
    S1000D Transformation Toolkit
    The S1000D Transformation Toolkit provides a reference implementation supporting the transformation, packaging and viewing of S1000D data into a SCORM 2004 3rd Edition Content Package, Mobile Web Application and PDF.
    Downloads: 0 This Week
  • 17
    xstdgreek

    A bug-free, idiot-proof Greek language environment for XeLaTeX/LuaLaTeX

    Provides a bug-free, idiot-proof Greek language environment for Unicode-enabled LaTeX engines like XeLaTeX and LuaLaTeX. * The project intends to be supported and controlled by its members as a typical FSF project at sourceforge.net. * This project intends to standardize the Greek macros and Greek usage in Unicode-enabled LaTeX (for example, the 'ano teleia' is the Greek 'semicolon', but there is no standard macro!). * A solution to any Greek related problem. Our goal is to fix any such...
    Downloads: 0 This Week
  • 18
    OpenSearchServer Extractor

    A RESTful/JSON Web Service for text and metadata extraction

    An open source RESTful Web Service for text and metadata extraction and analysis. oss-text-extractor supports various binary formats: word processor (doc, docx, odt, rtf), spreadsheet (xls, xlsx, ods), presentation (ppt, pptx, odp), publishing (pdf, pub), web (rss, html/xhtml), media (audio, images), and others (vsd, text).
    Downloads: 1 This Week
  • 19
    The goal of this project is to provide a reusable library to transform common file formats to content objects and ContentProvider plugins to common file repositories like Filesystem, CMIS and others for iQser GIN Semantic Middleware (www.iqser.com).
    Downloads: 0 This Week
  • 20

    Detexter

    Detexter is an app designed to extract text from PDF files.

    Detexter lets you extract text from multiple PDF files. Detexter uses the PDFBox library for its text extraction.
    Downloads: 0 This Week
  • 21
    eLML - eLesson Markup Language
    eLML (eLesson Markup Language) is an XML framework for creating structured eLessons based on a pedagogical model. eLML consists of an XMLSchema and XSLT files to create XHTML, PDF, LaTeX, IMS CP and SCORM versions, standards supported by most LMS.
    Downloads: 0 This Week
  • 22
    Kabeja is a Java library for parsing DXF and converting to SVG (dxf2svg). The library supports the SAX API and can be integrated into other applications (Cocoon, Batik). Tools for converting SVG to JPEG, TIFF, PNG and PDF are included.
    Downloads: 59 This Week
  • 23
    pdfInspect
    pdfInspect offers a flexible GUI for viewing the internal structure and content of a PDF file. It wraps the Apache PDFBox library and is an example of an application built with Superficial (http://superficial.sourceforge.net).
    Downloads: 1 This Week
  • 24
    This project provides a toolkit and framework based on PDFBox for analysing PDF files and performing custom conversion tasks; it is published under the Apache licence. A GUI is also included and is published under the GPL licence.
    Downloads: 0 This Week
  • 25
    xccdf2pdf renders XCCDF documents in PDF and other formats.
    Downloads: 0 This Week