Showing 25 open source projects for "apache pdf"

View related business solutions
  • Build Agents and Models on One Platform Icon
    Build Agents and Models on One Platform

    Everything you need to build production-ready agents and models. Access 200+ Google and third-party AI models and tools.

    Gemini Enterprise Agent Platform is Google Cloud's comprehensive platform for developers to build, scale, govern, and optimize agents and models. Choose from Google's most advanced models and third-party models like Anthropic's Claude Model Family.
    Try It Free
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • 1
    OpenDataLoader PDF

    OpenDataLoader PDF

    PDF Parser for AI-ready data. Automate PDF accessibility

    OpenDataLoader PDF is an open-source document processing system designed to convert complex PDF files into structured, AI-ready formats such as Markdown, JSON, and HTML while preserving layout, hierarchy, and semantic meaning. It focuses on enabling downstream use cases like retrieval-augmented generation (RAG), knowledge extraction, and document intelligence pipelines by maintaining accurate reading order and spatial metadata through bounding boxes. The tool combines deterministic parsing...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 2
    ant4docbook

    ant4docbook

    ANT4DOCBOOK is an ANT task for DOCBOOK

    ANT4DOCBOOK is an ANT task for DOCBOOK, a semantic markup language for technical documentation.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    Swagger2Markup

    Swagger2Markup

    Swagger to AsciiDoc or Markdown converter

    The primary goal of this project is to simplify the generation of up-to-date RESTful API documentation by combining documentation that’s been hand-written with auto-generated API documentation produced by Swagger. The result is intended to be an up-to-date, easy-to-read, on- and offline user guide, comparable to GitHub’s API documentation. The output of Swagger2Markup can be used as an alternative to swagger-UI and can be served as static content. Swagger2Markup converts a Swagger JSON or...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 4
    PdfJumbler
    A simple tool to rearrange/merge/delete pages from PDF files. The modular backend system uses either JPedal or JPod to display PDFs and iText or Apache PDFBox to save them. !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Development of this project has moved to GitHub. Please check https://github.com/mgropp/pdfjumbler for current releases! !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    Downloads: 9 This Week
    Last Update:
    See Project
  • 99.99% Uptime for MySQL and PostgreSQL Databases Icon
    99.99% Uptime for MySQL and PostgreSQL Databases

    Sub-second maintenance. 2x read/write performance. Built-in vector search for AI apps.

    Cloud SQL Enterprise Plus delivers near-zero downtime with 35 days of point-in-time recovery. Supports MySQL, PostgreSQL, and SQL Server.
    Try Free
  • 5
    OpenEXI

    OpenEXI

    EXI implementations in Java and C#

    Open source .Net (C#) / Java implementation of the W3C Efficient XML Interchange (EXI) format specification. As a corollary to XML, EXI is an alternative, very efficient format that has all of the mechanics of XML, but is much more compact and is faster to exchange. - README (about Nagasena EXI implemenation) https://www.dropbox.com/s/adh83u9z1x1czv6/README.txt?dl=0 - Nagasena EXI grammar interchange format (PDF) https://www.dropbox.com/s/etrpuchaddplq2s/EXIGram.pdf?dl=0 -...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 6

    eXtensible Text Framework (XTF)

    Framework for search and display of heterogenous document collections.

    NOTICE: This code repository is deprecated. Please visit https://github.com/cdlib/xtf for the latest updates. Obsolete Description: The eXtensible Text Framework (XTF) is an architecture that supports searching across collections of heterogeneous textual data (XML, PDF, HTML, text, and more), and the presentation of results and documents in a highly configurable manner. Includes highly customized versions of the proven open-source components Lucene and Saxon.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 7
    S1000D Transformation Toolkit
    The S1000D Transformation Toolkit provides a reference implementation supporting the transformation, packaging and viewing of S1000D data into a SCORM 2004 3rd Edition Content Package, Mobile Web Application and PDF.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    OpenSearchServer Extractor

    OpenSearchServer Extractor

    A RESTFul/JSON Web Service for text and metata extraction

    An open source RESTFul Web Service for text , meta-data extraction and analysis. oss-text-extractor supports various binary formats: Word processor (doc, docx, odt, rtf) Spreadsheet (xls, xlsx, ods) Presentation (ppt, pptx, odp) Publishing (pdf, pub) Web (rss, html/xhtml) Medias (audio, images) Others (vsd, text)
    Downloads: 1 This Week
    Last Update:
    See Project
  • 9
    The goal of this project is to provide a reusable library to transform common file formats to content objects and ContentProvider plugins to common file repositories like Filesystem, CMIS and others for iQser GIN Semantic Middleware (www.iqser.com).
    Downloads: 0 This Week
    Last Update:
    See Project
  • Stop Cyber Threats with VM-Series Next-Gen Firewall on Azure Icon
    Stop Cyber Threats with VM-Series Next-Gen Firewall on Azure

    Native application identity and user-based security for your Azure cloud

    Gain integrated visibility across all traffic in a single pass. Deploy Palo Alto Networks VM-Series to determine application identity and content while automating security policy updates via rich APIs.
    Get a free trial
  • 10

    Detexter

    Detexter is an app designed to extract text from PDF files.

    Detexter lets you extract text from multiple PDF files. Detexter uses the PDFBox library for its text extraction.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    eLML - eLesson Markup Language
    eLML (eLesson Markup Language) is an XML framework for creating structured eLessons based on a pedagogical model. eLML consists of an XMLSchema and XSLT files to create XHTML, PDF, LaTeX, IMS CP and SCORM versions, standards supported by most LMS.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    Kabeja is a java library for parsing DXF and converting to SVG (dxf2svg). The library supports the SAX-api and can integrated into other applications (Cocoon,Batik). Tools for converting svg to jpeg, tiff, png and pdf are included .
    Leader badge
    Downloads: 59 This Week
    Last Update:
    See Project
  • 13
    This project provides a toolkit and framework based on PDFBox for document analysis of PDF files and performing custom conversion tasks and is published under the Apache licence. A GUI is also included, and is published using the GPL licence.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    xccdf2pdf renders XCCDF documents in PDF and other formats.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    This library provides tools to make a PDF/A preflight on a PDF document. It is highly based on apache PDFBOX. Conformance to the ISO 19005 (PDF/A) norm is checked. The goal is to pass completely the isartor test.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    Shared Questionnaire System
    Shared Questionnaire System(SQS) is a full-functional Optical Mark Reader(OMR) form processing system implemented in Java-Swing, XSL-FO and AJAX with straightforward GUIs. It is aimed at developing social platform to share knowledge about questionnaire.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    This simple java application lets you overlap several different single pdf pages into one page. Across all the applications that provide the ability of merging or spliting pdf docs, none that I have tried, have the feature of overlaping pdf pages.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    Template based Excel/PDF file generator(Java/Apache POI/iText). Does not require a report designer. Templates are created in Excel workbook. Allows to generate complex Excel documents. Simple to use and easy to learn.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    "distribution" is a message and data processing tool. It allows to process information through a graph of processors. It may be used to build mailing lists, fax gateways, email filters, PDF mailing combinators, report systems and many other processes
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    QExcel Converter SQL QT4, Import xml excel/openoffice 2003 format, sqlite3 sql text/binary, edit table and export to various Format, Pdf XSL format objects Apache fop JAVA(XSL-FO), XML/XSLT , excel, SQL text sqlite3 dump file and MYSQL SQL to XML/XSL.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    BTL is a template language that combines power of JSTL and XSLT to produce documents in XML, HTML, XHTML, XSL-FO, PDF or other formats, based on the JavaBean input.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    Visual xsltproc is a tool which help to write xslt file, and debug it to find errors. It writes xml, and generates xml (Syntax highlighting of XML & line Nr.). Finally if the result is XSL-FO it generates the pdf on Apache FOP java. Build on QT4.2.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Translator! We'll start with docx to pdf.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    Paper2bibtex is a modular information harvester implemented in Java which extracts name and author information from scientific papers (PS or PDF), adds paper meta information from public internet sites and compiles a bibliography entry for bibtex
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    XMLPRPRPR is a SchemaProf plug in that transform XML Schema Profiles in a human understandable (not only readable) style. Today there are two styles simple style and IMS like style. The output formats are HTML and PDF.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
Auth0 Logo