114 projects for "pdf data mining" with 2 filters applied:

  • 1
    PDFCraft

    PDFCraft is a free, privacy-focused PDF toolkit

    PDFCraft is an extensible toolkit for creating, editing, and transforming PDF documents with both a graphical interface and a scripting API, making it useful for users ranging from casual editors to automated document processors. At its core, the project provides a clean, modern UI where you can rearrange pages, annotate text, insert images, fill forms, and export to multiple formats, all without needing a heavyweight commercial PDF suite.
    Downloads: 45 This Week
  • 2
    TeXworks

    A simple interface for working with TeX documents

    TeXworks is a free and simple working environment for authoring TeX (LaTeX, ConTeXt and XeTeX) documents. Inspired by Dick Koch's award-winning TeXShop program for Mac OS X, it makes entry into the TeX world easier for those using desktop operating systems other than OS X. It provides an integrated, easy-to-use environment for users on other platforms, particularly GNU/Linux and Windows, and features a clean, simple interface accessible to casual and non-technical users.
    Downloads: 77 This Week
  • 3
    Laravel Invoices

    Laravel package to generate PDF invoices from customizable parameters

    Laravel Invoices is a Laravel package for generating invoice PDF files from customizable data. It gives developers a simple interface for creating invoices that can be stored, downloaded, or streamed through configured filesystems. The package supports different templates and locales, making it useful for applications that serve customers in multiple regions. It is designed for business systems, SaaS products, admin panels, and client billing workflows that need invoice output without building the full PDF logic from scratch. ...
    Downloads: 1 This Week
  • 4
    book-to-skill

    Turn any technical book PDF into a Claude Code skill

    book-to-skill is a Claude Code skill that turns technical books and documents into reusable AI reference skills. It extracts content from PDFs and EPUBs, then organizes the material so an assistant can study, reference, and apply it while working. The project is useful for transforming dense manuals, textbooks, internal documentation, or technical guides into practical agent-accessible knowledge. It includes an extraction script and a SKILL.md workflow that guides how the resulting content...
    Downloads: 2 This Week
  • 5
    tinypdf

    Minimal PDF creation library

    tinypdf is a minimal, zero-dependency PDF generation library that focuses on the core “put content on a page” use case while intentionally skipping heavyweight features. It is designed to be extremely small and approachable, making it a good fit when you want to generate real PDFs in Node/TypeScript without pulling in a large toolkit. The library supports essential primitives like writing text, drawing basic shapes, and placing JPEG images, which covers common needs such as invoices,...
    Downloads: 0 This Week
  • 6
    Unredact

    A simple tool for reading in poorly redacted documents

    Unredact is a specialized tool that attempts to reconstruct redacted or obscured text in images, PDFs, or screenshots using a combination of image processing and generative AI inference to suggest plausible completions of blurred, black-boxed, or jumbled content. Unlike traditional optical character recognition (OCR), which only reads visible text, Unredact focuses on inferring missing content where redaction has been applied by analyzing surrounding context, font characteristics, and...
    Downloads: 31 This Week
  • 7
    PDF Split and Merge

    Split and merge PDF files on any platform

    Split and merge PDF files with PDFsam, an easy-to-use desktop tool with graphical, command-line, and web interfaces.
    Leader badge
    Downloads: 295 This Week
  • 8
    ProM is the comprehensive, extensible framework for process mining. Process Mining deals with the a-posteriori analysis of (business) processes using enactment logs.
    Leader badge
    Downloads: 54 This Week
  • 9
    Ada PDF Writer

    A standalone, portable package for producing PDF documents dynamically

    PDF_Out is an Ada package for easily writing PDF files dynamically. It enables the automatic production of reports. Standalone and unconditionally portable code. No external resource is needed. More information on... http://apdf.sf.net Alire crate: https://alire.ada.dev/crates/apdf Mirror: https://github.com/zertovitch/ada-pdf-writer
    Leader badge
    Downloads: 22 This Week
  • 10
    QPDF

    PDF transformation/manipulation program + library

    QPDF is a C++ library and set of programs that inspect and manipulate the structure of PDF files. It can encrypt and linearize files, expose the internals of a PDF file, and do many other operations useful to end users and PDF developers.
    Leader badge
    Downloads: 990 This Week
  • 11
    pdfcrack is a command-line password recovery tool for PDF files.
    Leader badge
    Downloads: 474 This Week
  • 12

    toPDF

    Online service for PDF conversion (to PDF)

    A simple online service for PDF conversion. This project is a simple library and also a web application. It offers a REST service and a simple upload service for synchronous conversion. This library/application doesn't contain conversion libraries because it's a wrapper for existing tools. toPDF currently supports the open source tool PDF Creator (http://www.pdfforge.org) and the commercial solution, easy PDF, from BCL (http://www.pdfonline.com/easypdf/sdk/).
    Downloads: 1 This Week
  • 13
    WebHarvest - web data extraction tool
    Web data extraction (web data mining, web scraping) tool. It leverages well-proven XML and text processing technologies to easily extract useful data from arbitrary web pages.
    Downloads: 3 This Week
  • 14
    TeXstudio - A LaTeX Editor

    An integrated writing environment for creating LaTeX documents

    NOTE: Active development has moved to https://github.com/texstudio-org/texstudio Please post issues and feature requests there. TeXstudio is a fully featured LaTeX editor. Our goal is to make writing LaTeX documents as easy and comfortable as possible. Some of the outstanding features of TeXstudio are an integrated pdf viewer with (almost) word-level synchronization, live inline preview, advanced syntax-highlighting, live checking of references, citations, latex commands, spelling and...
    Leader badge
    Downloads: 345 This Week
  • 15
    Chordii

    Easy lead sheets from text input

    ChordPro creates elegant, staffless lead sheets for musicians needing only chords and lyrics. It processes plain text input in ChordPro format and is a rewrite of the old, though still popular, Chord/Chordii programs.
    Leader badge
    Downloads: 62 This Week
  • 16
    jPicEdt

    Another drawing editor for LaTeX with PSTricks & TikZ

    jPicEdt is an extensible, internationalized, vector-based drawing editor for LaTeX and related packages (TikZ, PSTricks,...), written in Java. It is also a library of reusable high-level graphic primitives.
    Downloads: 8 This Week
  • 17
    Pdf4Tcl is a library for generating PDF documents from Tcl.
    Leader badge
    Downloads: 3 This Week
  • 18

    queXML

    XML Schema for questionnaires and PDF questionnaire generator

    queXML is a simple XML schema for designing questionnaires. Included are stylesheets to administer the questionnaire in PDF (paper), CASES and LimeSurvey. queXML is compatible with the DDI standard.
    Downloads: 0 This Week
  • 19
    RTextDoc

    An editor for structured documents

    RTextDoc is an editor for structured text documents such as LaTeX, AsciiDoc, and DocBook. RTextDoc has proofreading capabilities: on-the-fly spelling, instant grammar checking and built-in free dictionaries. RTextDoc has syntax highlighting, bracket matching, folding, a document structure browser for sections and labels, bookmarks, a manager for LaTeX symbols, an editor for mathematical equations, an integrated BibTeX database manager and several tools to convert LaTeX to HTML and back. AsciiDoc...
    Downloads: 0 This Week
  • 20
    PDF Chain

    a graphical user interface for the PDF Toolkit (PDFtk)

    PDF Chain is a graphical user interface for the PDF Toolkit (PDFtk). The GUI supports all common features of the command line tool in a comfortable way.
    Leader badge
    Downloads: 37 This Week
  • 21
    Laravel Report Generators

    Rapidly generate simple PDF, CSV, and Excel reports on Laravel

    Rapidly generate simple PDF, CSV, and Excel reports on Laravel. This package provides simple PDF, CSV, and Excel report generators to speed up your workflow. It also allows you to stream(), download(), or store() the report seamlessly.
    Downloads: 5 This Week
  • 22

    PoDoFo

    A PDF parsing, modification and creation library.

    The PoDoFo library is a free, portable C++ library. It can parse and modify existing PDF files and create new ones from scratch. It also includes several tools to work with PDF files. It features a unique approach that provides access to PDF documents via an object tree, so PDFs can be created and/or manipulated using a simple tree structure. Development of PoDoFo has been moved to GitHub: https://github.com/podofo/podofo Please raise new issues in the GitHub project.
    Leader badge
    Downloads: 126 This Week
  • 23
    Free editor for PDF documents. PDFedit allows complete editing of PDF documents. You can change raw PDF objects (for advanced users) or use the many GUI functions. Functionality can be easily extended using a scripting language (ECMAScript).
    Leader badge
    Downloads: 124 This Week
  • 24
    LaTeXDraw

    Vector drawing program for LaTeX using PSTricks

    LaTeXDraw is a graphical drawing editor for LaTeX. LaTeXDraw can be used to 1) generate PSTricks code; 2) directly create PDF or PS pictures.
    Leader badge
    Downloads: 64 This Week
  • 25
    PDF Guru

    Merge images and PDFs to a single PDF

    PDF Guru is a simple-to-use program for merging multiple images and PDF files into a single compact PDF file. It can select specific PDF pages or ranges of pages, which gives you more control over the output file. It can produce compact, smaller files on any operating system. Its features make it a great, must-have tool for everyone.
    Leader badge
    Downloads: 9 This Week