Showing 86 open source projects for "pdf"

  • 1
    TeXworks

    A simple interface for working with TeX documents

    TeXworks is a free and simple working environment for authoring TeX (LaTeX, ConTeXt and XeTeX) documents. Inspired by Dick Koch's award-winning TeXShop program for Mac OS X, it makes entry into the TeX world easier for those using desktop operating systems other than OS X. It provides an integrated, easy-to-use environment for users on other platforms, particularly GNU/Linux and Windows, and features a clean, simple interface accessible to casual and non-technical users.
    Downloads: 109 This Week
  • 2
    Scribus

    Powerful desktop publishing software

    ...Underneath a modern and user-friendly interface, Scribus supports professional publishing features, such as color separations, CMYK and spot colors, ICC color management, and versatile PDF creation.
    Downloads: 21,698 This Week
  • 3

    FOray

    Modular XSL-FO Implementation for Java.

    FOray is an open-source XSL-FO publishing system that is suitable for converting XML content into PDF and other document formats. Although not yet fully conformant with the XSL-FO standard, it is very useful for many applications.
    Downloads: 9 This Week
  • 4
    OmegaT - multiplatform CAT tool

    The free computer aided translation (CAT) tool for professionals

    OmegaT is a free and open source multiplatform Computer Assisted Translation tool with fuzzy matching, translation memory, keyword search, glossaries, and translation leveraging into updated projects.
    Downloads: 1,813 This Week
  • 5
    Ltxshell
    Shell for calling LaTeX and accompanying tools (DVI viewer, dvips, PS file viewer, PDF file viewer) on Linux and Windows systems
    Downloads: 1 This Week
  • 6
    TextExtractor

    Extracts plain text from a variety of different file types

    TextExtractor extracts plain text from hundreds of different file types, storing the extracted text in suitably named text files. TextExtractor 1.10 works in six different modes: Instant Mode - just select any file and extract the text from it. Batch Mode - select a group of files and extract the text from all of them in one go. Polling Mode - watch a folder location, processing new files as they appear there. Hierarchical Mode - extract text from files in a directory...
    Downloads: 13 This Week
  • 7
    ant4docbook

    ANT4DOCBOOK is an ANT task for DOCBOOK

    ANT4DOCBOOK is an ANT task for DOCBOOK, a semantic markup language for technical documentation.
    Downloads: 0 This Week
  • 8
    DocWire SDK

    Award-winning modern data processing SDK in C++20

    DocWire SDK, a standout C++20, AI-driven data processing tool, has received an award from SourceForge and strong backing from Microsoft. It handles nearly 100 file types, empowering efficient text extraction, web data extraction, and document analysis. For businesses, the shift to DocWire SDK signifies a leap forward. It promises comprehensive document format support and the ability to extract valuable insights from email boxes, databases, and websites using cutting-edge AI. DocWire SDK aims to...
    Downloads: 9 This Week
  • 9
    RTextDoc

    An editor for structured documents

    ...RTextDoc has syntax highlighting, bracket matching, folding, a document structure browser for sections and labels, bookmarks, a manager for LaTeX symbols, an editor for mathematical equations, an integrated BibTeX database manager, and several tools to convert LaTeX to HTML and back. AsciiDoc files can be converted to DocBook, HTML and PDF files.
    Downloads: 0 This Week
  • 10
    The goal of this tool is to simplify and accelerate the process of creating bookmarks for DjVu and PDF documents. You can see additional information on the project page.
    Downloads: 53 This Week
  • 11

    SimpleTextFormatter

    STF automatically generates documentation

    STF is a system for automatically generating documentation under control of a program or a script. It is frequently used to generate test reports automatically. STF is also used to clean up the output of a process and turn it into a nice-looking report.
    Downloads: 0 This Week
  • 12

    ConcatPDF

    PDF Concatenation Tool

    ConcatPDF is a tool for concatenating PDF files. It can concatenate, extract, encrypt, decrypt, and configure PDF files, and convert image files to PDF. GUI and CUI versions are both available. iText.NET is a port of iText to the .NET Framework using J#. This library allows you to generate PDF, (X)HTML, XML, and RTF files on the Microsoft .NET Framework, including ASP.NET.
    Downloads: 32 This Week
  • 13
    Web Book Downloader

    Download websites as e-book: pdf, txt, epub.

    This application allows the user to download chapters from a website in three ways: from the table of contents; from a range (first chapter address, last chapter address); or by crawling from the first chapter to chapter n. In the settings you can customize the language and the input (website) encoding; for simplicity, the output uses the same encoding. To add your language, add a new class to the strings package, and new fields to the Settings class and the GUI menu (initialize method).
    Downloads: 4 This Week
  • 14
    iText®, a JAVA PDF library

    PDF Library for Developers

    iText is an open-source PDF library available for Java and .NET (C#). iText allows you to effortlessly generate and manipulate standards-compliant PDF documents with a powerful and feature-rich SDK. With iText, you can create archivable and accessible PDFs, split and merge documents, fill and flatten forms, digitally sign documents, and more. iText add-ons enable additional functionality, such as PDF creation from HTML templates, secure redaction, OCR, and much more. ...
    Downloads: 166 This Week
  • 15
    mbFXWords

    Analyze text. Diagonal read subject, predicate, obj. Search other pdf.

    ... - Divide plain text: subject, predicate, object. - Count words: stemming. - Search for similar content: PDFs. Outputs the subject, predicate, and object of sentences in PDF and plain-text files. Provides a comfortable GUI. Automatic language detection.
    Downloads: 0 This Week
  • 16
    PDF To Text Watcher

    Profile-based watcher for automated processing of PDF files to text.

    Watches folders to automate transforming PDF files into text with optional metadata extraction. Requires the XPDF tools, which you must source separately. Lets you set up multiple profiles, modify profiles 'hot' without saving and move or delete the source PDFs after processing.
    Downloads: 0 This Week
  • 17
    FireTeX: LaTeX Editor and Compiler

    Edit your LaTeX and TeX files

    FireTeX is a complete web-based LaTeX editor that is powerful, intuitive, and stocked with useful functions for exporting the results in three useful formats. It is an editor with a LaTeX compiler, code highlighting, advanced search/replace, and the HTML5 filesystem API. Android app available on the Play Store: https://play.google.com/store/apps/details?id=com.ulmdesign.ulmtex Update 30.06.2017: Windows 7 and later and macOS 10.9 and later are supported. Browser extensions: Add-on...
    Downloads: 2 This Week
  • 18
    A Swiss Army Knife GUI application for PDF documents: combine, split, rotate, reorder (n-up, booklet), watermark, edit bookmarks/fileinfo/pagetransition, compress, encrypt, decrypt, sign, repair, edit attachments and more.
    Downloads: 124 This Week
  • 19
    PDF Clown

    General-Purpose PDF Library for Java and .NET

    PDF Clown is a general-purpose Java and .NET library for manipulating PDF files through multiple abstraction layers, rigorously adhering to the PDF 1.7 specification (ISO 32000-1). This project aims to provide universal access to PDF files (creation, reading, editing, rendering...) through an accurate and elegant object-oriented API.
    Downloads: 7 This Week
  • 20
    This project has moved to https://github.com/workinghard/GuitarTeX2 GuitarTeX2 is based on the idea of Chord. It takes a Chord file containing Chordpro directives to produce good-looking and easy-to-play song sheets for guitarists in PostScript or PDF format. GuitarTeX2 is a further development of GuitarTeX.
    Downloads: 0 This Week
  • 21
    chmProcessor: Word/HTML to CHM converter

    MS Word / HTML to CHM / Web Help converter

    A tool to generate compiled help files (CHM) and Java Help files from MS Word or HTML files. It splits the document into separate topic pages by its "title" sections. It can also generate a website, a PDF, and an XPS file with the help content.
    Downloads: 12 This Week
  • 22

    Detexter

    Detexter is an app designed to extract text from PDF files.

    Detexter lets you extract text from multiple PDF files. Detexter uses the PDFBox library for its text extraction.
    Downloads: 0 This Week
  • 23
    ...It reads a high-level description of a document similar in style to LaTeX and produces a PostScript file which can be printed on most laser printers. Plain text and PDF output are also available.
    Downloads: 0 This Week
  • 24
    jPod is a rich PDF manipulation and rendering framework. A complete rendering library based on jPod is available here at "jPodRenderer". To see jPod & jPodRenderer at work, have a look at www.cabaret-solutions.com
    Downloads: 1 This Week
  • 25
    FreePdf

    Permanently rotate or merge PDFs

    ...Rotate or merge multiple PDFs in bulk, not one by one. Allows specific page numbers to be manipulated and allows you to choose the output path. v1.3 allows the deletion of pages from a PDF prior to rotating or merging by double-clicking on a file.
    Downloads: 0 This Week