Showing 7 open source projects for "extraction"

  • 1
    MinerU

    A high-quality tool for converting PDFs to Markdown and JSON

    MinerU is an open-source, high-quality document extraction toolkit focused on converting PDFs (and other document formats) into structured Markdown and JSON. It uses OCR and layout analysis to preserve semantic structure and metadata, which makes it well suited to research and data science workflows.
    Downloads: 24 This Week
  • 2
    PDFPatcher

    A versatile toolkit for PDF manipulation

    PDFPatcher (aka “PDF补丁丁”) is a versatile toolkit for PDF manipulation: editing document metadata, bookmarks, page layout, content restrictions, rotation, compression, merging/splitting, image extraction, and more, all within an intuitive interface. It can merge or split PDFs or images, preserve or add bookmarks, and set page dimensions. It supports batch style/color/target changes, regex/XPath search and replace, and mid-page positioning. It can also modify PDF metadata, page numbers, links, and the initial view mode, and remove open actions.
    Downloads: 29 This Week
  • 3
    PDF Reader for Windows 7

    Free PDF reader for Windows 7

    ...Perfectly optimized for the Windows 7 environment, it offers rapid performance and compatibility with various PDF files, ensuring that even large documents load quickly without lag. In addition to its clean design, PDF Reader for Windows 7 includes essential features like zoom, rotation, and text extraction, giving users all the tools they need to interact with PDFs comfortably. Whether you're reviewing documents for work, school, or leisure, this reader is a must-have for any Windows 7 user.
    Downloads: 91 This Week
  • 4
    PDF of Death
    PDF of Death is a PDF editor and more. Features: Images → PDF (multiple images to a single PDF); PDF → Images (PNG/JPG extraction); Word ↔ PDF (DOCX conversion both ways); TXT → PDF; Markdown → PDF; merge multiple PDFs into one; split a PDF (every page into a separate file); split a PDF by page range; password-protect PDFs; compress PDFs; and much more. Sorry, Windows only; Mac and Linux were driving me insane. *Note: When loading PDFs, underlined text will not be displayed.
    Downloads: 4 This Week
  • 5
    PDFLayoutTextStripper

    Converts a PDF file into a text file while keeping the layout

    Converts a PDF file into a text file while keeping the layout of the original PDF. Useful for extracting the content of a table or a form in a PDF file. PDFLayoutTextStripper is a subclass of the PDFTextStripper class (from the Apache PDFBox library).
    Downloads: 0 This Week
  • 6

    PDFtk Bookmarks Editor

    GUI for updating PDF bookmarks using PDF Toolkit (PDFtk) on Windows

    Free and open source GUI application for updating bookmarks in a PDF document using the PDF Toolkit command-line tool, PDFtk Server. The user selects the PDF via drag and drop, then edits the bookmark entries in a text file using a simple one-line data format. The program handles everything else in response to a few button clicks. OS: Windows. Author: David King. License: GPLv3.
    Downloads: 26 This Week
  • 7

    Dualword-PMC

    PMC browser

    PubMed Central browser. Source code: http://github.com/dualword/dualword-pmc/
    Downloads: 0 This Week