Showing 129 open source projects for "pdf to"

  • 1
    OCRmyPDF

    OCRmyPDF adds an OCR text layer to scanned PDF files

    OCRmyPDF adds an optical character recognition (OCR) text layer to scanned PDF files, allowing them to be searched. PDF is the best format for storing and exchanging scanned documents. Unfortunately, PDFs can be difficult to modify. OCRmyPDF makes it easy to apply image processing and OCR (recognized, searchable text) to existing PDFs.
    Downloads: 109 This Week
  • 2
    MinerU

    A high-quality tool for converting PDF to Markdown and JSON

    MinerU is an open-source, high-quality document extraction toolkit focused on converting PDFs (and other document formats) into structured Markdown and JSON. It leverages OCR and layout analysis to preserve semantic structure and metadata, ideal for research and data science workflows.
    Downloads: 28 This Week
  • 3
    AI PDF Chatbot LangChain

    AI PDF chatbot agent built with LangChain & LangGraph

    AI PDF Chatbot LangChain is a full-stack template for building conversational agents that can ingest and answer questions about PDF documents. The project demonstrates how to combine LangChain and LangGraph with a vector database to enable retrieval-augmented question answering over user-provided files. It includes both frontend and backend components, making it suitable as a production starting point rather than just a minimal demo.
    Downloads: 0 This Week
  • 4
    GROBID

    A machine learning software for extracting information

    ...The extraction here covers the usual bibliographical information (e.g. title, abstract, authors, affiliations, keywords, etc.). Reference extraction and parsing from articles in PDF format reaches an F1-score of around 0.87 against an independent PubMed Central set of 1,943 PDFs containing 90,125 references, and around 0.89 on a similar bioRxiv set of 2,000 PDFs (using the Deep Learning citation model). All the usual publication metadata are covered (including DOI, PMID, etc.).
    Downloads: 6 This Week
  • 5
    Tesseract OCR

    Open Source OCR Engine

    ...Tesseract can recognize over 100 languages out of the box, and can be trained to recognize other languages. It supports various output formats, including plain text, HTML, PDF and more. It also has Unicode (UTF-8) support.
    Downloads: 10,111 This Week
  • 6
    Umi-OCR

    OCR software, free and offline

    ...It includes a highly efficient offline OCR engine with built-in multilingual recognition libraries, so users can extract text across multiple languages with high accuracy directly on their machines. The software supports flexible usage patterns including screenshot capture OCR, batch processing of large sets of images or documents, PDF parsing, QR code detection, and layout-aware paragraph output. Users can interact with Umi-OCR through a graphical interface, command-line options, or HTTP interfaces, making it adaptable to both casual desktop usage and programmatic automation. Because the project is open source, developers can inspect, modify, and extend its capabilities, and plugins allow for different recognition engines or enhanced features.
    Downloads: 74 This Week
  • 7
    Local-NotebookLM

    Google's NotebookLM, but local

    Local-NotebookLM is a local AI tool for turning PDF documents into generated audio content. It works like a self-hosted alternative to NotebookLM-style document-to-audio workflows. The system extracts and processes PDF text, sends the content through an LLM, and converts the result into speech with configurable voices. Users can generate podcasts, summaries, interviews, lectures, debates, tutorials, news reports, executive briefs, and other formats.
    Downloads: 7 This Week
  • 8
    MarkPDFDown

    A high-quality PDF to Markdown tool based on a large language model

    MarkPDFDown is an open-source document processing tool designed to convert PDF files into structured Markdown output that can be easily used for documentation, content pipelines, and AI processing workflows. The project focuses on extracting text, formatting, and structural information from complex PDF documents and transforming that information into clean Markdown that preserves the original hierarchy of headings, paragraphs, tables, and lists.
    Downloads: 2 This Week
  • 9
    Unlimited OCR Works

    Welcome the Era of One-shot Long-horizon Parsing

    ...It is designed to push OCR beyond short, isolated image recognition and into longer document understanding workflows. The project supports single-image parsing as well as multi-page and PDF-style parsing by converting pages into images. It provides inference paths for Hugging Face Transformers, vLLM, and SGLang, which gives users several deployment options. The repository also includes example code for batch inference over image folders or PDF inputs. Overall, it is useful for researchers and developers who need advanced OCR, long-document parsing, and model-based extraction from complex visual documents.
    Downloads: 4 This Week
  • 10
    Ollama RAG Chatbot

    Chat with multiple PDFs locally

    ...The main value of the project is its ability to process multiple PDF inputs and turn them into a question-answering workflow centered on document retrieval. With Docker support, script-based setup, optional ngrok exposure, and a clear local run path, it serves as a compact starter project for people who want a hands-on, self-hosted PDF chat system.
    Downloads: 0 This Week
  • 11
    Scribe.js

    JavaScript OCR and text extraction for images and PDFs

    Scribe.js is a JavaScript library that provides Optical Character Recognition (OCR) and text extraction capabilities for both images and PDF documents, aimed at developers who want to build OCR features directly into their applications. The library can take image files (such as PNG or JPEG) and recognize the text they contain, and it can also extract text from PDF files that either already contain text or are image-based scans, using modern web standards and WebAssembly under the hood. ...
    Downloads: 0 This Week
  • 12
    Paperless-ngx

    A community-supported supercharged version of paperless

    Paperless-ngx is a community-supported open-source document management system that transforms your physical documents into a searchable online archive so you can keep, well, less paper.
    Downloads: 14 This Week
  • 13
    Docling

    Get your documents ready for gen AI

    ...The project focuses on converting and parsing many document formats into a unified structured representation that downstream systems can easily consume. It supports advanced PDF understanding, including layout detection, table extraction, and reading order analysis, enabling high-fidelity document intelligence pipelines. Docling is designed to run efficiently on commodity hardware and can be used both as a Python API and a command-line tool. Its modular architecture allows developers to extend functionality and integrate specialized models for tasks such as OCR and audio transcription. ...
    Downloads: 4 This Week
  • 14
    canvas-editor

    Canvas-based WYSIWYG rich text editor with advanced layout tools

    ...Its architecture is modular, allowing developers to extend functionality through plugins, custom commands, and event hooks. It includes support for page-based layouts with headers, footers, pagination, and print-ready output, including PDF generation. It also provides interactive components such as form controls and context menus, making it suitable for building complex document editing systems.
    Downloads: 8 This Week
  • 15
    MetaScreener

    AI-powered tool for efficient abstract and PDF screening

    ...Instead of manually reviewing hundreds or thousands of documents, researchers can use MetaScreener to apply machine learning techniques that assist with classification and prioritization of candidate papers. The platform can analyze both abstracts and full PDF documents, enabling automated filtering based on research criteria defined by the user. By incorporating natural language processing techniques, the system can identify potentially relevant studies and reduce the workload associated with manual screening.
    Downloads: 0 This Week
  • 16
    Open CoDesign

    Open-source Claude Design alternative

    Open CoDesign is an open-source, desktop AI design tool that transforms natural language prompts into fully structured design artifacts such as prototypes, slide decks, and marketing assets. It is designed as a local-first alternative to cloud-based design tools, allowing users to run everything on their own machine while bringing their own AI model and API keys. The system supports multiple model providers and integrates directly with existing developer tools, enabling seamless workflows...
    Downloads: 127 This Week
  • 17
    Desktop Commander MCP

    AI-powered MCP server for desktop file and terminal automation

    ...It allows users to run terminal commands with streaming output, manage long-running processes, and even execute code in memory without saving files. It also supports working with structured and document formats such as Excel, PDF, and DOCX, enabling AI to read, modify, and generate these files directly.
    Downloads: 6 This Week
  • 18
    Semantra

    Multi-tool for semantic search

    Semantra is an open-source semantic search tool designed to help users explore large collections of documents by meaning rather than simple keyword matching. The software analyzes text and PDF documents stored locally and creates embeddings that allow queries to retrieve results based on conceptual similarity. It is primarily intended for individuals who need to extract insights from large document collections, including researchers, journalists, students, and historians. The system runs from the command line and automatically launches a local web interface where users can perform interactive searches and examine document passages related to a query. ...
    Downloads: 3 This Week
  • 19
    Easy DataSet

    A powerful tool for creating datasets for LLM fine-tuning

    Easy DataSet is a comprehensive open-source tool designed to make creating high-quality datasets for large language model fine-tuning, retrieval-augmented generation (RAG), and evaluation as easy and automated as possible by providing intuitive interfaces and powerful parsing, segmentation, and labeling tools. It supports ingesting domain-specific documents in a wide range of formats — including PDF, Markdown, DOCX, EPUB, and plain text — and can intelligently segment, clean, and structure content into rich datasets tailored for downstream LLM training needs. The system includes automated question-generation capabilities, hierarchical label trees, and answer generation pipelines that use LLM APIs to produce coherent paired data with customizable templates. ...
    Downloads: 3 This Week
  • 20
    ChatGPT Academic

    ChatGPT extension for scientific research work

    A ChatGPT extension for scientific research work, with a polishing experience optimized for academic papers. It supports custom shortcut buttons, custom function plug-ins, Markdown table display, dual display of TeX formulas, and complete code display. Newer features include project-tree analysis for local Python/C++/Go projects, self-translation of the project's own source code, batch summarization of PDF and Word documents, and full-text translation of PDF papers. All buttons are generated dynamically by reading functional.py, so custom functions can be added at will, freeing up the clipboard. Markdown tables output by GPT are supported. If the output contains a formula, it is shown in both TeX form and rendered form, which is convenient for copying and reading.
    Downloads: 4 This Week
  • 21
    Hiring Agent

    AI agent to evaluate and score resumes

    Hiring Agent is an AI-powered resume evaluation pipeline for screening technical candidates. It reads a resume PDF and converts the content into Markdown-like text. It then uses a local or hosted language model to extract structured candidate information into sectioned JSON. The system can enrich that resume data with GitHub profile and repository signals when a profile is available. After the data is collected, it produces an explainable evaluation with category scores, supporting evidence, bonus points, and deductions. ...
    Downloads: 0 This Week
  • 22
    Magic Resume

    Free online AI resume editor

    ...It supports customizable themes and layouts, enabling users to tailor the design to different industries or personal branding preferences. Magic Resume also includes export functionality for generating polished PDF documents directly in the browser, making it practical for job applications.
    Downloads: 0 This Week
  • 23
    Papermerge

    Open Source Document Management System for Digital Archives

    ...Each user can be assigned different permissions to perform only a specific kind of action, e.g. view only documents from a specific folder. OCR technology is a vital part of Papermerge: it extracts text from scanned documents and from PDF, JPEG, and TIFF files.
    Downloads: 5 This Week
  • 24
    AnythingLLM

    The all-in-one Desktop & Docker AI application with full RAG and AI

    A full-stack application that lets you turn any document, resource, or piece of content into context that any LLM can use as a reference while chatting. It lets you choose which LLM or vector database to use, and it supports multi-user management and permissions. AnythingLLM is a full-stack application where you can use commercial off-the-shelf LLMs or popular open-source LLMs and vectorDB solutions to build a private ChatGPT with no...
    Downloads: 68 This Week
  • 25
    NAPS2 - Not Another PDF Scanner

    Scan documents to PDF and other file types, as simply as possible.

    ...NAPS2 is a document scanning application with a focus on simplicity and ease of use. Scan your documents from WIA- and TWAIN-compatible scanners, organize the pages as you like, and save them as PDF, TIFF, JPEG, PNG, and other file formats. Available on Windows, Mac, and Linux. NAPS2 is currently available in over 40 different languages. Want to see NAPS2 in your preferred language? Help translate! See the wiki for more details.
    Downloads: 712 This Week