47 projects for "open pdf" with 2 filters applied:

  • 1
    Open Semantic Search

    Open source semantic search and text analytics for large document sets

    Open Semantic Search includes an ETL framework that can ingest documents, process them through analysis steps, and enrich the data with extracted information such as named entities and metadata. It also supports optical character recognition to extract text from images and scanned documents, including images embedded inside PDF files.
    Downloads: 1 This Week
  • 2
    AI PDF Chatbot LangChain

    AI PDF chatbot agent built with LangChain & LangGraph

    AI PDF Chatbot LangChain is a full-stack template for building conversational agents that can ingest and answer questions about PDF documents. The project demonstrates how to combine LangChain and LangGraph with a vector database to enable retrieval-augmented question answering over user-provided files. It includes both frontend and backend components, making it suitable as a production starting point rather than just a minimal demo. The system parses uploaded PDFs into document chunks,...
    Downloads: 0 This Week
  • 3
    MarkPDFDown

    A high-quality PDF to Markdown tool based on large language models

    MarkPDFDown is an open-source document processing tool designed to convert PDF files into structured Markdown output that can be easily used for documentation, content pipelines, and AI processing workflows. The project focuses on extracting text, formatting, and structural information from complex PDF documents and transforming that information into clean Markdown that preserves the original hierarchy of headings, paragraphs, tables, and lists.
    Downloads: 1 This Week
  • 4
    Scribe.js

    JavaScript OCR and text extraction for images and PDFs

    Scribe.js is a JavaScript library that provides Optical Character Recognition (OCR) and text extraction capabilities for both images and PDF documents, aimed at developers who want to build OCR features directly into their applications. The library can take image files (such as PNG or JPEG) and recognize the text they contain, and it can also extract text from PDF files that either already contain text or are image-based scans, using modern web standards and WebAssembly under the hood. In...
    Downloads: 2 This Week
  • 5
    Magic Resume

    Free online AI resume editor

    ...It supports customizable themes and layouts, enabling users to tailor the design to different industries or personal branding preferences. Magic Resume also includes export functionality for generating polished PDF documents directly in the browser, making it practical for job applications.
    Downloads: 2 This Week
  • 6
    Desktop Commander MCP

    AI-powered MCP server for desktop file and terminal automation

    Desktop Commander MCP is an advanced Model Context Protocol server designed to extend AI assistants with direct control over a user’s local machine, including the file system and terminal. It integrates with clients like Claude Desktop to enable AI-driven workflows such as editing files, executing commands, and automating development tasks from a single conversational interface. Desktop Commander MCP builds on top of an MCP filesystem server and enhances it with powerful search, replace, and...
    Downloads: 5 This Week
  • 7
    Semantra

    Multi-tool for semantic search

    Semantra is an open-source semantic search tool designed to help users explore large collections of documents by meaning rather than simple keyword matching. The software analyzes text and PDF documents stored locally and creates embeddings that allow queries to retrieve results based on conceptual similarity. It is primarily intended for individuals who need to extract insights from large document collections, including researchers, journalists, students, and historians. ...
    Downloads: 3 This Week
  • 8
    canvas-editor

    Canvas-based WYSIWYG rich text editor with advanced layout tools

    canvas-editor is a browser-based rich text editor that renders content using HTML5 Canvas and SVG instead of traditional DOM-based approaches. It is designed to provide a WYSIWYG editing experience similar to word processors, enabling precise control over layout, rendering, and document structure. canvas-editor supports a wide range of formatting and document features, including text styling, tables, images, and embedded elements, all managed through a structured data model. Its architecture...
    Downloads: 3 This Week
  • 9
    DeepSeek-OCR 2

    Visual Causal Flow

    DeepSeek-OCR-2 is the second-generation optical character recognition system developed to improve document understanding by introducing a “visual causal flow” mechanism, enabling the encoder to reorder visual tokens in a way that better reflects semantic structure rather than strict raster scan order. It is designed to handle complex layouts and noisy documents by giving the model causal reasoning capabilities that mimic human visual scanning behavior, enhancing OCR performance on documents...
    Downloads: 8 This Week
  • 10
    Resume-Matcher

    Improve your resumes with Resume Matcher

    Resume-Matcher is a command-line application that compares resumes against job descriptions using natural language processing. It provides a compatibility score based on keyword relevance and highlights areas where the resume aligns—or doesn't—with the target role. Designed for job seekers and HR professionals, it helps improve resume tailoring and streamlines candidate screening.
    Downloads: 1 This Week
  • 11
    JimuReport

    Open source drag-and-drop reporting and dashboard builder platform

    JimuReport is an open source data visualization and reporting platform designed to help developers and organizations build reports, dashboards, and large screen data displays through a visual interface. It provides an online report designer that uses an Excel-like editing experience, allowing users to construct reports with drag-and-drop components and cell-based layouts. It focuses on simplifying complex report development by enabling visual configuration instead of manual coding....
    Downloads: 4 This Week
  • 12
    LandPPT

    An LLM-based presentation generation platform

    LandPPT is an open-source AI platform that automatically generates professional presentation slides using large language models. The system allows users to create complete PowerPoint presentations simply by entering a topic or uploading source documents such as PDFs, Word files, or Markdown notes. Using natural language processing and structured content generation, the platform produces presentation outlines and converts them into fully formatted slide decks. The application integrates...
    Downloads: 2 This Week
  • 13
    text-extract-api

    Document (PDF, Word, PPTX ...) extraction and parse API

    text-extract-api is an open-source service designed to extract readable text from a wide variety of document formats through a simple API interface. The project focuses on converting complex files such as PDFs, images, scanned documents, and office files into structured plain text that can be processed by downstream applications or language models. Instead of requiring developers to integrate multiple document parsing libraries individually, the system centralizes text extraction...
    Downloads: 3 This Week
  • 14
    DeepSeek Prover V2

    Advancing Formal Mathematical Reasoning via Reinforcement Learning

    DeepSeek-Prover-V2 is DeepSeek’s specialized model for formal theorem proving, particularly targeting proof in Lean 4. The repository describes how they use recursive proof decomposition by prompting DeepSeek-V3 to break complex theorems into subgoals, synthesize proof sketches, and then combine them to bootstrap training data. They then fine-tune via reinforcement learning with binary correct/incorrect feedback to integrate informal reasoning with formal proof behavior. The repo releases...
    Downloads: 3 This Week
  • 15
    PowerPoint-ist

    Web presentation editor replicating many PowerPoint features online

    PPTist is a web-based presentation editing application designed to replicate many of the commonly used features found in traditional slide presentation software. It allows users to create, edit, and present slide decks directly within a web browser while maintaining a desktop-like editing experience. PPTist is built with Vue 3 and TypeScript and focuses on providing a highly interactive slide editing environment with extensive customization and extension potential. PPTist supports a wide...
    Downloads: 2 This Week
  • 16
    Anything to NotebookLM

    Multi-source content processor for NotebookLM

    Qiaomu Anything to NotebookLM is a Claude Code skill that turns many types of source material into structured NotebookLM-ready outputs. It is built for users who want to convert articles, web pages, videos, PDFs, office files, podcasts, images, and search results into more usable study or presentation formats. The project uses natural-language commands, so the user can ask for a podcast, slide deck, mind map, report, quiz, flashcards, or infographic without manually building the workflow. It...
    Downloads: 0 This Week
  • 17
    Extractous

    Fast and efficient unstructured data extraction

    Extractous is a Rust-based unstructured data extraction library focused on fast local parsing of documents and other content-heavy files. Its purpose is to extract text and metadata efficiently from formats such as PDF, Word, HTML, email archives, images, and more, without depending on external APIs or separate parsing servers. The project emphasizes performance and low memory usage, and its maintainers describe it as a local-first alternative to heavier extraction stacks. For broader format...
    Downloads: 0 This Week
  • 18
    myGPTReader

    AI Slack bot for reading, summarizing, and chatting with content

    myGPTReader is an AI-powered Slack bot designed to help users read, summarize, and interact with various types of digital content through conversational interfaces. It enables users to quickly understand web pages, documents, and even video content by transforming them into interactive discussions rather than static reading experiences. myGPTReader supports a wide range of file formats, including eBooks, PDFs, and text-based documents, making it flexible for both casual and professional use...
    Downloads: 0 This Week
  • 19
    ArXiv MCP Server

    A Model Context Protocol server for searching and analyzing arXiv

    arxiv-mcp-server bridges AI assistants and the arXiv repository through a clean MCP interface, enabling search, metadata retrieval, and content access without bespoke scraping. With simple tools like “search” and “fetch,” an agent can find papers, pull abstracts, and download PDFs for downstream summarization or analysis. The project includes packaging and CI to publish to PyPI, plus tests and linting for reliability. Issue threads show feature requests such as extracting embedded LaTeX and...
    Downloads: 0 This Week
  • 20
    Controllable-RAG-Agent

    This repository provides an advanced RAG

    Controllable-RAG-Agent is an advanced Retrieval-Augmented Generation (RAG) system designed specifically for complex, multi-step question answering over your own documents. Instead of relying solely on simple semantic search, it builds a deterministic control graph that acts as the “brain” of the agent, orchestrating planning, retrieval, reasoning, and verification across many steps. The pipeline ingests PDFs, splits them into chapters, cleans and preprocesses text, then constructs vector...
    Downloads: 0 This Week
  • 21
    CVPR 2025

    Collection of CVPR 2025 papers and open source projects

    CVPR 2025 curates accepted CVPR 2025 papers and pairs them with their corresponding code implementations when available, giving researchers and practitioners a fast way to move from reading to reproducing. It organizes entries by topic areas such as detection, segmentation, generative models, 3D vision, multi-modal learning, and efficiency, so you can navigate the year’s output efficiently. Each paper entry typically includes a title, author list, and links to the paper PDF and official or...
    Downloads: 0 This Week
  • 22
    A GUI to ease the process of producing a multipage PDF from a scan. gscan2pdf should work on almost any Linux/BSD machine.
    Downloads: 102 This Week
  • 23
    chessPDFBrowser

    Chess application which allows working with chess PDF books and PGNs.

    Chess application which allows working with PDFs and PGNs. You can work with the chess games in a PDF and edit their variation trees. Graphical environment. Standard PGN tags. PGN comments. OCR-like FEN string detection from chess board position images. Connection to UCI chess engines (such as Stockfish). Position analysis, full game analysis. You can now play games against UCI engines. pdf2pgn command-line tool included. Detailed documentation. Multilanguage...
    Downloads: 20 This Week
  • 24
    Provides optical character recognition (OCR) solutions for Vietnamese language.
    Downloads: 161 This Week
  • 25
    LangChain Extract

    Did you say you like data?

    LangChain Extract is an open-source reference application designed to demonstrate how large language models can be used to extract structured data from unstructured text and document files. The project implements a lightweight web service that allows developers to define extraction schemas and apply them to various sources such as plain text, HTML, or PDF documents.
    Downloads: 0 This Week