Search Results for "python pdf scaper" - Page 3

Showing 333 open source projects for "python pdf scaper"

View related business solutions
  • Enterprise-grade ITSM, for every business Icon
    Enterprise-grade ITSM, for every business

    Give your IT, operations, and business teams the ability to deliver exceptional services—without the complexity.

    Freshservice is an intuitive, AI-powered platform that helps IT, operations, and business teams deliver exceptional service without the usual complexity. Automate repetitive tasks, resolve issues faster, and provide seamless support across the organization. From managing incidents and assets to driving smarter decisions, Freshservice makes it easy to stay efficient and scale with confidence.
    Try it Free
  • AI-generated apps that pass security review Icon
    AI-generated apps that pass security review

    Stop waiting on engineering. Build production-ready internal tools with AI—on your company data, in your cloud.

    Retool lets you generate dashboards, admin panels, and workflows directly on your data. Type something like “Build me a revenue dashboard on my Stripe data” and get a working app with security, permissions, and compliance built in from day one. Whether on our cloud or self-hosted, create the internal software your team needs without compromising enterprise standards or control.
    Try Retool free
  • 1
    shuyuan

    shuyuan

    Reading book source

    shuyuan is a project oriented around reading and knowledge consumption, especially targeting large-scale text content such as books, articles, or educational material. The name suggests “academy” or “study hall,” and the tool aims to help users ingest, organize, and manage reading content — possibly offering features like text parsing, annotation, metadata generation, translation, or storage for later reference. The repository is set up to support document ingestion, indexing, and maybe some...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    bookdown

    bookdown

    Authoring Books and Technical Documents with R Markdown

    ...A markup language easier to learn than LaTeX, and to write elements such as section headers, lists, quotes, figures, tables, and citations. Multiple choices of output formats: PDF, LaTeX, HTML, EPUB, and Word. Possibility of including dynamic graphics and interactive applications (HTML widgets and Shiny apps) Support for languages other than R, including C/C++, Python, and SQL, etc. LaTeX equations, theorems, and proofs work for all output formats. Can be published to GitHub, bookdown.org, and any web servers. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    Perf Book

    Perf Book

    The book "Performance Analysis and Tuning on Modern CPU"

    This project is a practical guide to performance analysis and tuning on modern CPUs, bridging microarchitecture details with hands-on profiling. It explains how caches, TLBs, prefetchers, branch predictors, and out-of-order execution influence real program speed, then connects those concepts to concrete optimization strategies. Readers learn how to design trustworthy benchmarks, avoid measurement traps (warmup, turbo, frequency scaling), and interpret hardware performance counters. The book...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 4
    LLMStack

    LLMStack

    No-code multi-agent framework to build LLM Agents, workflows

    LLMStack is a no-code platform for building generative AI agents, workflows and chatbots, connecting them to your data and business processes. Build tailor-made generative AI agents, applications and chatbots that cater to your unique needs by chaining multiple LLMs. Seamlessly integrate your own data, internal tools and GPT-powered models without any coding experience using LLMStack's no-code builder. Trigger your AI chains from Slack or Discord. Deploy to the cloud or on-premise.
    Downloads: 8 This Week
    Last Update:
    See Project
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • 5
    Khoj

    Khoj

    An AI personal assistant for your digital brain

    Get more done with your open-source AI personal assistant. Khoj is a desktop application to search and chat with your notes, documents, and images. It is an offline-first, open-source AI personal assistant that is accessible from Emacs, Obsidian or your Web browser. Khoj is a thinking tool that is transparent, fun, and easy to engage with. You can build faster and better by using Khoj to search and reason across all your data sources. Khoj learns from your notes and documents to function as...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 6
    PdfBooklet
    PdfBooklet is a Python Gtk application which allows to make books or booklets from existing pdf files. It can also adjust margins, rotate, scale, merge files or extract pages.
    Leader badge
    Downloads: 165 This Week
    Last Update:
    See Project
  • 7
    ArXiv MCP Server

    ArXiv MCP Server

    A Model Context Protocol server for searching and analyzing arXiv

    arxiv-mcp-server bridges AI assistants and the arXiv repository through a clean MCP interface, enabling search, metadata retrieval, and content access without bespoke scraping. With simple tools like “search” and “fetch,” an agent can find papers, pull abstracts, and download PDFs for downstream summarization or analysis. The project includes packaging and CI to publish to PyPI, plus tests and linting for reliability. Issue threads show feature requests such as extracting embedded LaTeX and...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 8
    Libros de Programación en Español

    Libros de Programación en Español

    List of programming books in Spanish for free

    Libros de Programación en Español is a curated list of free programming books in Spanish, organized by topic and technology so learners can find high-quality materials without cost. The README is structured as an index with general programming books, followed by sections for specific languages such as JavaScript, TypeScript, Python, Ruby, Rust, PHP, Haskell, Go, Kotlin, Java, and R.Each entry includes the book title, author, and a link to the official or legal free version (PDF, HTML, eBook, etc.), focusing on resources that are legitimately available. Beyond languages, the list also covers frameworks and libraries (like React and Qwik), tools (such as Git), and databases (SQL), grouping them in separate sections for easier browsing. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    pdf2wordx

    pdf2wordx

    Convertir "pdf" a documentos ".docx"

    `pip install pdf2wordx` Este proyecto usa "Tkinter" V8.6 y usa "pdf2docx" V0.5.8 para realizar las conversiones de PDF a DOCX. El programa es fácil de usar, solo se dene seleccionar el archivo PDF, Bucar la carpeta donde se guardará el documento DOCX, finalmente de click en el botón "Convertir", el documento se convertirá y guardará en la ruta especificada.
    Downloads: 4 This Week
    Last Update:
    See Project
  • Gen AI apps are built with MongoDB Atlas Icon
    Gen AI apps are built with MongoDB Atlas

    Build gen AI apps with an all-in-one modern database: MongoDB Atlas

    MongoDB Atlas provides built-in vector search and a flexible document model so developers can build, scale, and run gen AI apps without stitching together multiple databases. From LLM integration to semantic search, Atlas simplifies your AI architecture—and it’s free to get started.
    Start Free
  • 10
    Mastering Bitcoin

    Mastering Bitcoin

    Mastering Bitcoin 3rd Edition - Programming the Open Blockchain

    The bitcoinbook repository contains the source code for Mastering Bitcoin, the authoritative open-source book by Andreas M. Antonopoulos on Bitcoin and cryptocurrency technologies. Written in a collaborative and continuously updated format using Markdown and AsciiDoc, the book serves as a comprehensive technical guide for developers, engineers, and system architects who want to understand how Bitcoin works. It covers the protocol, cryptography, peer-to-peer architecture, wallets, mining, and...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 11
    Little Book of Linear Algebra

    Little Book of Linear Algebra

    A concise, beginner-friendly introduction to the core ideas of linear

    This is a concise, beginner-friendly introduction to the fundamental concepts of linear algebra, intended to give readers intuition without overwhelming detail. The material is organized into chapters covering vectors, matrices, linear systems, vector spaces, eigenvalues/eigenvectors, and other central topics, each with worked examples and explanations. There is also a companion “LAB” section for hands-on exploration (e.g. using Python/NumPy) to help cement the connections between algebraic...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 12
    Jina

    Jina

    Build cross-modal and multimodal applications on the cloud

    Jina is a framework that empowers anyone to build cross-modal and multi-modal applications on the cloud. It uplifts a PoC into a production-ready service. Jina handles the infrastructure complexity, making advanced solution engineering and cloud-native technologies accessible to every developer. Build applications that deliver fresh insights from multiple data types such as text, image, audio, video, 3D mesh, PDF with Jina AI’s DocArray. Polyglot gateway that supports gRPC, Websockets, HTTP,...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 13
    libvips

    libvips

    A fast image processing library with low memory needs

    ...Images can have any number of bands. It supports a good range of image formats, including JPEG, JPEG2000, JPEG-XL, TIFF, PNG, WebP, HEIC, AVIF, FITS, Matlab, OpenEXR, PDF, SVG, HDR, PPM / PGM / PFM, CSV, GIF, Analyze, NIfTI, DeepZoom, and OpenSlide. It can also load images via ImageMagick or GraphicsMagick, letting it work with formats like DICOM. It comes with bindings for C, C++, and the command-line. Full bindings are available for Ruby, Python, PHP, C# / .NET, Go, and Lua.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 14
    Controllable-RAG-Agent

    Controllable-RAG-Agent

    This repository provides an advanced RAG

    Controllable-RAG-Agent is an advanced Retrieval-Augmented Generation (RAG) system designed specifically for complex, multi-step question answering over your own documents. Instead of relying solely on simple semantic search, it builds a deterministic control graph that acts as the “brain” of the agent, orchestrating planning, retrieval, reasoning, and verification across many steps. The pipeline ingests PDFs, splits them into chapters, cleans and preprocesses text, then constructs vector...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    PyExe: Cbz to pdf converter [I.S.A]

    PyExe: Cbz to pdf converter [I.S.A]

    PyExe: Cbz to pdf converter [Improved.Simplified.Alternative]

    Converts Comic book zip (.cbz) into PDF files. Compatible only for windows OS.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 16

    Create Index from PDF

    PDF Indexing Script: Searches PDF for words, records page numbers

    This Python script helps automate the process of creating an index for a PDF document. It reads a list of words from a text file, searches through each page of the PDF, and records the page numbers where each word appears. The script accounts for the first 24 pages of the PDF that use Roman numerals (i-xxiv) and adjusts the page numbers accordingly.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    Downloads: 3 This Week
    Last Update:
    See Project
  • 18

    realwatermark

    A Python application to add watermarks (text or image) to PDF files

    A Python application to add watermarks (text or image) to PDF files, converts them into image and back to PDF with options for OCR and compression.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    pdf combiner merger converter splitter

    pdf combiner merger converter splitter

    PDF Combiner is a user-friendly, GUI-based tool built in

    PDF Combiner is a user-friendly open source free to use, GUI-based tool for combining, pdf to excel, pdf to word, image to pdf, zip, unzip annotate and splitting PDF files. It is easy to use, supports multiple file insert and delete and process, and allows you to adjust the order of files before combining.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 20
    bridgex

    bridgex

    Convert files like docx, xlsx, pptx, html, and more to MarkDown

    Bridgex is an open‑source graphical interface for converting files to Markdown, built in Python and based on Pyside6 (Qt for Python). Its objective is to simplify access to the Markitdown library through a straightforward, modular visual experience. Features ✨ - Cross‑platform graphical interface. - Efficient file‑to‑Markdown conversion. - Modularity: easy to adapt and extend. - Support for multiple input formats. - Lightweight editing prior to saving.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 21
    dxf2gcode

    dxf2gcode

    DXF2GCODE: converting 2D dxf drawings to CNC machine compatible G-Code

    DXF2GCODE is a tool for converting 2D (dxf, pdf, ps) drawings to CNC machine compatible GCode. Windows, Linux, and Mac support by using python scripting language.
    Leader badge
    Downloads: 389 This Week
    Last Update:
    See Project
  • 22
    FileConverterX

    FileConverterX

    Convertidor de Archivos

    FileConverterX es una herramienta diseñada para convertir documentos entre distintos formatos, optimizando la gestión y manipulación de archivos. Permite a los usuarios transformar documentos de texto, imágenes y otros tipos de archivos en diferentes formatos de manera rápida y eficiente.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    MComix

    MComix

    GTK+ comic book viewer.

    MComix is a user-friendly, customizable image viewer. It is specifically designed to handle comic books (both Western comics and manga) and supports a variety of container formats (including CBR, CBZ, CB7, CBT, LHA and PDF). MComix is a fork of Comix.
    Leader badge
    Downloads: 649 This Week
    Last Update:
    See Project
  • 24
    Scribus

    Scribus

    Powerful desktop publishing software

    Scribus is an Open Source program that brings professional page layout to Linux, BSD UNIX, Solaris, OpenIndiana, GNU/Hurd, Mac OS X, OS/2 Warp 4, eComStation, and Windows desktops with a combination of press-ready output and new approaches to page design. Underneath a modern and user-friendly interface, Scribus supports professional publishing features, such as color separations, CMYK and spot colors, ICC color management, and versatile PDF creation.
    Leader badge
    Downloads: 15,041 This Week
    Last Update:
    See Project
  • 25
    EasyABC

    EasyABC

    EasyABC is an open source ABC editor

    EasyABC allows the user to create, edit, view, play, convert music written in the ABC music notation language. The program was originally written in Python 2.7 and WxPython by Nils Liberg and runs on Windows, OSX, and Linux. Jan Wybren de Jong has converted to run on Python 3.8 or higher. Frédéric Aupépin has been supporting EasyABC on OSX. EasyABC depends upon other external programs like abc2midi, abcm2ps, fluidsynth. If you install the Windows or Mac executables most of these programs...
    Leader badge
    Downloads: 287 This Week
    Last Update:
    See Project