Showing 148 open source projects for "pdf data mining"

  • 1
    Kimai

    Kimai is a web-based multi-user time-tracking application

    Kimai is an open-source time-tracking solution. It tracks work time and prints a summary of your activities on demand: yearly, monthly, daily, by customer, or by project. Its simplicity is its strength. Thanks to Kimai's browser-based interface, it runs cross-platform, even on your mobile device. Kimai not only simplifies the boring process of feeding Excel spreadsheets with your working hours, it also offers dozens of other exciting features that you don't even know you're...
    Downloads: 6 This Week
  • 2
    DocsGPT

    Private AI platform for agents, enterprise search and RAG pipelines

    DocsGPT is an open-source AI platform for deploying private RAG pipelines, AI agents, and enterprise search on your own infrastructure. Connect any data source (PDFs, DOCX, CSV, Excel, HTML, audio, GitHub, databases, URLs) and get accurate, hallucination-free answers with source citations. Choose your LLM: OpenAI, Anthropic, Google Gemini, or local models. Works with Qdrant, MongoDB, Elasticsearch, and more. Deploy via Docker or Kubernetes with full data sovereignty. Build...
    Downloads: 0 This Week
  • 3
    Admidio

    Admidio is a free open source user management system for websites

    Admidio is a free open source user management system for websites of organizations and groups. The system has a flexible role model, so you can reflect the structure and permissions of your organization. You can create an individual profile for your members by adding or removing profile fields. In addition to these functions, the system contains several modules, such as member lists, an event manager, messages, a photo album, and a documents & files area. Admidio is a free online membership...
    Downloads: 0 This Week
  • 4

    toPDF

    Online service for PDF conversion (to PDF)

    A simple online service for PDF conversion. This project is a simple library and also a web application. It offers a REST service and a simple upload service for synchronous conversion. This library/application doesn't contain conversion libraries because it's a wrapper for existing tools. toPDF currently supports the open source tool PDF Creator (http://www.pdfforge.org) and the commercial solution, easy PDF, from BCL (http://www.pdfonline.com/easypdf/sdk/).
    Downloads: 1 This Week
  • 5

    Candid PDF Table

    CandidPDFTable – Deterministic TCPDF Table Builder

    CandidPDFTable (Candid PDF Table Builder) is a deterministic, colspan-aware table builder designed specifically for TCPDF. It provides a clean and predictable API to construct HTML tables for TCPDF::writeHTML() using explicit, cell-owned borders and late-stage layout computation. The library is built for programmatic table generation where precise control over rows, columns, colspans, borders, and serial numbering is essential. Building complex tables directly in TCPDF becomes difficult...
    Downloads: 0 This Week
  • 6
    WebHarvest - web data extraction tool
    Web data extraction (web data mining, web scraping) tool. It leverages well-proven XML and text processing technologies to easily extract useful data from arbitrary web pages.
    Downloads: 3 This Week
  • 7
    JasperReports Library

    Free Java Reporting Library

    JasperReports Library is the world's most popular open source business intelligence and reporting engine. It is entirely written in Java and it is able to use data coming from any kind of data source and produce pixel-perfect documents that can be viewed, printed or exported in a variety of document formats including HTML, PDF, Excel, OpenOffice and Word. The project is also available at: https://github.com/TIBCOSoftware/jasperreports Jaspersoft Studio is the open source report designer for the JasperReports Library. ...
    Downloads: 1,325 This Week
  • 8
    FastReport Open Source

    Free Open Source Reporting tool for .NET

    Free Open Source Reporting tool for .NET Core/.NET Framework that helps your application generate document-like reports.
    Downloads: 62 This Week
  • 9
    DocWire SDK

    Award-winning modern data processing SDK in C++20

    DocWire SDK, a standout C++20 AI-driven data processing tool, has received an award from SourceForge and strong backing from Microsoft. It handles nearly 100 file types, enabling efficient text extraction, web data extraction, and document analysis. For businesses, the shift to DocWire SDK signifies a leap forward. It promises comprehensive document format support and the ability to extract valuable insights from email boxes, databases, and websites using cutting-edge AI. DocWire SDK aims to...
    Downloads: 10 This Week
  • 10
    AuroreNR
    Software developed for the analysis of Neutron Reflectivity data.
    Downloads: 0 This Week
  • 11
    ant4docbook

    ANT4DOCBOOK is an ANT task for DOCBOOK

    ANT4DOCBOOK is an ANT task for DOCBOOK, a semantic markup language for technical documentation.
    Downloads: 0 This Week
  • 12
    jPicEdt

    Another drawing editor for LaTeX with PSTricks & TikZ

    jPicEdt is an extensible, internationalized, vector-based drawing editor for LaTeX and related packages (TikZ, PSTricks,...), written in Java. It is also a library of reusable high-level graphic primitives.
    Downloads: 8 This Week
  • 13
    Msc-generator

    Draws signalling charts, block diagrams and graphs from text input.

    NOTE! We have moved to https://gitlab.com/msc-generator/msc-generator All development happens there. Also, download new releases & submit issues there. A tool to draw various charts from textual descriptions. Currently, three types of charts are supported: Message Sequence Charts, generic Graphs, and Block Diagrams, with more to be added in the future. There is a command-line version for Linux and Mac (replacing mscgen), which now sports a GUI, as well. Msc-generator allows fine...
    Downloads: 29 This Week
  • 14
    To give users full control over the running application. This means that an application works according to its purpose, but control over the whole interface is taken from the developer and given to users. While an application is running, users can move, resize, and tune all the screen objects through which they communicate with the application. The set of files includes the book (in both DOC and PDF formats), a big demonstration project with all its files available (all the...
    Downloads: 1 This Week
  • 15
    Ada Bar Codes

    Bar Code (1D or 2D) generator in pure Ada

    The project Ada Bar Codes provides a framework for generating various types of bar codes (1D, or 2D, like QR codes) on different output formats and devices. Alire crate: https://alire.ada.dev/crates/bar_codes Mirror: https://github.com/zertovitch/ada-bar-codes
    Downloads: 12 This Week
  • 16
    stkpp

    C++ Statistical ToolKit

    ...As a convenience, we provide the source packages on SourceForge. The library offers a dense set of (mostly) template classes in C++ and is suitable for projects ranging from small one-off projects to complete data mining application suites.
    Downloads: 0 This Week
  • 17
    dktools - Dirk Krauses tools

    Drawing, graphics conversion, software development, administration.

    GUI and command line tools for advanced users and administrators: wxdkdraw - Minimalistic drawing application for use with LaTeX, wxd2lat - Convert wxdkdraw files to LaTeX, bitmap2pp - Convert PNG/JPEG/TIFF/NetPBM to (E)PS or PDF, fig2lat - Convert XFig files to LaTeX, htmlbook - publish HTML like a book, dkcpre - C debugging and tracing preprocessor, itadmin - manage your IT using a MySQL/MariaDB database, dk-fic - file integrity checker, dk-ls - list files, output column order is configurable, dk-cat, dk-sort, dk-lines - text tools for *x and Windows, dk-send, dk-recv - transmit data stream, dk-t2h, dk-t2l - text to HTML or LaTeX conversion.
    Downloads: 11 This Week
  • 18
    Elementary Algorithms

    Book of elementary algorithms and data structures

    This book introduces elementary algorithms and data structures. It includes side-by-side comparisons of purely functional realizations and their imperative counterparts. In December 2020, I started rewriting this book. The PDF can be downloaded for preview (EN, 中文). The 1st edition in Chinese (中文) was published in 2017. I recently switched my focus to the mathematics of programming; the new book is also available (GitHub).
    Downloads: 1 This Week
  • 19
    LangChain Apps on Production with Jina

    Langchain Apps on Production with Jina & FastAPI

    Jina is an open-source framework for building scalable multi-modal AI apps in production. LangChain is another open-source framework for building applications powered by LLMs. langchain-serve helps you deploy your LangChain apps on Jina AI Cloud in a matter of seconds. You can benefit from the scalability and serverless architecture of the cloud without sacrificing the ease and convenience of local development. And if you prefer, you can also deploy your LangChain apps on your own...
    Downloads: 0 This Week
  • 20
    html-pdf-chrome

    HTML to PDF or image (jpeg, png, webp) converter via Chrome/Chromium

    HTML to PDF or image (jpeg, png, webp) converter via Chrome/Chromium. This library is NOT meant to accept untrusted user input. Doing so may have serious security risks such as Server-Side Request Forgery (SSRF). If you run into CORS issues, try using the --disable-web-security Chrome flag, either when you start Chrome externally, or in options.chromeFlags. This option should only be used if you fully trust the code you are executing during a print job. It is strongly recommended that you...
    Downloads: 2 This Week
  • 21

    BitMagic Library

    Compressed bit-sets, sparse bit matrices and algorithms

    BitMagic - C and C++ library implementing dynamic bitvectors and bit-set algorithms with several types of on-the-fly, adaptive compression. Designed for use in databases, search systems, data-mining algorithms, scientific projects. The core of the library is C++, but it provides C-compatibility wrappers and can be compiled without C++ runtime. Optimizations for Intel SSE2, SSE4.2 and AVX2.
    Downloads: 3 This Week
  • 22
    Laravel Report Generators

    Rapidly Generate Simple PDF, CSV, & Excel Reports on Laravel

    Rapidly generate simple PDF, CSV, and Excel reports on Laravel. This package provides simple PDF, CSV, and Excel report generators to speed up your workflow. It also allows you to stream(), download(), or store() the report seamlessly.
    Downloads: 5 This Week
  • 23
    mPDF

    PHP library generating PDF files from UTF-8 encoded HTML

    mPDF is a PHP library that generates PDF files from UTF-8 encoded HTML. The original author, Ian Back, wrote mPDF to output PDF files ‘on-the-fly’ from his website, handling different languages. It is slower than the original scripts, e.g. HTML2FPDF, and produces larger files when using Unicode fonts, but support for CSS styles etc. has been much enhanced. Supports almost all languages including RTL (Arabic and Hebrew), and CJK (Chinese-Japanese-Korean). Nested block-level elements (e.g....
    Downloads: 78 This Week
  • 24
    backslide

    CLI tool for making HTML presentations with Remark.js using Markdown

    CLI tool for making HTML presentations with Remark.js using Markdown. Use bs init to create a new presentation along with a template directory in the current directory. The template directory is needed for backslide to transform your Markdown files into HTML presentations. You can create as many Markdown presentations as you want in the directory; they will all be based on the same template. Use bs serve to start a development server with live reload. A page will automatically open in your...
    Downloads: 0 This Week
  • 25
    Swagger2Markup

    Swagger to AsciiDoc or Markdown converter

    The primary goal of this project is to simplify the generation of up-to-date RESTful API documentation by combining documentation that’s been hand-written with auto-generated API documentation produced by Swagger. The result is intended to be an up-to-date, easy-to-read, on- and offline user guide, comparable to GitHub’s API documentation. The output of Swagger2Markup can be used as an alternative to swagger-UI and can be served as static content. Swagger2Markup converts a Swagger JSON or...
    Downloads: 4 This Week