Showing 28 open source projects for "pdf data mining"

  • 1
    PDF4QT

    Open source PDF editor

    PDF4QT is an open-source PDF editor based on the Qt framework, available for Windows and Linux. It is a modern solution for viewing, editing, and rendering PDF documents, for users and developers alike. For developers, there is a C++ library and a command-line tool for use in scripts. For users, there are four applications offering many features. The project is hosted on GitHub and...
    Downloads: 24 This Week
  • 2
    Vanilla.PDF

    Cross-platform SDK for creating and modifying PDF documents

    Vanilla.PDF is a modern, high-performance, open-source C++17 SDK designed for creating, editing, signing, and analyzing PDF documents across multiple platforms. It requires no external runtime dependencies, making it lightweight and ideal for embedding into desktop applications, servers, or automation pipelines. The SDK offers full cross-platform support including Windows, Linux, macOS, and Android, with builds available for major compilers and architectures. Vanilla.PDF supports advanced...
    Downloads: 3 This Week
  • 3
    node-canvas

    node-canvas is a Cairo-backed Canvas implementation for Node.js

    ...For API documentation, please visit Mozilla Web Canvas API. (See Compatibility Status for the current API compliance.) All utility methods and non-standard APIs are documented. When MIME data is tracked, PDF canvases can embed JPEGs directly into the output, rather than re-encoding into PNG. This can drastically reduce filesize and speed up rendering. If working with a non-PDF canvas, image data must be tracked, otherwise the output will be junk.
    Downloads: 5 This Week
  • 4
    Gerber2PDF

    Gerber to PDF converter

    Gerber2PDF is a command-line tool to convert Gerber files to PDF for proofing and hobbyist printing purposes. It converts multiple Gerber files at once, placing each resulting layer on its own page within the PDF. Each layer has a PDF bookmark for easy reference. Layers can optionally be combined onto a single page and rendered with custom colours and transparency. There is a Drill to Gerber converter available from the downloads page.
    Downloads: 18 This Week
  • 5
    DocWire SDK

    Award-winning modern data processing SDK in C++20

    DocWire SDK, a standout C++20 AI-driven data processing tool, has received an award from SourceForge and strong backing from Microsoft. It handles nearly 100 file types, empowering efficient text extraction, web data extraction, and document analysis. For businesses, the shift to DocWire SDK signifies a leap forward. It promises comprehensive document format support and the ability to extract valuable insights from email boxes, databases, and websites using cutting-edge AI. DocWire SDK aims to...
    Downloads: 5 This Week
  • 6
    QtRPT

    Easy-to-use print report library and designer

    QtRPT is an easy-to-use report engine written in C++ with the Qt toolkit. It allows combining several reports in one XML file. For each individual field, you can specify a condition that determines whether the field is displayed in a different font, background color, etc. The project consists of two parts: the report library QtRPT and the report designer application QtRptDesigner. A report file is a file in XML format. The report designer makes it easy to create a report XML file.
    Downloads: 22 This Week
  • 7
    Msc-generator

    Draws signalling charts, block diagrams and graphs from text input.

    NOTE! We have moved to https://gitlab.com/msc-generator/msc-generator All development happens there. Also, download new releases & submit issues there. A tool to draw various charts from textual descriptions. Currently, three types of charts are supported: Message Sequence Charts, generic Graphs, and Block Diagrams, with more to be added in the future. There is a command-line version for Linux and Mac (replacing mscgen), which now sports a GUI, as well. Msc-generator allows fine...
    Downloads: 20 This Week
  • 8
    stkpp

    C++ Statistical ToolKit

    ...As a convenience, we provide the source packages on SourceForge. The library offers a dense set of (mostly) template classes in C++ and is suitable for projects ranging from small one-off projects to complete data mining application suites.
    Downloads: 0 This Week
  • 9

    BitMagic Library

    Compressed bit-sets, sparse bit matrices and algorithms

    BitMagic - C and C++ library implementing dynamic bitvectors and bit-set algorithms with several types of on-the-fly, adaptive compression. Designed for use in databases, search systems, data-mining algorithms, scientific projects. The core of the library is C++, but it provides C-compatibility wrappers and can be compiled without C++ runtime. Optimizations for Intel SSE2, SSE4.2 and AVX2.
    Downloads: 4 This Week
  • 10
    LimeReport

    Report generator for Qt Framework

    ...The report designer included in the library allows you to create print form templates quickly and intuitively; templates can be saved in XML format and used to generate report pages. The generated pages can be sent to a preview, a PDF file, or a printer. As a data source, the developer can use an SQL database or data passed from the application through the QAbstractTableModel interface. In addition, one can initialize variables, which are available as database request parameters. LimeReport's goal is to provide your application with a report generation tool that is functionally rich yet simple enough to be used even by users inexperienced in IT.
    Downloads: 16 This Week
  • 11

    dvisvgm

    A fast DVI to SVG converter

    The command line tool dvisvgm converts DVI, EPS, and PDF files to the XML-based SVG format.
    Downloads: 1 This Week
  • 12

    QASreport

    QASreport is a multi-platform C++ Qt library for building reports

    QASreport is a multi-platform C++ Qt library that contains a set of classes for building reports. It combines a report designer with report generation and output facilities. It is intended to add automated creation, saving, and output of reports to an application. Report templates are stored in XML format and can be stored in and loaded from a file on disk, memory, or table BLOB fields. The library contains a built-in designer, available at run time, that works like a normal graphics editor....
    Downloads: 0 This Week
  • 13

    Random Bits Forest

    RBF: a Strong Classifier/Regressor for Big Data

    We present a classification and regression algorithm called Random Bits Forest (RBF). RBF integrates neural networks (for depth), boosting (for wideness), and random forests (for accuracy). It first generates and selects ~10,000 small three-layer threshold random neural networks as a basis using a gradient boosting scheme. These binary bases are then fed into a modified random forest algorithm to obtain predictions. In conclusion, RBF is a novel framework that performs strongly especially on data...
    Downloads: 2 This Week
  • 14
    Kohonen neural network library is a set of classes and functions for designing, training, and using a Kohonen network (self-organizing map), which is an AI algorithm and a useful tool for data mining and knowledge discovery in data (http://knnl.sf.net).
    Downloads: 0 This Week
  • 15
    Tool support for creating FMC* diagrams [Block diagrams, Petri nets, Entity-Relationship diagrams (ERD)] in MS-Visio 2000 and newer. Features: stencils, consistency checking, Petri net simulation, exporter e.g. pdf, ... *Fundamental Modeling Concepts A stripped down version of the stencil set is available for TAM (Technical Architecture Modeling of SAP). This set uses UML notation and contains Block, Activity, Sequence, State, Class, and Component diagrams. It doesn't contain Simulation,...
    Downloads: 2 This Week
  • 16
    Mr.FSM

    Large-Scale Frequent Subgraph Mining in MapReduce

    This is the program used in the following paper: Wenqing Lin, Xiaokui Xiao, and Gabriel Ghinita. Large-Scale Frequent Subgraph Mining in MapReduce. In Proceedings of the 30th IEEE International Conference on Data Engineering (ICDE), pages 844-855, 2014. Please cite the paper if you choose to use the program. If you have any problems, please report them to {wlin1 at ntu dot edu dot sg}.
    Downloads: 0 This Week
  • 17

    PdfPageCounter

    C++ code to count the number of pages in a given PDF file.

    This C++ library contains the 'PdfPageCount' class that performs the single task of finding the number of pages in a given PDF document. While the PdfPageCount class is very simple to use, the contained code is complex because the page count can be hidden in any number of places, quite often within compressed data.
    Downloads: 0 This Week
  • 18
    Simple header-only library written in C++, for lexical analysis of files in PDF format
    Downloads: 0 This Week
  • 19

    RepEdit

    Project moved to https://sourceforge.net/projects/qsqlmon/

    Report library + visual editor for Qt based applications. Project moved to https://sourceforge.net/projects/qsqlmon/
    Downloads: 0 This Week
  • 20
    DMTL (Data Mining Template Library) - A generic C++ library for mining structured patterns such as sets, sequences, trees, and graphs. The library provides implementations of popular frequent pattern mining algorithms.
    Downloads: 0 This Week
  • 21
    EfficiencyGuardian extracts callgrind efficiency measures from individual CppUnit test cases to detect efficiency regressions. It includes a data mining web tool to browse historic results and TestFarm integration for unattended execution on commit.
    Downloads: 0 This Week
  • 22
    A loose collection of source code and libraries for mining and recovering data from all manner of obscure file formats and media
    Downloads: 0 This Week
  • 23
    XMLPDF allows you to generate PDF documents painlessly from a simple XML descriptive input. XMLPDF can act as a PHP module or as a standalone binary. The XML source is structured more or less like an HTML page, but implements content pagination, embedded image
    Downloads: 0 This Week
  • 24
    Microsoft Visual C++ 6.0: New 4GL Windows API, SQL, queries (databases: Excel, Access, FoxPro, dBase, text, CSV), ADO, sort, reporting, SMS phone, FTP server, mail/MAPI, generate PDF (Acrobat), text, Office: Excel/Word, un/compression (cabsdk), GIF anim, JPG, BMP...
    Downloads: 0 This Week
  • 25
    Open source code for a wide range of software is now in abundance on the net. The goal of the CodeWeb project is to data mine software development experience that is inherent in this vast amount of code to enhance future development.
    Downloads: 0 This Week