Showing 54 open source projects for "pdf data mining"

  • 1
    Probability Cheatsheet

    A comprehensive 10-page probability cheatsheet

    ...It likely includes definitions of random variables, PMFs and PDFs, expectations, variance, common distributions (e.g. binomial, normal, Poisson, exponential), conditional probability, Bayes’ theorem, moment generating functions, and perhaps important inequalities (Markov, Chebyshev, Chernoff). The cheat sheet is intended as a quick reference for students, data scientists, statisticians, or anyone needing to recall core probability formulas without diving into textbooks. It may include visual diagrams (e.g. distributions’ shapes), tips or mnemonic notes, and examples of application (e.g. computing probabilities or expectations). Formats could include Markdown, PDF, or images for easy inclusion in study materials or slides.
    Downloads: 0 This Week
  • 2
    PDF API HTML5 Web Apps

    Mini SDK: a JavaScript API library for PDF web apps

    A compact library designed for modern web applications that quickly exports your HTML content to PDF, thanks to the well-known JavaScript library jsPDF. Special thanks also go to the canvg and html2canvas projects. Project documentation: http://ulmdevice.altervista.org/pdfapihtml5/#documentation Also available as a service for Angular 7+: http://ulmdevice.altervista.org/pdfjsapi/ Mobile applications: http://bit.ly/1MrlgKk Opera add-on: http://bit.ly/1kkMhTa
    Downloads: 0 This Week
  • 3
    iText®, a JAVA PDF library

    PDF Library for Developers

    iText is an open-source PDF library available for Java and .NET (C#). iText allows you to effortlessly generate and manipulate standards-compliant PDF documents with a powerful and feature-rich SDK. With iText, you can create archivable and accessible PDFs, split and merge documents, fill and flatten forms, digitally sign documents, and more. iText add-ons enable additional functionality, such as PDF creation from HTML templates, secure redaction, OCR, and much more. The latest...
    Leader badge
    Downloads: 115 This Week
  • 4
    Incanter

    Clojure-based, R-like statistical computing and graphics environment

    Incanter is a Clojure-based, R-like statistical computing and visualization library running on the JVM. It integrates core numerical libraries like Parallel Colt and JFreeChart to deliver data manipulation, modeling, statistical tests, and charting in a REPL-friendly environment. Start by visiting the Incanter website for an overview, check out the documentation page for a listing of HOW-TOs and examples, and then download either an Incanter executable or a pre-built version of the latest...
    Downloads: 0 This Week
  • 5

    JSentiWordNet

    A wrapper for the famous SentiWordNet, a resource for opinion mining

    This project aims to provide a wrapper around SentiWordNet, a lexical resource for opinion mining. As defined by the authors: SentiWordNet assigns to each synset of WordNet three sentiment scores: positivity, negativity, objectivity. You can find additional information about the creation of SentiWordNet here: http://nmis.isti.cnr.it/sebastiani/Publications/LREC06.pdf SentiWordNet (available here: https://drive.google.com/open?
    Downloads: 0 This Week
  • 6
    PDFReporter

    Generating documents and reports, offline enabled and reliable.

    The library is a fork of the popular open source Jasper Reports and supports the common features provided by Jasper Reports, but offline and for mobile apps. The PDFReporter library supports iOS, Java, and Android. You design documents and reports in PDFReporter Studio, where you can visualize your data. If you want to use the library commercially, please visit our official webpage.
    Downloads: 4 This Week
  • 7

    QASreport

    QASreport is a multi-platform C++ Qt library for building reports

    QASreport is a multi-platform C++ Qt library that contains a set of classes for building reports. It combines a report designer with a report generator and is intended to add automated report creation, saving, and output to an application. Report templates are stored in XML format and can be saved to and loaded from a file on disk, memory, or table BLOB fields. The library contains a built-in designer, available at run time, that works like a normal graphics editor...
    Downloads: 0 This Week
  • 8

    Random Bits Forest

    RBF: a Strong Classifier/Regressor for Big Data

    We present a classification and regression algorithm called Random Bits Forest (RBF). RBF integrates neural networks (for depth), boosting (for width), and random forests (for accuracy). It first generates and selects ~10,000 small three-layer threshold random neural networks as a basis using a gradient boosting scheme. These binary bases are then fed into a modified random forest algorithm to obtain predictions. In conclusion, RBF is a novel framework that performs strongly especially on data...
    Downloads: 2 This Week
  • 9
    Kohonen neural network library is a set of classes and functions for designing, training, and using a Kohonen network (self-organizing map), an AI algorithm and a useful tool for data mining and knowledge discovery in data (http://knnl.sf.net).
    Downloads: 0 This Week
  • 10
    JFreeChart
    JFreeChart is a free (LGPL) chart library for the Java(tm) platform. It supports bar charts, pie charts, line charts, time series charts, scatter plots, histograms, simple Gantt charts, Pareto charts, bubble plots, dials, thermometers and more. *** JFreeChart has moved to GitHub: https://github.com/jfree/jfreechart ***
    Leader badge
    Downloads: 284 This Week
  • 11

    libVMR

    VMR - machine learning library

    libVMR is a class library written in Java that implements a code generator for the group method of data handling (GMDH). The library is intended for users with machine learning skills. libVMR provides an effective framework for the research and development of data mining and predictive analytics. libVMR is based on the most popular neural network model with a higher generalization ability from kernel tricks: the vector machine by Reshetov (VMR).
    Downloads: 0 This Week
  • 12
    JChart2D

    JChart2D is a real-time charting library written in Java.

    JChart2D is an easy-to-use component, written in Java, for displaying two-dimensional traces in a coordinate system. It supports real-time (animated) charting, custom trace rendering, multithreading, viewports, automatic scaling and labels. Former UI controls (right-click context menu, file menu) have been ported to the subproject jchart2d-uimenu (https://sourceforge.net/projects/jchart2d-uimenu.jchart2d.p/) for the benefit of having no dependencies on third-party libraries.
    Downloads: 0 This Week
  • 13
    CMIS Input plugin for Pentaho

    Allows querying content management systems that use the CMIS standard.

    ...All this is possible within the Pentaho Suite, the open source business intelligence platform, which is useful for extracting and analyzing structured and semi-structured data. With this goal in mind, the CMIS Input plugin for Pentaho Data Integration (Kettle) was designed and developed; it allows querying content management systems that use the CMIS interoperability standard. Once extracted, the data can be stored, analyzed, and presented in customized reports published in various formats for the end user (PDF, Excel, etc.).
    Downloads: 0 This Week
  • 14

    PdfPageCounter

    C++ code to count the number of pages in a given PDF file.

    This C++ library contains the 'PdfPageCount' class that performs the single task of finding the number of pages in a given PDF document. While the PdfPageCount class is very simple to use, the contained code is complex because the page count can be hidden in any number of places, quite often within compressed data.
    Downloads: 0 This Week
  • 15
    PHP RepDesigner Pivot
    A PHP class that creates a visual representation of a PHP data table. It works with jQuery and can be changed interactively by the user. It is also a powerful tool for report visualization, and you can supply the mPDF PHP library to create PDF files. Note: this is a lite alpha version, and development is in progress.
    Downloads: 0 This Week
  • 16
    Jedi

    Java Enhanced Data Interface - Italian Senate Project

    JEDI is a J2EE application that provides a centralized service aiming to significantly simplify the generation of data-driven documents in an enterprise environment. The documents (hereafter called "JEDI documents") can have different format types: PDF, Excel, RTF, plain-text data streams, and XML streams. A JEDI document is an instance of a so-called "managed document" (i.e. the configuration data and a particular set of rules), configured by a developer in the JEDI configuration database. ...
    Downloads: 0 This Week
  • 17
    A capsule tree is a general-purpose, self-balancing tree data structure for large, ordered data sets. It is designed to provide the same characteristics as B-trees and B+trees, but built from the ground up for in-memory usage. In other words, there are no provisions for “slow” I/O cases. The original motivation for this tree was a better backend for memory managers. However, the end result was a new sub-category of trees.
    Downloads: 0 This Week
  • 18
    Highchart for Nagios
    Imports pnp4nagios RRD data into Highcharts. Highcharts is a charting library written in pure JavaScript, offering intuitive, interactive charts for your web site or web application.
    Downloads: 0 This Week
  • 19
    Graphane is a solution to generate and deliver enterprise documents (PDF, ODT, RTF, HTML). Template documents are designed with OpenOffice Writer. Any application able to export data in XML format can submit that data to the Graphane Server.
    Downloads: 2 This Week
  • 20
    Copperhead is a small and simple library providing a Swing user interface that allows one to automatically generate PDF documents from annotated objects using the iText PDF library. Copperhead is developed under GPLv3. Please download Copperhead 0.1b for iText 2 and 0.2b for iText 5. Read more on http://byteality.ch/blog. Enjoy!
    Downloads: 0 This Week
  • 21
    Simple header-only library written in C++, for lexical analysis of files in PDF format
    Downloads: 0 This Week
  • 22

    RepEdit

    Project moved to https://sourceforge.net/projects/qsqlmon/

    Report library and visual editor for Qt-based applications. Project moved to https://sourceforge.net/projects/qsqlmon/
    Downloads: 0 This Week
  • 23
    ** I have built a much more powerful, fully featured CMS at: https://github.com/MacdonaldRobinson/FlexDotnetCMS Macs CMS is a flat-file (XML and SQLite) AJAX content management system. It focuses mainly on the edit-in-place concept. It comes with a built-in blog with moderation support, user manager section, roles manager section, SEO / SEF URL
    Downloads: 1 This Week
  • 24
    Small and simple Java library for working with Jasper Reports dynamically, enabling dynamic column creation and dynamic data sets using Apache DynaBeans. The project is developed by people at a small software company called Softberries (www.softberries.com).
    Downloads: 0 This Week
  • 25
    Python module and command line utility that analyzes XML output from the program pdftohtml in order to extract tables from PDF files. Outputs CSV.
    Downloads: 0 This Week