Showing 43 open source projects for "pdf data mining"

View related business solutions
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • Atera - an All-in-one platform for IT management Icon
    Atera - an All-in-one platform for IT management

    Ideal for IT departments and MSPs (managed service providers)

    Your IT essentials, integrated & elevated. Take your IT management from automated to autonomous, download Atera's agent to start your free trial!
    Try Atera now
  • 1
    Orange Data Mining

    Orange Data Mining

    Orange: Interactive data analysis

    ...When teaching data mining, we like to illustrate rather than only explain.
    Downloads: 52 This Week
    Last Update:
    See Project
  • 2
    Awesome Fraud Detection Research Papers

    Awesome Fraud Detection Research Papers

    A curated list of data mining papers about fraud detection

    A curated list of data mining papers about fraud detection from several conferences.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    borb

    borb

    borb is a library for reading, creating and manipulating PDF files

    borb is a library for creating and manipulating PDF files in python. borb is a pure python library to read, write, and manipulate PDF documents. It represents a PDF document as a JSON-like data structure of nested lists, dictionaries and primitives (numbers, string, booleans, etc) This is currently a one-man project, so the focus will always be to support those use-cases that are more common in favor of those that are rare.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 4
    MinerU

    MinerU

    A high-quality tool for convert PDF to Markdown and JSON

    MinerU is an open-source, high-quality document extraction toolkit focused on converting PDFs (and other document formats) into structured Markdown and JSON. It leverages OCR and layout analysis to preserve semantic structure and metadata, ideal for research and data science workflows.
    Downloads: 7 This Week
    Last Update:
    See Project
  • Stop Cyber Threats with VM-Series Next-Gen Firewall on Azure Icon
    Stop Cyber Threats with VM-Series Next-Gen Firewall on Azure

    Native application identity and user-based security for your Azure cloud

    Gain integrated visibility across all traffic in a single pass. Deploy Palo Alto Networks VM-Series to determine application identity and content while automating security policy updates via rich APIs.
    Get a free trial
  • 5
    Mercury

    Mercury

    Convert Python notebook to web app and share with non-technical users

    Turn Python notebooks to web applications with open-source Mercury framework. Hide code and add interactive widgets. Non-technical users can tweak widgets and execute notebook with new parameters. The core of Mercury is Open Source under AGPLv3. We provide Mercury Pro with additional features, dedicated support and friendly commercial license. Mercury is a perfect tool to convert Python notebook to interactive web application and share with non-programmers. You define interactive widgets for...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 6
    Tarjamento de Dados Pessoais e Sigilosos

    Tarjamento de Dados Pessoais e Sigilosos

    Ferramenta de Tarjamento de Dados Pessoais e Sigilosos

    TarjaPDF v2.0 Beta — Ferramenta de Tarjamento de Dados Pessoais e Sigilosos Proteja dados sensíveis em PDFs com segurança irreversível. Interface moderna com dark mode, marcação manual (texto, linha e área livre), detecção automática de CPF, RG, e-mail, telefone, nomes próprios e endereços. Escaneamento inteligente com análise preditiva: destaca dados pessoais para revisão antes de tarjar. Detecção de nomes via heurística e base oficial, com dicionário customizável. Relatório de...
    Downloads: 24 This Week
    Last Update:
    See Project
  • 7
    PdfBooklet
    PdfBooklet is a Python Gtk application which allows to make books or booklets from existing pdf files. It can also adjust margins, rotate, scale, merge files or extract pages.
    Leader badge
    Downloads: 222 This Week
    Last Update:
    See Project
  • 8

    FreeSEM

    Free and open-source desktop application designed for SEM

    ...The software supports methods such as exploratory factor analysis, covariance-based SEM, partial least squares SEM, and meta-SEM, and it provides model fit statistics like CFI, TLI, RMSEA, SRMR, and chi-square to evaluate models. It also enables exporting analysis results and reports to formats like Word, Excel, CSV, and PDF, making it useful for academic research and data analysis workflows.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 9
    dxf2gcode

    dxf2gcode

    DXF2GCODE: converting 2D dxf drawings to CNC machine compatible G-Code

    DXF2GCODE is a tool for converting 2D (dxf, pdf, ps) drawings to CNC machine compatible GCode. Windows, Linux, and Mac support by using python scripting language.
    Leader badge
    Downloads: 318 This Week
    Last Update:
    See Project
  • Train ML Models With SQL You Already Know Icon
    Train ML Models With SQL You Already Know

    BigQuery automates data prep, analysis, and predictions with built-in AI assistance.

    Build and deploy ML models using familiar SQL. Automate data prep with built-in Gemini. Query 1 TB and store 10 GB free monthly.
    Try Free
  • 10
    PII-Blackout

    PII-Blackout

    100% offline, AI-powered PDF redaction

    ...PII Blackout automatically scans, detects, and blackouts sensitive data points across your documents in one click. Absolute, Irreversible Security (Image-Level Blackout) Unlike standard PDF editors that merely place a black shape over editable text (which can easily be copied or uncovered), PII Blackout flattens and bakes the redaction directly into the image surface of the document. The covered data is permanently destroyed and mathematically impossible to recover. 100% Offline & Local Processing
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    AlbumForge

    AlbumForge

    Revolutionary photo album software with social impact program

    AlbumForge is revolutionary photo album creation software combining cutting-edge technology with authentic social impact. Create stunning high-resolution PDF albums entirely offline with zero cloud dependency. KEY FEATURES: - World's first geographic storytelling with cinematic fly animations - 100+ professional templates with AI-powered layouts - High-resolution PDF export (600 DPI) and MP4 video export - 100% offline operation - no cloud dependency or tracking - 50+ native...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 12
    pdf combiner merger converter splitter

    pdf combiner merger converter splitter

    PDF Combiner is a user-friendly, GUI-based tool built in

    PDF Combiner is a user-friendly open source free to use, GUI-based tool for combining, pdf to excel, pdf to word, image to pdf, zip, unzip annotate and splitting PDF files. It is easy to use, supports multiple file insert and delete and process, and allows you to adjust the order of files before combining.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 13
    OmicSelector

    OmicSelector

    Feature selection and deep learning modeling for omic biomarker study

    OmicSelector is an environment, Docker-based web application, and R package for biomarker signature selection (feature selection) from high-throughput experiments and others. It was initially developed for miRNA-seq (small RNA, smRNA-seq; hence the name was miRNAselector), RNA-seq and qPCR, but can be applied for every problem where numeric features should be selected to counteract overfitting of the models. Using our tool, you can choose features, like miRNAs, with the most significant...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    QuickPlot

    QuickPlot

    Simple user interface for gnuplot aimed for reflectometry data

    Graphical user interface for gnuplot to create publication quality figure very quickly. It supports templates for fast formatting of graphics, different plot styles, insets, axis and label options. One important feature is storing metadata in png and pdf files that can be used to reload any graph saved with QuickPlot.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 15
    pdf-editor

    pdf-editor

    Edit your PDFs without needing a subscription or creating accounts

    Edit your PDFs without needing a subscription or creating accounts. Add a GUI/Turn it into a web application. Add a parser for the command line to do multiple commands at once e.g. merge (cut pdf1) pdf2. Tested working with Python 3.8.5. Install venv (py -3.8 -m pip install virtualenv). PDF and Word documents are binary files, which makes them much more complex than plaintext files. In addition to text, they store lots of font, color, and layout information. If you want your programs to read...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    QtiPlot
    QtiPlot is a user-friendly, platform independent data analysis and visualization application similar to the non-free Windows program Origin.
    Downloads: 51 This Week
    Last Update:
    See Project
  • 17
    GEOMS2

    GEOMS2

    Geostatistics and geosciences modeling software

    ...attredirects=0&d=1 http://sourceforge.net/projects/geoms2/files/Mining.7z/download
    Leader badge
    Downloads: 7 This Week
    Last Update:
    See Project
  • 18
    PDF Merge and Edit

    PDF Merge and Edit

    Python script to merge and edit sensitive PDF files

    Python script to merge and edit sensitive PDF files you don't want to upload to random sites you find on Google. Merge PDFs by adding one to another. Update a single page in a PDF (good for adding a signed page to a form) Insert a page into an existing PDF. Delete a page. Click on one of the buttons and a new window will pop up depending on the function. Pick your files and enter in the data.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 19
    PDF-Shuffler
    PDF-Shuffler is a small python-gtk application, which helps the user to merge or split pdf documents and rotate, crop and rearrange their pages using an interactive and intuitive graphical interface. It is a frontend for python-pyPdf.
    Downloads: 35 This Week
    Last Update:
    See Project
  • 20
    PyX is a Python package for the creation of EPS, PS, PDF and SVG files. It combines an abstraction of the PostScript drawing model with a TeX/LaTeX interface. Complex tasks like 2d and 3d plots in publication-ready quality are built out of these primitives.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    i-Map - Plot Geolocation from Images

    i-Map - Plot Geolocation from Images

    Automatically plots latitude, longitude from images on Google maps.

    ...After loading images, with a single click, iMap plots all the images on World Map to visually check where they have been captured, generate timeline and activity of suspect and match them with CDR (Call Detail Record) Details. To generate a report, you can export this data into PDF or Excel file according to your requirements.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22

    PyDaMelo

    Python-compatible Data mining elementary objects

    An attempt at offering machine learning and data mining algorithms at the finest grain we are able to, easy to combine together through Python scripting to glue together the Lego-like bricks.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    matplotlib
    Matplotlib is a python library for making publication quality plots using a syntax familiar to MATLAB users. Matplotlib uses numpy for numerics. Output formats include PDF, Postscript, SVG, and PNG, as well as screen display. As of matplotlib version 1.5, we are no longer making file releases available on SourceForge. Please visit http://matplotlib.org/users/installing.html for help obtaining matplotlib.
    Leader badge
    Downloads: 123 This Week
    Last Update:
    See Project
  • 24
    colorview2d

    colorview2d

    Extendible 2D color plotting tool.

    Visualize and analyze 3D data files using 2D colorplots. Inspired by spyview. Written in python.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    QAL

    QAL

    Query Abstraction Layer

    Project has moved to: https://github.com/OptimalBPM/qal QAL is a collection of libraries for mining, transforming and writing data from and to a number of places. Sources and destinations include different SQL and NoSQL backends, file formats like .csv, XML and excel. Even untidy HTML web pages. It has a database abstraction layer that supports connectivity to Postgres, MySQL, DB2, Oracle, MS SQL server. JSON and MongoDB is coming.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • Next