Showing 52 open source projects for "pdf data mining"

  • 1
    ShredOS

    ShredOS Disk Eraser 64 bit for all Intel 64 bit processors

    ShredOS is a lightweight, bootable Linux-based operating system designed specifically for secure disk erasure and data destruction. It enables users to permanently wipe hard drives, SSDs, and NVMe devices using the powerful nwipe utility and multiple industry-recognized wiping methods. Compatible with both BIOS and UEFI systems, ShredOS supports PCs, servers, and Intel-based Macs running on 32-bit and 64-bit processors. The platform can erase multiple drives simultaneously while generating detailed PDF certificates and logs for compliance and auditing purposes. ...
    Downloads: 465 This Week
  • 2
    Umbrel

    A beautiful personal server OS for Raspberry Pi or any Linux distro

    ...They’re a part of your private life, and now they can all be stored by you, in your home, on your Umbrel. The Bitcoin network is made up of thousands of nodes that verify every single transaction in the blockchain. Some of them mine Bitcoin too, but unlike a mining node, running a non-mining node doesn’t require expensive hardware. Achieve unparalleled privacy by connecting your wallet directly to the Bitcoin node on your Umbrel.
    Downloads: 23 This Week
  • 3
    Career-Ops

    AI-powered job search system built on Claude Code

    Career Ops is an open-source platform designed to help individuals manage their job search process with a structured, operations-style approach that treats career development like a pipeline. It provides a system for organizing job applications, tracking progress across different stages, and maintaining visibility into opportunities, much like a lightweight CRM tailored for job seekers. The project emphasizes clarity and accountability, enabling users to monitor applications, follow-ups, and...
    Downloads: 5 This Week
  • 4
    Data Crow

    The ultimate cataloguer

    Data Crow ships with standard cataloguing modules for movies & video (DivX, Xvid, DVD, Blu-ray, etc.), books (and eBooks), images, board games, comic books, games & software, and music (MP3 and other music files). Besides these modules, which you can change to fit your requirements, you can create new modules (want to catalogue your stamps, equipment, or anything else?). The GUI is skinnable. Reporting (using JasperReports and their community edition JasperSoft Developer Studio), loan...
    Downloads: 264 This Week
  • 5
    ArchiveBox

    Open source self-hosted web archiving

    ArchiveBox is a powerful, self-hosted internet archiving solution to collect, save, and view websites offline. Without active preservation effort, everything on the internet eventually disappears or degrades. Archive.org does a great job as a centralized service, but saved URLs have to be public, and they can't save every type of content. ArchiveBox is an open source tool that lets organizations & individuals archive both public & private web content while retaining control over their data....
    Downloads: 10 This Week
  • 6
    JupyterLab

    JupyterLab computational environment

    ...Documents and activities integrate with each other, enabling new workflows for interactive computing. JupyterLab also offers a unified model for viewing and handling data formats. JupyterLab understands many file formats (images, CSV, JSON, Markdown, PDF, Vega, Vega-Lite, etc.) and can also display rich kernel output in these formats. See File and Output Formats for more information. To navigate the user interface, JupyterLab offers customizable keyboard shortcuts and the ability to use key maps from vim, emacs, and Sublime Text in the text editor.
    Downloads: 99 This Week
  • 7
    Perf Book

    The book "Performance Analysis and Tuning on Modern CPUs"

    This project is a practical guide to performance analysis and tuning on modern CPUs, bridging microarchitecture details with hands-on profiling. It explains how caches, TLBs, prefetchers, branch predictors, and out-of-order execution influence real program speed, then connects those concepts to concrete optimization strategies. Readers learn how to design trustworthy benchmarks, avoid measurement traps (warmup, turbo, frequency scaling), and interpret hardware performance counters. The book...
    Downloads: 4 This Week
  • 8
    Jina

    Build cross-modal and multimodal applications on the cloud

    ...Jina handles the infrastructure complexity, making advanced solution engineering and cloud-native technologies accessible to every developer. Build applications that deliver fresh insights from multiple data types such as text, image, audio, video, 3D mesh, PDF with Jina AI’s DocArray. Polyglot gateway that supports gRPC, Websockets, HTTP, GraphQL protocols with TLS. Intuitive design pattern for high-performance microservices. Seamless Docker container integration: sharing, exploring, sandboxing, versioning and dependency control via Jina Hub. ...
    Downloads: 0 This Week
  • 9
    pdfcrack is a command-line password recovery tool for PDF files.
    Downloads: 444 This Week
  • 10
    Drive Health Analyzer - SSD/HDD Monitor

    Monitor disk health, predict failures, track SSD/HDD SMART attributes

    Drive Health Analyzer is a comprehensive disk monitoring solution designed to prevent data loss by tracking the health status of SSDs and HDDs. The software reads SMART attributes, monitors temperature, analyzes disk performance, and predicts potential drive failures before they occur. It supports all major storage types including NVMe, SATA, and IDE drives. Features real-time alerts, detailed health reports, and automatic background monitoring. The intuitive dashboard displays critical...
    Downloads: 100 This Week
  • 11
    DocWire SDK

    Award-winning modern data processing SDK in C++20

    DocWire SDK, a standout AI-driven data processing tool written in C++20, has received an award from SourceForge and strong backing from Microsoft. It handles nearly 100 file types, enabling efficient text extraction, web data extraction, and document analysis. For businesses, the shift to DocWire SDK signifies a leap forward. It promises comprehensive document format support and the ability to extract valuable insights from email boxes, databases, and websites using cutting-edge AI. DocWire SDK aims to...
    Downloads: 5 This Week
  • 12
    recovery is a Live DVD/USB aimed at troubleshooting, disk partitioning, system rescue, backup, and restoring data and desktop. It is a customized version of Debian Live. It contains GParted, Clonezilla, Boot-Repair, LibreOffice, and many tools such as ddrescue, Nwipe, TestDisk, DejaDup, and more. recovery is modular in design, meaning programs can be installed simply by double-clicking on module files: https://sourceforge.net/projects/recovery/files/modules/ version 2.5 - 31...
    Downloads: 75 This Week
  • 13
    Open Crypto Tracker

    Bitcoin Alts portfolio tracker, email / text / alexa / telegram alerts

    100% FREE / open source / PRIVATE cryptocurrency portfolio tracker. Email / text / alexa / telegram price alerts, price charts, mining calcs, leverage / gain / loss / balance stats, news feeds +more. Privately track Bitcoin / Ethereum / unlimited cryptocurrencies. Customize as many assets / markets / alerts / charts as you want. Over 50 Exchanges / 40 Trading Pairs Supported (exchanges / pairings list at bottom of README.txt): https://tinyurl.com/ct-readme Nearly Unlimited Assets...
    Downloads: 4 This Week
  • 14
    WP 34s

    Scientific/engineering firmware repurposing HP business calculators!

    This project has created scientific firmware for the HP-20b and HP-30b business calculators. WP 34S turns either of these calculators into a powerful keystroke-programmable scientific device. According to our customers, it's the most powerful and fastest RPN scientific pocket calculator ever built. WP 34S has been alive and stable since 2011. We have succeeded in satisfying the pickiest users; read about their experiences at http://www.hpmuseum.org/forum/forum-8.html. Since 2014, WP...
    Downloads: 21 This Week
  • 15
    Free Weighbridge Software with CCTV

    Smart Weighbridge Software with CCTV camera & WhatsApp Integration

    For any query, contact info@eagleweigh.com or visit www.eagleweigh.com. This is smart, easy-to-operate weighbridge software suitable for all kinds of weighbridges / Dharam Kanta. It provides secure and fraud-free operation of weighbridges with its advanced design and fraud detection features. It is a solution for enterprises and industries looking for an easy, transparent way to automate their weighbridge platforms. It is provided with lifetime...
    Downloads: 7 This Week
  • 16
    AI File Sorter

    Local AI file organization with categorization and rename suggestions

    AI File Sorter is a cross-platform desktop application that uses AI (local LLMs run on your computer) to organize files and suggest meaningful file names based on real content, not just filenames or extensions. The app can analyze images locally and propose descriptive rename suggestions (for example, IMG_2048.jpg → clouds_over_lake.jpg). It can also analyze document text to improve categorization and renaming. Supported formats include PDF, DOCX, XLSX, PPTX, ODT, ODS, ODP, and common...
    Downloads: 227 This Week
  • 17
    dktools - Dirk Krauses tools

    Drawing, graphics conversion, software development, administration.

    GUI and command line tools for advanced users and administrators: wxdkdraw - Minimalistic drawing application for use with LaTeX, wxd2lat - Convert wxdkdraw files to LaTeX, bitmap2pp - Convert PNG/JPEG/TIFF/NetPBM to (E)PS or PDF, fig2lat - Convert XFig files to LaTeX, htmlbook - publish HTML like a book, dkcpre - C debugging and tracing preprocessor, itadmin - manage your IT using a MySQL/MariaDB database, dk-fic - file integrity checker, dk-ls - list files, output column order is configurable, dk-cat, dk-sort, dk-lines - text tools for *x and Windows, dk-send, dk-recv - transmit data stream, dk-t2h, dk-t2l - text to HTML or LaTeX conversion.
    Downloads: 12 This Week
  • 18
    Monitoring AIX, VMware,Oracle, Nutanix

    AIX, Linux, VMware, Nutanix, Oracle, RHV, Cloud performance monitoring

    The tool offers you end-to-end views of your server environment and can save you significant money in operation monitoring by predicting utilization bottlenecks in your virtualized environment. You can also generate policy-based alerts, capacity reports and load forecasts. The product supports these virtualization platforms: - IBM Power Systems - VMware - Nutanix - Proxmox - Huawei FusionCompute - OracleVM - Oracle Solaris LDOM, CDOM, Zone - oVirt / RedHat Virtualization...
    Downloads: 13 This Week
  • 19
    Monitoring Storage, SAN, LAN

    Storage, SAN, LAN Performance Monitoring: IBM, NetApp, Hitachi, HPE, EMC

    The tool offers you end-to-end views of your storage environment, including LAN and SAN, and can save you significant money in operation monitoring and by predicting utilization bottlenecks in your virtualized environment. You can also generate policy-based alerts, view the overall health status of your systems, reduce service downtime, and use capacity and forecasting data. Features: real-time storage performance visibility in multi-vendor LAN and SAN environments; historical reporting (graph, CSV, PDF); alerting based on performance thresholds; storage event monitoring. The tool supports enterprise-class storage devices from major storage vendors such as IBM, Dell EMC, NetApp, HPE, Hitachi, Lenovo, Pure Storage, Huawei, Dot Hill, INFINIDAT, Fujitsu, DataCore, Quantum, QNAP, FalconStor, Ceph, Synology, RAIDIX, Qumulo, Inspur, and Veritas. SAN monitoring supports Brocade, QLogic, and Cisco SAN switches. LAN network monitoring is also covered. Try the demo at http://demo.stor2rrd.com
    Downloads: 14 This Week
  • 20
    Common Resource Grep - crgrep

    Common Resource Grep

    CRGREP searches for matching text in databases, various document formats, archives, and other difficult-to-access resources. It is a command-line tool for name and content text matching in database tables, plain files, MS Office documents, PDF, archives, MP3 audio, image metadata, scanned documents, Maven dependencies, and web resources. CRGREP will search resources within resources of any arbitrary combination or depth, such as text within a document within a zip archive, and so on. Here you will find binary downloads and discussion (https://sourceforge.net/p/crgrep/discussion/). ...
    Downloads: 6 This Week
  • 21
    Form OCR Testing Tool

    A set of tools to use in Microsoft Azure Form Recognizer

    An open source labeling tool for Form Recognizer, part of the Form OCR Test Toolset (FOTT). This is a MAIN branch of the Tool. It contains all the newest features available. This is NOT the most stable version since this is a preview. The purpose of this repo is to allow customers to test the tools available when working with Microsoft Forms and OCR services. Currently, Labeling tool is the first tool we present here. Users could provide feedback, and make customer-specific changes to meet...
    Downloads: 3 This Week
  • 22
    DynaQ

    Innovative text document search. http://dynaq.opendfki.de for details.

    The goal of DynaQ is to develop an inquiry system to explore the personal information space, supporting you with the 'orienteering' search paradigm. DynaQ is a (desktop) search engine with enhanced functionality for file, email, and blog search. See our GitLab homepage for source code and documentation: http://dynaq.opendfki.de
    Downloads: 0 This Week
  • 23
    sar2html
    Sar2html is a web-based frontend for performance monitoring. It converts sar binary data to graphical format and keeps historical data in its database. Project homepage: https://github.com/cemtan/sar2html.git. Supported operating systems: HP-UX 11.11, 11.23, 11.31; Solaris 5.9, 5.10, 5.11; Red Hat 3, 4, 5, 6, 7; SUSE 8, 9, 10, 11, 12; Ubuntu 18, 20. If you have customers facing performance problems on the operating systems listed above, you may send sar2ascii to collect performance data. ...
    Downloads: 0 This Week
  • 24
    Xena - Digital Preservation Software

    Xena transforms files into open data formats

    Xena transforms files into open data formats for long-term digital preservation, encodes content in Base64, and wraps it in XML metadata. Formats supported include MBOX, PST, MSG, DOC, XLS, PPT, RTF, PNG, XML, PDF, JPG, TIFF, PCX, WAV, MP3 and more. NO LONGER MAINTAINED, NO LONGER SUPPORTED
    Downloads: 2 This Week
  • 25

    KlipMan

    A ClipBoard Manager with some pretty unique copy paste features

    For both Windows and Linux: there are many clipboard managers available on the internet, some even better than this one; however, I've tried to put some very unique features in it... * Secondary Clipboard * One-after-other mode: copy, copy, copy, copy; paste, paste, paste, paste (the contents get pasted in the same order you copied them) * CAGR calculator * Permanently save Klippings * Append mode. On GitHub: https://github.com/hemanshukale/KlipMan I've also attached a PDF...
    Downloads: 0 This Week