Showing 22 open source projects for "pdf data mining"

  • 1
    KOReader

    An ebook reader application supporting PDF, DjVu, EPUB, FB2, etc.

    KOReader is a document viewer for E Ink devices. Supported file formats include EPUB, PDF, DjVu, XPS, CBT, CBZ, FB2, PDB, TXT, HTML, RTF, CHM, DOC, MOBI and ZIP files. It runs on embedded devices (Cervantes, Kindle, Kobo, PocketBook, reMarkable), Android and desktop Linux. Developers can run a KOReader emulator on Linux and macOS. Multi-lingual user interface with a highly customizable reader view and many typesetting...
    Downloads: 95 This Week
  • 2
    Ray Tracing in One Weekend Book Series

    The Ray Tracing in One Weekend series of books

    The Ray Tracing in One Weekend series of books are now available to the public for free online. They are now released under the CC0 license. This means that they are as close to public domain as we can get. (While that also frees you from the requirement of providing attribution, it would help the overall project if you could point back to this web site as a service to other users.) These books are formatted for printing directly from your browser, where you can also (on most browsers) save...
    Downloads: 7 This Week
  • 3
    Kiwix

    Wikipedia offline & more

    Kiwix is an offline reader for Web content. It's especially intended to make Wikipedia available offline. With Kiwix, you can enjoy Wikipedia on a boat, in the middle of nowhere... or in jail. Kiwix does that by reading ZIM files, a highly compressed open format with additional metadata.
    Downloads: 211 This Week
  • 4
    WIKINDX

    Virtual Research Environment / On-line Bibliography Manager

    Reference management, bibliography management, citations and a whole lot more. Designed by academics for academics, under continuous development since 2003, and used by both individuals and major research institutions worldwide, WIKINDX is a Virtual Research Environment (an enhanced on-line bibliography manager) storing searchable references, notes, files, citations, ideas, and more. An integrated WYSIWYG word processor exports formatted articles to RTF, DOCX, and HTML. Plugins include...
    Downloads: 78 This Week
  • 5
    WeBooK

    Powerful web collector, HTML editor and ebook builder, three in one.

    Save unlimited webpages across websites: WeBooK saves webpages with an extension. Click on the link, and the webpage is saved. It can also save your bookmarks. Manage and edit files with unlimited folders: WeBooK creates one folder for each file imported from your local drive and converts the files into HTML pages. You can create unlimited folders for your files and drag and drop to change their position. You can search any file by keywords, edit the content, write your own content by creating an...
    Downloads: 0 This Week
  • 6
    TemaTres: controlled vocabulary server

    Manage, Publish and Share Ontologies, Taxonomies, Thesauri, Glossaries

    Web application for managing formal representations of knowledge, thesauri, taxonomies and multilingual vocabularies. For the latest version of the code: https://github.com/tematres/TemaTres-Vocabulary-Server
    Downloads: 5 This Week
  • 7
    hubs-research-acm-chi-2021

    Supplemental code and dataset for the ACM CHI 2021 paper

    Supplemental code and dataset for the ACM CHI 2021 paper on "Proxemics and Social Interactions in an Instrumented Virtual Reality Workshop". In this research paper we instrumented Mozilla Hubs Cloud to record where participants were during the event. From there, we measured proxemics and plotted the activity along with some semi-structured interviews. Virtual environments (VEs) can create collaborative and social spaces, which are increasingly important in the face of remote work and travel...
    Downloads: 0 This Week
  • 8

    jpeg2pdf

    Create PDF from JPEG scans and photos

    Cross-platform command-line tool for creating PDF documents from scans or photos of pages in JPEG (.jpg) format, and the lightest-weight ANSI C library for putting multiple JPEG files into one PDF file. You can add handwritten comments to PDF scans (over the original images) with xournal: http://xournal.sourceforge.net/ It supports graphics tablets and saves comments to PDFs as vector data.
    Downloads: 29 This Week
  • 9
    Xena - Digital Preservation Software

    Xena transforms files into open data formats

    Xena transforms files into open data formats for long-term digital preservation, encodes content in Base64 and wraps it in XML metadata. Formats supported include MBOX, PST, MSG, DOC, XLS, PPT, RTF, PNG, XML, PDF, JPG, TIFF, PCX, WAV, MP3 and more. NO LONGER MAINTAINED, NO LONGER SUPPORTED
    Downloads: 3 This Week
  • 10
    Text2MP3

    PDF/Text to MP3 - Text-to-speech processing

    This project is deprecated. We apologize. Windows application that strips PDFs into text and converts the text to speech. You can also save the extracted text to text files, Word docs, CSVs and RTF format. Browse for PDFs from the web, save them and strip them. Good for students, lecturers, theses and educational purposes. Some bugs yet to fix in the coming weeks, although these do not affect the functionality...
    Downloads: 0 This Week
  • 11
    C++ Travel Customer Choice Model Library
    This project aims to provide a clean API, and the corresponding C++ implementation, for choosing one item among a set of travel solutions, given demand-related characteristics (e.g., Willingness-To-Pay, preferred airline, preferred cabin, etc.).
    Downloads: 0 This Week
  • 12
    PhaseIt

    Graphical PDF Library

    PhaseIt is a PDF database program for your literature projects. It lets you import PDFs using simple drag-and-drop and offers fast, graphical categorization of your files, as well as easy access via your favorite installed PDF reader.
    Downloads: 0 This Week
  • 13
    JChart2D

    JChart2D is a real-time charting library written in Java.

    JChart2D is an easy-to-use component, written in Java, for displaying two-dimensional traces in a coordinate system. It supports real-time (animated) charting, custom trace rendering, multithreading, viewports, automatic scaling and labels. Former UI controls (right-click context menu, file menu) have been ported to the subproject jchart2d-uimenu (https://sourceforge.net/projects/jchart2d-uimenu.jchart2d.p/) so that the library has no dependencies on third-party libraries.
    Downloads: 1 This Week
  • 14
    Carnatic Music Typesetting
    An open-source typesetting environment for editing and publishing Carnatic music books in Indian languages. Supports phonetic translation of notation and lyrics, and uses the CFugue Runtime to automatically generate MIDI song files from the music notation.
    Downloads: 1 This Week
  • 15
    jCompoundMapper
    Library for fingerprinting (decomposition) of chemical compounds. It has several tweaking possibilities and exporting options for data mining toolkits.
    Downloads: 0 This Week
  • 16

    TML - Text Mining Library for LSA & CMM

    TML is a Java Library for LSA and extracting Concept Maps from text

    TML has moved to http://www.villalon.cl/tml.html and the code to https://github.com/villalon/tml
    Downloads: 0 This Week
  • 17
    Atarrabi

    A web-based workflow application for publishing environmental data

    Atarrabi is a web-based workflow application used for preparing meteorological research data for persistent identifier registration. This software will not run out-of-the-box. Please visit our web site and contact us to learn more.
    Downloads: 0 This Week
  • 18
    PDFcat will be a platform-independent Java application that helps manage thousands of books, articles, lecture notes and music sheets in PDF format. I also want to support TXT, DjVu and zipped packages. I will use SQLite for the sake of portability.
    Downloads: 0 This Week
  • 19
    Facilitates running data mining and natural language processing experiments on weblogs, such as classification, clustering and rating. As part of these experiments, it is possible to apply Latent Semantic Analysis.
    Downloads: 0 This Week
  • 20
    It is intended for managing papers in a bibliography. Entries can be added as BibTeX or through a form. A PDF file can be attached to every entry. The entries can be searched by author, topic, etc. A list of entries can be exported as BibTeX or Word (RTF).
    Downloads: 0 This Week
  • 21
    KNN-WEKA provides an implementation of the K-nearest neighbour algorithm for Weka. Weka is a collection of machine learning algorithms for data mining tasks. For more information on Weka, see http://www.cs.waikato.ac.nz/ml/weka/.
    Downloads: 0 This Week
  • 22
    Data Mining Platform is a platform for data mining and analysis. It contains many new and sophisticated methods such as kernel-based classification, two-way clustering, Bayesian networks, pattern recognition for time series analysis and many other
    Downloads: 0 This Week