Showing 257 open source projects for "data mining"

  • 1

    joy of text

    Editor with scripting language, security features & system interfaces.

    ...It is particularly useful for checking and cross-referencing between several source, intermediate and output files - a common requirement for CAD work. But jot's usefulness doesn't stop there. Its sophisticated search features can, for example, be used for interactive data mining or for automating the extraction of numerical and textual data and reports from arrays of large text files. Its adaptable user interface can be programmed to emulate emacs or vi UIs or mouse-driven systems - but who would want to do a thing like that? The display is highly configurable, supporting popups, menus-event mouse callbacks etc. ...
    Downloads: 0 This Week
  • 2
    General Knowledge Machine Project

    Intellect Modeling Kit: assisting research, diagnostics, consulting

    ...Intellect Modeling Kit (IMK) is intended for building knowledge machines (KM) that assist experts in the following steps of activity: * Observation; * Producing propositions based on knowledge; * Elimination of impossible propositions; * Selection and verification of the most appropriate propositions; * Memorizing - new knowledge item creation; * Abstraction - building objects representing typical signs of similar object groups, data mining. A KM is not intended to replace human experts; it is built to multiply their abilities. The machine should not be responsible for decisions. The IMK is designed to create ready-to-use software applications using simple text files. Any human knowledge can be uploaded to a KM by an expert not familiar with software coding. Demos are included in the kit. ...
    Downloads: 1 This Week
  • 3
    paramspider

    Mine parameterized URLs from web archives for security testing

    ParamSpider is an open source command-line tool designed to discover URLs that contain parameters by mining historical data from web archives such as the Wayback Machine. It helps security researchers, penetration testers, and bug bounty hunters collect potential attack surfaces by automatically gathering archived URLs related to a specific domain. Instead of returning every discovered URL, the tool intelligently filters results to highlight parameterized endpoints that are more useful for vulnerability testing. ...
    Downloads: 2 This Week
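
As a rough illustration of the filtering idea in the ParamSpider description above, the sketch below keeps only URLs that carry query parameters and drops near-duplicates. It is not ParamSpider's own code: the function name `parameterized_urls`, the deduplication rule, and the sample URLs are all assumptions made for this example.

```python
from urllib.parse import urlsplit, parse_qsl


def parameterized_urls(urls):
    """Keep URLs that have at least one query parameter.

    Hypothetical helper: two URLs count as duplicates when they share
    the same host, path, and set of parameter names.
    """
    seen = set()
    kept = []
    for url in urls:
        parts = urlsplit(url)
        # keep_blank_values=True so "?id=" still counts as a parameter
        params = parse_qsl(parts.query, keep_blank_values=True)
        if not params:
            continue  # no parameters: not interesting for this filter
        key = (parts.netloc, parts.path, tuple(sorted(name for name, _ in params)))
        if key in seen:
            continue  # same endpoint and parameter names already kept
        seen.add(key)
        kept.append(url)
    return kept


# Made-up sample input, standing in for a list of archived URLs
sample = [
    "https://example.com/",
    "https://example.com/search?q=test",
    "https://example.com/search?q=other",
    "https://example.com/item?id=7&ref=home",
    "https://example.com/about",
]
print(parameterized_urls(sample))
```

Deduplicating on parameter names rather than values is one possible design choice; it keeps a single representative per endpoint, which is usually enough to decide what to test.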
  • 4
    SentimentAnalysis-Rick&Morty

    Rick & Morty Sentiment Analysis - End-of-Degree Project - UNIR

    The remarkable progress in the field of Big Data has driven the development of new technologies in natural language processing and data analysis. Text mining is a fascinating application of data analysis that extracts relevant information from related writings in different linguistic contexts. In natural language processing, sentiment analysis and classification stand out as key applications supported by text mining.
    Downloads: 0 This Week
  • 5
    crawly

    High-level web crawling and scraping framework for Elixir apps

    Crawly is a high-level application framework for crawling websites and extracting structured data using the Elixir programming language. It provides a complete environment for building web crawlers that systematically visit pages, collect information, and transform that data into structured formats for further processing. Crawly is designed for tasks such as data mining, information processing, and building historical archives of web content.
    Downloads: 4 This Week
  • 6
    DataMelt

    Computation and Visualization environment

    DataMelt (or "DMelt") is an environment for numeric computation, data analysis, computational statistics, and data visualization. This Java multiplatform program is integrated with several scripting languages such as Jython (Python), Groovy, JRuby, BeanShell. DMelt can be used to plot functions and data in 2D and 3D, perform statistical tests, data mining, numeric computations, function minimization, linear algebra, solving systems of linear and differential equations. ...
    Downloads: 6 This Week
  • 7

    BitMagic Library

    Compressed bit-sets, sparse bit matrices and algorithms

    BitMagic - C and C++ library implementing dynamic bitvectors and bit-set algorithms with several types of on-the-fly, adaptive compression. Designed for use in databases, search systems, data-mining algorithms, scientific projects. The core of the library is C++, but it provides C-compatibility wrappers and can be compiled without C++ runtime. Optimizations for Intel SSE2, SSE4.2 and AVX2.
    Downloads: 7 This Week
  • 8
    Text Analysis Markup System
    Text Analysis Markup System (TAMS) is both a system of marking documents for qualitative analysis and a series of tools for mining information based on that syntax.
    Downloads: 9 This Week
  • 9
    Rachota

    Easy-to-use personal time tracking application for Windows and Unix

    Rachota is a portable application for time-tracking your work on different projects. It runs on Windows, Linux, Solaris and Mac. Rachota displays time data in diagram form, creates customized reports or invoices and provides suggestions to increase your efficiency.
    Downloads: 15 This Week
  • 10
    Karate Club

    An API Oriented Open-source Python Framework for Unsupervised Learning

    Karate Club is an unsupervised machine learning extension library for NetworkX. Karate Club consists of state-of-the-art methods to do unsupervised learning on graph-structured data. To put it simply, it is a Swiss Army knife for small-scale graph mining research. First, it provides network embedding techniques at the node and graph level. Second, it includes a variety of overlapping and non-overlapping community detection methods. Implemented methods cover a wide range of network science (NetSci, Complenet), data mining (ICDM, CIKM, KDD), artificial intelligence (AAAI, IJCAI) and machine learning (NeurIPS, ICML, ICLR) conferences, workshops, and pieces from prominent journals.
    Downloads: 0 This Week
  • 11
    Pattern

    Web mining module for Python, with tools for scraping

    Pattern is an open-source Python library that provides tools for web mining, natural language processing, machine learning, and network analysis. The project integrates multiple capabilities into a single framework that allows developers to collect, process, and analyze textual data from the web. It includes modules for web scraping and crawling that can retrieve information from sources such as social media platforms, search engines, and online knowledge bases.
    Downloads: 11 This Week
  • 12

    OpenVigil

    Open pharmacovigilance data extraction, mining and analysis tool

    OpenVigil provides a web interface to analyse pharmacovigilance data, i.e., spontaneous or systematic collections of treatments (drugs) and observed adverse events ("drug side effects"). FDA Adverse Event Reporting System (AERS) and other pharmacovigilance data (e.g., Canadian or German) are supported. The OpenVigil web-based analysis tools offer several analysis modes, such as extraction, filtering and mining of data, analyses via measures of disproportionality (such as the proportional reporting ratio or the reporting odds ratio), and export to spreadsheet programs like Microsoft Excel or statistics programs like R. ...
    Downloads: 21 This Week
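
The two disproportionality measures named in the OpenVigil description can be sketched from a 2x2 contingency table of reports. This is a minimal illustration using the standard textbook definitions, not OpenVigil's implementation; the function names and the counts are made up for the example. Here `a` is the number of reports with the drug and the event, `b` with the drug but without the event, `c` with other drugs and the event, and `d` with other drugs and without the event.

```python
def proportional_reporting_ratio(a, b, c, d):
    """PRR = [a / (a + b)] / [c / (c + d)], rearranged to one division."""
    denominator = c * (a + b)
    if denominator == 0:
        raise ValueError("PRR is undefined when c == 0 or a + b == 0")
    return a * (c + d) / denominator


def reporting_odds_ratio(a, b, c, d):
    """ROR = (a / b) / (c / d) = (a * d) / (b * c)."""
    denominator = b * c
    if denominator == 0:
        raise ValueError("ROR is undefined when b == 0 or c == 0")
    return a * d / denominator


# Made-up counts for illustration only
counts = dict(a=20, b=80, c=10, d=190)
print(proportional_reporting_ratio(**counts))  # 4.0
print(reporting_odds_ratio(**counts))          # 4.75
```

A value well above 1 for either measure suggests the event is reported disproportionately often with the drug; in practice such a signal would be read together with confidence intervals and minimum case counts, which this sketch leaves out.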
  • 13
    OmicSelector

    Feature selection and deep learning modeling for omic biomarker study

    OmicSelector is an environment, Docker-based web application, and R package for biomarker signature selection (feature selection) from high-throughput experiments and others. It was initially developed for miRNA-seq (small RNA, smRNA-seq; hence the original name, miRNAselector), RNA-seq and qPCR, but can be applied to any problem where numeric features should be selected to counteract overfitting of the models. Using our tool, you can choose features, like miRNAs, with the most significant...
    Downloads: 0 This Week
  • 14
    The Lemur Project

    Search engine and data mining applications and ClueWeb datasets.

    The Lemur Project develops search engines, browser toolbars, text analysis tools, and data resources that support research and development of information retrieval and text mining software, including the Indri search engine in C++, the Galago search engine research framework in Java, the RankLib learning to rank library, ClueWeb09 and ClueWeb12 datasets and the Sifaka data mining application.
    Downloads: 30 This Week
  • 15

    SciXMiner

    Open-Source MATLAB toolbox for multidimensional data mining

    SciXMiner is a versatile MATLAB toolbox for the analysis of multidimensional data. It is the successor of Gait-CAD (https://sourceforge.net/projects/gait-cad/).
    Downloads: 1 This Week
  • 16

    ADaMSoft

    Open Source and data mining software

    ADaMSoft is free and open source data mining software developed in Java. It contains data management methods and can create ready-to-use reports. It can read data from several sources and write the results in different formats.
    Downloads: 3 This Week
  • 17
    Predictive Model Markup Language (PMML)
    PMML (Predictive Model Markup Language) provides a standard way to represent data mining models so that these can be shared between different statistical applications.
    Downloads: 10 This Week
  • 18

    GMATA software for Genomic SSR marker

    Genome-wide Microsatellite Analyzing Toward Application: GMATA

    GMATA v21: Genome-wide Microsatellite Analyzing Toward Application (GMATA) is software for Simple Sequence Repeat (SSR) analysis and for SSR marker design and mapping in any DNA sequence. It has the following functions: 1. SSR mining; 2. Statistical analysis and plotting; 3. SSR loci graphic viewing; 4. Marker designing; 5. Electronic mapping and marker transferability investigation. GMATA is accurate, sensitive and fast. It was designed to process large genomic sequence data sets, especially large whole genome sequences. In theory, genomes of any size can be analyzed by GMATA easily. ...
    Downloads: 2 This Week
  • 19
    DynaQ

    Innovative text document search. http://dynaq.opendfki.de for details.

    The goal of DynaQ is to develop an inquiry system for exploring your personal information space, supporting you with the 'orienteering' search paradigm. DynaQ is a (desktop) search engine with enhanced functionality for file, email and blog search. See our GitLab homepage for source code and documentation: http://dynaq.opendfki.de
    Downloads: 0 This Week
  • 20
    VIKAMINE is a flexible environment for visual analytics, data mining and business intelligence - implemented in pure Java. It features several powerful visualization and mining methods, and can utilize background knowledge.
    Downloads: 0 This Week
  • 21
    Isolation Similarity

    aNNE similarity based on Isolation Kernel

    Demo of using aNNE similarity for DBSCAN. Written by Xiaoyu Qin, Monash University, March 2019, version 1.0. This software is under the GNU General Public License version 3.0 (GPLv3). This code is a demo of the method described in the following publication: Qin, X., Ting, K.M., Zhu, Y. and Lee, V.C., 2019, July. Nearest-neighbour-induced isolation similarity and its impact on density-based clustering. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 33, pp....
    Downloads: 0 This Week
  • 22
    Downloads: 0 This Week
  • 23
    EUCoin

    Pre-alpha test version of an upcoming stable coin programmed in Pascal

    EUCoin v. 0.1.0 (temporary name). Pre-alpha test version of an upcoming stable coin programmed in Pascal / Lazarus / Delphi. EUCoin is based on Pascal Coin version 1.4.3. The maximum supply of coins is generated in block 0. New users will normally have to buy coins from a dealer or an exchange. Miners don't get a direct fee for mining. Team members, developers, supporters and miners will share transaction fees between them. There is a built-in miner. Click Allow mining to start it. It works even if the node is alone in the world. For testing, all nodes get a common key pair that has access to accounts 11-53 and a few empty accounts. Place the included WalletKeys.dat in the Data directory. ...
    Downloads: 0 This Week
  • 24

    FastaTools

    Performs several operations to Fasta protein databases

    ...For more information, you can have a look at the README.md file in the source code tree: https://sourceforge.net/p/lp-csic-uab/fastatools/code/ci/default/tree/README.md Or you can download the Documentation and Tutorial PDF file in the Files section: https://sourceforge.net/projects/fastatools.lp-csic-uab.p/files/FastaTools%20Documentation%20and%20Tutorials.pdf - Gallardo, Ó., Ovelleiro, D., Gay, M., Carrascal, M., & Abian, J. (2014). A collection of open source applications for mass spectrometry data mining. PROTEOMICS, 14(20), 2275–2279. https://doi.org/10.1002/pmic.201400124
    Downloads: 0 This Week
  • 25
    LymPHOS2

    LymPHOS2 Web-App

    ...Proteomics 2009, 9, 3741–3751. DOI: 10.1002/pmic.200800701 - Gallardo, Ó., Ovelleiro, D., Gay, M., Carrascal, M., Abian, J., A collection of open source applications for mass spectrometry data mining. Proteomics 2014, 20, 2275-2279. DOI: 10.1002/pmic.20140012
    Downloads: 0 This Week