Showing 257 open source projects for "data mining"

  • 1

    Random Bits Forest

    RBF: a Strong Classifier/Regressor for Big Data

    We present a classification and regression algorithm called Random Bits Forest (RBF). RBF integrates neural networks (for depth), boosting (for width), and random forests (for accuracy). It first generates and selects ~10,000 small three-layer threshold random neural networks as a basis using a gradient boosting scheme. These binary bases are then fed into a modified random forest algorithm to obtain predictions. In conclusion, RBF is a novel framework that performs strongly especially on data...
    Downloads: 1 This Week
  • 2
    GFP- GAKNN
    GAKNN is data mining software for gene annotation data. It is built on the k-Nearest Neighbour algorithm optimized by a genetic algorithm. Gene annotation datasets saved in .csv or .arff format with Gene Ontology or FunCat categorization can be used with GAKNN to predict gene functions.
    Downloads: 0 This Week
  • 3

    LorDG

    3D genome reconstruction with Lorentzian objective function

    LorDG is software for building 3D chromosome and genome models from chromosomal contact data such as Hi-C, TCC, and 5C. Details on how to use it are at https://github.com/BDM-Lab/Lordg and the method is published in Nucleic Acids Research: http://nar.oxfordjournals.org/content/early/2016/11/29/nar.gkw1155.long
    Contact: Bioinformatics, Data Mining, Machine Learning (BDM) Laboratory, Jianlin Cheng, PhD, Department of Computer Science, University of Missouri, Columbia. Email: chengji@missouri.edu
    Downloads: 0 This Week
  • 4

    EpiMINE

    program for mining epigenomic data

    EpiMINE is a program designed for mining epigenomic data. It performs genome-wide quantitative and correlative analysis between different annotated or raw ChIP-seq-like datasets in regions of interest (RI). In addition, the tool can link results with expression data. It is available with both a graphical user interface and a command-line interface.
    Downloads: 0 This Week
  • 5
    Kohonen neural network library is a set of classes and functions for designing, training, and using a Kohonen network (self-organizing map), which is an AI algorithm and a useful tool for data mining and knowledge discovery in data (http://knnl.sf.net).
    Downloads: 0 This Week
  • 6
    Molecular Simulation Grid

    Provides high performance computing power and state of the art tools

    MoSGrid focuses on the configuration and provision of Grid services for molecular simulations, and on annotating the results with metadata and providing them for data mining and knowledge generation. It is based on Liferay technology together with gUSE.
    Downloads: 0 This Week
  • 7

    iCubing

    Several OLAP algorithms, data structures and HPC OLAP versions

    OLAP technology is very useful for decision makers and for data mining tools working with big data. To that end, the iCubing project implements several multidimensional data cube approaches for cube indexing, querying, updating, and mining. There are also several cube types, i.e., alphanumeric cubes, text cubes with unstructured data, and geo cubes with geo types, dimensions, measures, and hierarchies, so OLAP remains a hard challenge more than 20 years after the seminal paper of Jim Gray et al. in 1997. ...
    Downloads: 0 This Week
  • 8
    GUI Ant-Miner is a tool for extracting classification rules from data. It is an updated version of a data mining algorithm called Ant-Miner (Ant Colony-based Data Miner).
    Downloads: 0 This Week
  • 9
    JaCa-DDM

    DDM powered by the JaCa model

    JaCa-DDM is a novel agent-based Distributed Data Mining system founded on the Agents & Artifacts paradigm, conceived to design, implement, deploy, and evaluate distributed learning strategies.
    Downloads: 0 This Week
  • 10

    DaMold

    a data mining platform for variant annotation and visualization.

    DaMold is a powerful, integrated, web-based, and user-friendly tool to filter, annotate, cross-link, and visualize NGS, Sanger, and hotspot variants. It is easy-to-use software that provides flexible input options and accepts variants in VCF and BED formats. For each variant, DaMold predicts the variant effect, such as codon change and amino acid change. Furthermore, it cross-links each variant with more than 30 clinically relevant public databases, which contain already reported SNPs and...
    Downloads: 0 This Week
  • 11
    Mass-based dissimilarity

    A data dependent dissimilarity measure based on mass estimation.

    This software calculates the mass-based dissimilarity matrix for data mining algorithms that rely on a distance measure. Reference: Overcoming Key Weaknesses of Distance-based Neighbourhood Methods using a Data Dependent Dissimilarity Measure. KDD 2016 http://dx.doi.org/10.1145/2939672.2939779 The source code, presentation slides, and poster are attached under "Files". The KDD 2016 presentation video is published at https://youtu.be/eotD_-SuEoo .
    Downloads: 0 This Week
  • 12
    RAFTS3

    Rapid Alignment Free Tool for Sequences Similarity Search

    ...RAFTS3 performed searches many times faster than those with BLASTp against large protein databases such as NR and Pfam, with a small loss of sensitivity depending on the degree of similarity of the sequences. RAFTS3 is a new alternative for fast comparison of protein sequences, genome annotation and biological data mining. Preprint: http://dx.doi.org/10.1101/055269 Precomputed databases evaluated in the paper are available for download at http://www.bioinfo.ufpr.br/software/rafts3
    Downloads: 0 This Week
  • 13

    smartsbc

    Smart Session Border Controller

    SmartSBC (c) is a general-purpose security framework for Session Border Controlling (SBC), designed with two general considerations in mind: a) be smart, to stay a step ahead of firewalling techniques and to introduce dynamic adaptation with complex and variable threat heuristics policies; b) be pluggable, to allow more protection strategies to be available and focused on random threats. 2. Design SmartSBC (c) is a multithreaded service which contains several embedded and pluggable...
    Downloads: 0 This Week
  • 14
    QAL

    Query Abstraction Layer

    Project has moved to: https://github.com/OptimalBPM/qal QAL is a collection of libraries for mining, transforming, and writing data from and to a number of places. Sources and destinations include different SQL and NoSQL backends and file formats such as .csv, XML, and Excel, and even untidy HTML web pages. It has a database abstraction layer that supports connectivity to Postgres, MySQL, DB2, Oracle, and MS SQL Server. JSON and MongoDB support is coming.
    Downloads: 0 This Week
  • 15
    ankus

    Data Mining and Machine Learning Algorithms based on MapReduce

    [The features of ankus]
    * ankus is a 'web-based big data mining project and tool'.
      - MapReduce-based data mining/machine learning algorithms library
      - Hadoop-based distributed big data system
      - offering a web-based GUI for easy use
    [The ankus project & License]
    * The ankus project consists of three as an open source.
    * ankus is dual-licensed under the community and commercial licenses.
    Downloads: 0 This Week
  • 16

    ExAM-Exome_Analysis_And_Mining

    A whole exome sequencing analysis package and its graphical interface

    During the past few years, whole exome sequencing has established itself in genetic research, largely due to its use for detecting causative mutations responsible for Mendelian disorders. As a consequence of the power and rapidly decreasing cost of these technologies, massive amounts of exome sequencing data are being generated and becoming available to a broadening community of scientists. However, these data remain difficult to analyze and interpret by the general scientific community,...
    Downloads: 0 This Week
  • 17
    The Java Data Mining Package (JDMP) is a library that provides methods for analyzing data with the help of machine learning algorithms (e.g. clustering, classification, graphical models, neural networks, Bayesian networks, text processing, optimization).
    Downloads: 1 This Week
  • 18
    This site contains four packages for mass and mass-based density estimation. 1. The first package covers basic mass estimation (including one-dimensional mass estimation and Half-Space Tree based multi-dimensional mass estimation). This package contains the code needed to run it in MATLAB. 2. The second package includes the source and object files of DEMass-DBSCAN to be used with the WEKA system. 3. The third package, DEMassBayes, includes the source and object files of a Bayesian...
    Downloads: 0 This Week
  • 19
    ExSTraCS

    Extended Supervised Tracking and Classifying System

    This advanced machine learning algorithm is a Michigan-style learning classifier system (LCS) developed to specialize in classification, prediction, data mining, and knowledge discovery tasks. Michigan-style LCS algorithms constitute a unique class of algorithms that distribute learned patterns over a collaborative population of individually interpretable IF:THEN rules, allowing them to flexibly and effectively describe complex and diverse problem spaces. ExSTraCS was primarily developed to address problems in epidemiological data mining, identifying complex patterns that relate predictive attributes in noisy datasets to disease phenotypes of interest. ...
    Downloads: 0 This Week
  • 20
    Web Application Protection

    Tool to detect and correct vulnerabilities in PHP web applications

    ...WAP detects the following vulnerabilities:
    - SQL injection using MySQL, PostgreSQL and DB2 DBMS
    - Reflected cross-site scripting (XSS)
    - Stored XSS
    - Remote file inclusion
    - Local file inclusion
    - Directory traversal
    - Source code disclosure
    - OS command injection
    - PHP code injection
    WAP is a static analysis tool that performs taint analysis to detect vulnerabilities, tracking malicious user inputs and checking whether they reach calls to sensitive functions. It has a low rate of false positives because it implements a data mining module that predicts false positives when it detects vulnerabilities. The tool outputs:
    - the vulnerabilities found and how they are corrected
    - new files with the corrections
    Downloads: 5 This Week
  • 21

    Copy Number Explorer

    Interactive Copy Number Analysis for Cancer Genomics

    Deployed online at https://arraycgh.shinyapps.io/Copy_Number_Explorer/ and https://arraycgh.shinyapps.io/Copy_Number_Explorer_Survival/ Copy Number Explorer is a data mining tool for cancer researchers interested in structural and copy number changes. Huge volumes of genomics data from nearly every cancer type are now freely available, and several online databases have begun to collate and store this information. However, current tools focus on individual gene queries rather than the chromosomal and region-based queries more relevant to some researchers. ...
    Downloads: 0 This Week
  • 22
    PROPER is a package for visual evaluation of ranking classifiers for biological big data mining studies in the mathematical language MATLAB. It is an efficient tool for optimization and comparison of the state-of-the-art ranking classifiers by generating over 20 different high quality two- and three-dimensional performance curves.
    Downloads: 0 This Week
  • 23

    MashPalsP2p

    P2P Social Networking application under Linux

    ...Online websites providing social networking services are very popular, but people don’t normally notice their limitations, which include privacy concerns, the requirement of internet connectivity, and unauthorized data mining of user data. MashPalsP2p is a kick-starter project built around the idea of overcoming these problems while building an efficient system for providing social networking services. Peer-to-peer (P2P) computing or networking is a distributed application architecture that partitions tasks or workloads between peers. Peers are equally privileged, equipotent participants in the application. ...
    Downloads: 0 This Week
  • 24
    Ponte
    ...The trials will be designed and planned through a flexible authoring tool, enabling semantic interoperability of clinical care information systems with clinical research information systems and drug and disease knowledge databases, as well as the application of advanced data mining techniques and enhanced learning algorithms.
    Downloads: 0 This Week
  • 25
    C++ Travel Customer Choice Model Library
    This project aims to provide a clean API, and the corresponding C++ implementation, for choosing one item among a set of travel solutions, given demand-related characteristics (e.g., Willingness-To-Pay, preferred airline, preferred cabin, etc.).
    Downloads: 0 This Week