Showing 257 open source projects for "data mining"

  • 1
    An open source framework for LC-MS based proteomics and metabolomics. OpenMS offers data structures and algorithms for processing mass spectrometry data. The library is written in C++. Our source code and wiki live on GitHub (https://github.com/OpenMS/OpenMS).
    Downloads: 5 This Week
  • 2
    GEOMS2

    Geostatistics and geosciences modeling software

    ...attredirects=0&d=1 http://sourceforge.net/projects/geoms2/files/Mining.7z/download
    Downloads: 19 This Week
  • 3
    xLearn

    High performance, easy-to-use, and scalable machine learning (ML)

    xLearn is a high-performance, easy-to-use, and scalable machine learning package that contains linear models (LR), factorization machines (FM), and field-aware factorization machines (FFM), all of which can be used to solve large-scale machine learning problems. xLearn is especially useful for machine learning problems on large-scale sparse data. Many real-world datasets deal with high-dimensional sparse feature vectors, as in a recommendation system where the number of categories and...
    Downloads: 0 This Week
  • 4
    TEXT2DATA

    Text Analytics Platform

    Bring a text analytics platform that uses NLP (Natural Language Processing) and machine learning to your work environment. Extract essential information from your text documents and let artificial intelligence save you time. Get detailed and agile reports on your unstructured data.
    Downloads: 0 This Week
  • 5
    When data mining techniques are applied to discover useful knowledge in a large data collection, they must be able to preserve confidential information, such as sensitive frequent itemsets and rules. A feasible way to ensure confidentiality is to sanitize the database and conceal the sensitive information. However, the sanitization process often produces side effects, so minimizing these side effects is an important task.
    Downloads: 0 This Week
  • 6
    GMOL

    A tool for 3D genome structure visualization

    ...It allows users to view the genome structure at multiple scales, including: global, chromosome, loci, fiber, nucleosome, and nucleotide. This software was built upon the pre-existing Jmol package by Prof. Cheng's group. The software is developed in Prof. Jianlin Cheng's Bioinformatics, Data Mining and Machine Learning Laboratory in the Computer Science Department at the University of Missouri - Columbia, USA. The project is supported by the National Science Foundation (grant no. DBI1149224). If you use GMOL in your research, please cite: Nowotny, Jackson, Avery Wells, Oluwatosin Oluwadare, Lingfei Xu, Renzhi Cao, Tuan Trieu, Chenfeng He, and Jianlin Cheng. ...
    Downloads: 3 This Week
  • 7
    InflationCoin (IFLT)

    Standard Proof of Stake Cryptocurrency.

    Downloads: 0 This Week
  • 8
    Siamese and triplet learning

    Siamese and triplet networks with online triplet mining in PyTorch

    ...The repository demonstrates how to train these models using contrastive loss and triplet loss functions, which encourage embeddings of similar samples to be close while pushing dissimilar samples farther apart. It includes data loaders, training scripts, neural network architectures, and evaluation metrics that allow researchers to experiment with different embedding learning strategies. The project also implements online pair and triplet mining techniques to efficiently generate training examples during model training.
    Downloads: 2 This Week
  • 9

    PanoramaServer

    Open Source Panorama Server for free virtual tour of 360 degrees views

    Ideal for creating virtual tours from panoramic views for all sorts of purposes: property exhibitions for brokers at real estate agencies and property agents, tour guides for indoor and outdoor venues, information on public and private facilities for curators, travel journals for tourists, backdrop settings for storytelling, treasure-hunt-like games, big data mining for patterns through computer vision in artificial intelligence, etc. It is like creating your own Google Maps Street View. All the user needs is photos in equirectangular (panorama) format, taken with the 3D cameras commonly used on site. The PanoramaServer can reference these images to create virtual tours with 360-degree views in which viewers can navigate to different locations, view information, etc. ...
    Downloads: 3 This Week
  • 10
    GPU Tracking nvidia-smi

    Open Hardware Monitor watcher that uploads data to a server in real time.
    Downloads: 2 This Week
  • 11

    A Weather

    Free application for weather data

    A weather data analysis application built on freely available, open data mining tools and standards, including a Chrome extension, Google spreadsheets, and Python scripting. The Chrome extension collects weather data from Google search results and appends it to a Google spreadsheet. The stored information (time, place, temperature, humidity, etc.) can then be used to find out whether changes in temperature affect relative humidity.
    Downloads: 0 This Week
  • 12

    GreMod

    GreMod: Real-time Incremental Community Detection

    This is the source code for the paper: Shang, Jiaxing, Lianchen Liu, Feng Xie, Zhen Chen, Jiajia Miao, Xuelin Fang, and Cheng Wu. "A real-time detecting algorithm for tracking community structure of dynamic networks." In 6th SNA-KDD Workshop, 2012. If you use this code, please cite this paper.
    Downloads: 0 This Week
  • 13
    This project aims to develop and share fast frequent subgraph mining and graph learning algorithms. Currently we release the frequent subgraph mining package FFSM; later we will add new functions for graph regression and classification.
    Downloads: 0 This Week
  • 14

    BioRec: Bird Census field data annotation

    Recognizing biological data from a notebook.

    This project helps to digitize field data for a particular bird census method: a census based on personal inspection of small (~10 km^2) regions, with birds' positions and behaviour recorded on paper. The project makes it easy to annotate such field data and to make the data available for statistical analysis.
    Downloads: 0 This Week
  • 15
    wgssat_nbfgr

    WGSSAT: SSR Annotation Pipeline

    WGSSAT provides a graphical user interface pipeline to mine and characterize SSRs from whole-genome data. The pipeline integrates prediction of genes, ncRNA, repeats, and SSRs from a whole-genome assembly, mapping of the predicted SSRs onto the genome (classified according to genes, ncRNA, repeats, and exonic and intronic regions), primer design, and mining of cross-species amplification markers. Mining SSRs from the whole genome provides valuable information on the abundance of SSRs in various genomic regions and will also facilitate the development of markers for genetic analysis and related applications, such as marker-assisted breeding and linkage mapping.
    Downloads: 0 This Week
  • 16

    CUBiBit

    Tool to search binary biclusters on CUDA-enabled GPUs

    CUBiBit is a parallel tool to accelerate the search for biclusters in binary datasets using CUDA-enabled GPUs. This data mining procedure is especially useful for gene expression data. The tool takes as input a file with the ARFF extension that contains the binary values of m attributes and n samples, and returns a file with the biclustering information. It is able to exploit the parallel capabilities of manycore NVIDIA GPUs. It also uses C++11 multithreading support to accelerate one phase of the algorithm on several CPU cores.
    Downloads: 0 This Week
  • 17
    Bolt ML

    10x faster matrix and vector operations

    Bolt is an open-source research project focused on accelerating machine learning and data mining workloads through efficient vector compression and approximate computation techniques. The core idea behind Bolt is to compress large collections of dense numeric vectors and perform mathematical operations directly on the compressed representations instead of decompressing them first. This approach significantly reduces both memory usage and computational overhead when working with high-dimensional data commonly used in machine learning systems. ...
    Downloads: 0 This Week
  • 18
    In-Close

    In-Close is a fast Formal Concept miner.

    In-Close is a fast Formal Concept Miner and Tree Builder for FCA files in the cxt format and FIMI .dat files. Features include minimum support for intent and extent, output of analysis data, output of concepts, output of a reduced context using minimum support values, and output of a sorted cxt file (so it can convert dat to cxt format).
    Downloads: 0 This Week
  • 19
    MYRA

    A collection of ACO algorithms for the data mining classification task

    MYRA is a collection of Ant Colony Optimization (ACO) algorithms for the data mining classification task. It includes popular rule induction and decision tree induction algorithms. The algorithms are ready to be used from the command line or can be easily called from your own Java code. They are built using a modular architecture, so they can be easily extended to incorporate different procedures and/or use different parameter values.
    Downloads: 7 This Week
  • 20

    SPAWNN

    SPatial Analysis With self-organizing Neural Networks

    The SPAWNN toolkit is an innovative toolkit for spatial analysis with self-organizing neural networks which is particularly useful for spatial analysis, visualization and geographical data mining. To run the toolkit, simply download and execute (double-click) the jar-file. Please cite: - Hagenauer, J., & Helbich, M. (2016). SPAWNN: A Toolkit for SPatial Analysis With Self-Organizing Neural Networks. Transactions in GIS, 20(5), 755-775. Other related publications: - Hagenauer, J. (2016). Weighted merge context for clustering and quantizing spatial data with self-organizing neural networks. ...
    Downloads: 1 This Week
  • 21

    PyDaMelo

    Python-compatible Data mining elementary objects

    An attempt to offer machine learning and data mining algorithms at the finest granularity we can, as Lego-like bricks that are easy to combine through Python scripting.
    Downloads: 0 This Week
  • 22
    All future developments will be implemented in the new MATLAB toolbox SciXMiner. Please visit https://sourceforge.net/projects/scixminer/ to download the newest version. The former MATLAB toolbox Gait-CAD was designed for the visualization and analysis of time series and features, with a special focus on data mining problems including classification, regression, and clustering.
    Downloads: 0 This Week
  • 23
    Lattice Miner is a data mining prototype for creating, visualizing and exploring concept (Galois) lattices. It allows the generation of formal concepts and association rules.
    Downloads: 1 This Week
  • 24

    BISD

    Batch incremental SNN-DBSCAN clustering algorithm

    Incremental data mining algorithms process frequent updates to dynamic datasets efficiently by avoiding redundant computation. The existing incremental extension to the shared nearest neighbor density-based clustering (SNND) algorithm cannot handle deletions from the dataset and handles insertions only one point at a time. We present an incremental algorithm that overcomes both bottlenecks by efficiently identifying the affected parts of clusters while processing updates to the dataset in batch mode.
    Downloads: 0 This Week
  • 25

    MOGEN

    The software for modeling the 3D structure of a genome using Hi-C data

    The software builds 3D models of chromosomes or genomes from chromosomal contact data such as Hi-C, TCC, and 5C. MOGEN was tested on simulated datasets. Using real Hi-C data of a human cell line, MOGEN built models that are consistent with knowledge about the genome of the cell line. Details on how to use it are at https://github.com/BDM-Lab/MOGEN and it is published in Bioinformatics (http://bioinformatics.oxfordjournals.org/content/32/9/1286). Contact: Bioinformatics, Data Mining, Machine Learning (BDM) Laboratory, Jianlin Cheng, PhD, Department of Computer Science, University of Missouri, Columbia. Email: chengji@missouri.edu
    Downloads: 0 This Week