Showing 120 open source projects for "data analysis"

  • 1
    Spark Python Notebooks

    Apache Spark & Python (pySpark) tutorials for Big Data Analysis

    Spark Python Notebooks is a curated collection of example Jupyter notebooks designed to help developers and data engineers learn Apache Spark using Python in an interactive environment. Rather than only providing static code files, this project uses notebooks to teach practical data processing workflows, exposing users to real Spark programming patterns like working with RDDs, DataFrames, and distributed computations. These notebooks often demonstrate how to transform, analyze, and visualize...
    Downloads: 0 This Week
  • 2
    Mass-based dissimilarity

    A data-dependent dissimilarity measure based on mass estimation.

    This software calculates the mass-based dissimilarity matrix for data mining algorithms that rely on a distance measure. Reference: "Overcoming Key Weaknesses of Distance-based Neighbourhood Methods using a Data Dependent Dissimilarity Measure", KDD 2016, http://dx.doi.org/10.1145/2939672.2939779. The source code, presentation slides, and poster are attached under "Files". The KDD 2016 presentation video is available at https://youtu.be/eotD_-SuEoo. Since this software is licensed...
    Downloads: 0 This Week
  • 3

    Accelerated Feature Extraction Tool

    A fast GPU-accelerated feature extraction tool for speech analysis

    A fast feature extraction tool for speech analysis and processing. It incorporates standard MFCC, PLP, and TRAPS features. The tool is specially designed to process very large audio data sets. It uses GPU acceleration if a compatible GPU is available (CUDA as well as OpenCL; NVIDIA, AMD, and Intel GPUs are supported). The CPU SSE intrinsic instruction set is used when no compatible GPU is present.
    Downloads: 0 This Week
  • 4

    Chordalysis

    Log-linear analysis (data modelling) for high-dimensional data

    Project moved to https://github.com/fpetitjean/Chordalysis. Log-linear analysis is the statistical method used to capture multi-way relationships between variables. However, due to its exponential nature, previous approaches did not scale beyond a dozen or so variables. We present here Chordalysis, a log-linear analysis method for big data. Chordalysis exploits recent discoveries in graph theory by representing complex models as compositions of triangular structures, also known as chordal graphs. ...
    Downloads: 0 This Week
  • 5
    MODLEM

    Rule-based, WEKA-compatible machine learning algorithm

    This project is a WEKA (Waikato Environment for Knowledge Analysis) compatible implementation of MODLEM, a machine learning algorithm that induces a minimum set of rules. These rules can be used as a classifier. It is a sequential covering algorithm that was invented to cope with numeric data without discretization. In fact, nominal and numeric attributes are treated in the same way: the attribute space is searched to find the best rule condition during rule induction. ...
    Downloads: 16 This Week
  • 6
    neural network designer

    A DBMS for neural nets. Chatbots, DTrees, random forests, n-grams,...

    ...Do natural language processing, image or data analysis & interpretation,...
    Downloads: 0 This Week
  • 7
    Matlab Community Detection Toolbox

    CDTB is a MATLAB toolbox which performs Community Detection

    We present the Community Detection Toolbox (CDTB), a MATLAB toolbox which can be used to perform community detection. The CDTB contains several functions from the following categories: 1. graph generators; 2. clustering algorithms; 3. cluster number selection functions; 4. clustering evaluation functions. Furthermore, CDTB is designed in a parametric manner so that users can add their own functions and extensions. The CDTB can be used in at least three ways. The user can employ...
    Downloads: 0 This Week
  • 8
    DocCO

    Non-disjoint grouping of documents based on a word sequence approach

    This is a GUI for learning non-disjoint groups of documents, based on the Weka machine learning framework. It supports non-disjoint clustering of documents using both vectorial and sequential representations (a word sequence approach based on the WSK kernel). All data formats supported by WEKA can be used in DocCO. Data can be loaded from files, from databases, or from a specified URL. All the preprocessing techniques implemented in WEKA can be applied before learning.
    Downloads: 0 This Week
  • 9
    feed4weka is an open library that enriches weka (http://www.cs.waikato.ac.nz/ml/weka/), an open source project for data analysis. It integrates new classification and clustering algorithms, and adds the coclustering and outlier detection frameworks
    Downloads: 0 This Week
  • 10
    This is a Matlab software package for single molecule FRET data analysis.
    Downloads: 1 This Week
  • 11

    AdPreqFr4SL

    Adaptive Prequential Learning Framework

    The AdPreqFr4SL learning framework for Bayesian Network Classifiers is designed to handle the cost / performance trade-off and cope with concept drift. Our strategy for incorporating new data is based on bias management and gradual adaptation. Starting with the simple Naive Bayes, we scale up the complexity by gradually updating attributes and structure. Since updating the structure is a costly task, we use new data to primarily adapt the parameters and only if this is really necessary, do we...
    Downloads: 0 This Week
  • 12
    CRFSharp

    CRFSharp is a .NET (C#) implementation of Conditional Random Fields

    CRFSharp (aka CRF#) is a .NET (C#) implementation of Conditional Random Fields, a machine learning algorithm for learning from labeled sequences of examples. It is widely used in Natural Language Processing (NLP) tasks, for example: word breaking, POS tagging, named entity recognition, query chunking and so on. CRF#'s main algorithm is the same as that of CRF++, written by Taku Kudo. It encodes model parameters using L-BFGS. Moreover, it has many significant improvements over CRF++, such as totally...
    Downloads: 0 This Week
  • 13
    RapidMiner Feature Selection Extension
    This RapidMiner plugin consists of operators for feature selection and classification, mainly on high-dimensional (microarray) data, plus some helper classes and operators.
    Downloads: 0 This Week
  • 14
    BIL++
    BIL++ is a set of standalone C++ packages for data processing in Bioinformatics (Graph mining, Bayesian networks, Genetic algorithm, Discretization, Gene expression data analysis, Hypothesis testing).
    Downloads: 0 This Week
  • 15
    A Machine Learning and Data Retrieval Framework
    Downloads: 0 This Week
  • 16
    Content Addressable Memory, Multi-Variate Statistics, Data Mining. Includes analyzing datasets, extracting patterns, and creating an empirical expert system. Computes joint probabilities and implements a "belief" as the solution of an equilibrium equation.
    Downloads: 0 This Week
  • 17
    Program for performing the complete cycle of neural network analysis: preparing data, choosing a neural network (CasCor, MP, LogRegression, PNN), training the network, monitoring the learning state, ROC analysis, and optimizing network parameters using a GA.
    Downloads: 0 This Week
  • 18
    A Python function library to extract EEG features from EEG time series using standard Python and NumPy data structures. Features include classical spectral analysis, entropies, fractal dimensions, DFA, inter-channel synchrony and order, etc.
    Downloads: 0 This Week
  • 19
    Blunder is an automated tool for analyzing chained exceptions in Java. It is useful for classifying exceptions, generating customized error messages, and listing possible solutions.
    Downloads: 0 This Week
  • 20

    Cinefile

    A category-based approach to exploring film data.

    ...It allows the user to identify abstract categories of films by providing examples of category members, learns to classify films as belonging or not belonging to those categories, and provides a graphical interface for exploring and comparing categories. Cinefile is designed to work with data retrieved from the Internet Movie Database (imdb.com). This data is used for classification and is the subject of the category-based analysis. Cinefile was developed by the University of Mary Washington's Computer Science department (http://cas.umw.edu/computerscience).
    Downloads: 0 This Week