Showing 40 open source projects for "general purpose data analysis"

  • 1
    AtomAI

    Deep and Machine Learning for Microscopy

    AtomAI is a PyTorch-based package for deep and machine-learning analysis of microscopy data that doesn't require any advanced knowledge of Python or machine learning. The intended audience is domain scientists with a basic understanding of how to use NumPy and Matplotlib. It was developed by Maxim Ziatdinov at Oak Ridge National Lab. The purpose of AtomAI is to provide an environment that bridges the instrument-specific libraries and general physical analysis by enabling the seamless deployment of machine learning algorithms including deep convolutional neural networks, invariant variational autoencoders, and decomposition/unmixing techniques for image and hyperspectral data analysis. ...
    Downloads: 1 This Week
  • 2
    ROOT

    Analyzing, storing and visualizing big data, scientifically

    ...ROOT provides a very efficient storage system for data models, which has been demonstrated to scale at the Large Hadron Collider experiments: exabytes of scientific data are written in the columnar ROOT format. ROOT comes with histogramming capabilities in an arbitrary number of dimensions, curve fitting, statistical modeling, and minimization, to allow the easy setup of a data analysis system that can query and process the data interactively or in batch mode, as well as a general parallel processing framework, RDataFrame, that can considerably speed up an analysis.
    Downloads: 13 This Week
  • 3
    NeuralNote

    Audio Plugin for Audio to MIDI transcription using deep learning

    NeuralNote is an open-source audio software tool designed to convert recorded audio into MIDI data using modern machine learning techniques. The software functions as an audio plugin that can be used inside digital audio workstations as well as a standalone application for music production and analysis. Its main purpose is to perform audio-to-MIDI transcription, allowing musicians to record a performance and automatically transform it into editable MIDI notes. ...
    Downloads: 107 This Week
  • 4
    The Julia Programming Language

    High-level, high-performance dynamic language for technical computing

    Julia is a fast, open-source, high-performance dynamic language for technical computing. It can be used for data visualization and plotting, deep learning, machine learning, scientific computing, parallel computing and so much more. With its high-level syntax, Julia is easy to use for programmers of every level and background. Julia has more than 2,800 community-registered packages including various mathematical libraries, data manipulation tools, and packages for general-purpose computing. ...
    Downloads: 5 This Week
  • 5
    .NET for Apache Spark

    A free, open-source, and cross-platform big data analytics framework

    .NET for Apache Spark provides high-performance APIs for using Apache Spark from C# and F#. With these .NET APIs, you can access the most popular DataFrame and Spark SQL aspects of Apache Spark, for working with structured data, and Spark Structured Streaming, for working with streaming data. .NET for Apache Spark is compliant with .NET Standard - a formal specification of .NET APIs that are common across .NET implementations. This means you can use .NET for Apache Spark anywhere you write...
    Downloads: 2 This Week
  • 6
    UMAP

    Uniform Manifold Approximation and Projection

    ...This includes very high dimensional sparse datasets. UMAP has successfully been used directly on data with over a million dimensions. Second, UMAP scales well in the embedding dimension—it isn't just for visualization. You can use UMAP as a general-purpose dimension reduction technique as a preliminary step to other machine learning tasks.
    Downloads: 0 This Week
  • 7
    Kodezi Chronos

    Kodezi Chronos is a debugging-first language model

    Kodezi Chronos is a research project focused on developing a specialized language model designed specifically for debugging software and understanding large code repositories. Unlike general-purpose language models that focus primarily on code generation, Chronos is built to diagnose and repair bugs by analyzing complex relationships across files within a codebase. The project introduces architectural techniques such as Adaptive Graph-Guided Retrieval, which allows the system to navigate...
    Downloads: 3 This Week
  • 8
    RecBole

    A unified, comprehensive and efficient recommendation library

    We design general and extensible data structures to unify the formatting and usage of various recommendation datasets. We implement more than 100 commonly used recommendation algorithms and provide formatted copies of 28 recommendation datasets. We support a series of widely adopted evaluation protocols or settings for testing and comparing recommendation algorithms.
    Downloads: 0 This Week
  • 9
    Netflix Maestro

    Netflix’s Workflow Orchestrator

    Maestro is a large-scale workflow orchestration platform originally developed by Netflix to coordinate complex data processing and machine learning workflows across distributed systems. The system acts as a general-purpose workflow orchestrator that manages the execution, scheduling, monitoring, and recovery of large pipelines used for analytics and AI operations. It was designed to support the demanding internal infrastructure of Netflix, where thousands of workflows must process massive volumes of data reliably and efficiently every day. ...
    Downloads: 0 This Week
  • 10
    mlr3

    mlr3: Machine Learning in R - next generation

    mlr3 is a modern, object-oriented R framework for machine learning. It provides core abstractions (tasks, learners, resamplings, measures, pipelines) implemented using R6 classes, enabling extensible, composable machine learning workflows. It focuses on clean design, scalability (large datasets), and integration into the wider R ecosystem via extension packages. Users can do classification, regression, survival analysis, clustering, hyperparameter tuning, benchmarking etc., often via...
    Downloads: 0 This Week
  • 11
    Gen.jl

    A general-purpose probabilistic programming system

    An open-source stack for generative modeling and probabilistic inference. Gen’s inference library gives users building blocks for writing efficient probabilistic inference algorithms that are tailored to their models, while automating the tricky math and the low-level implementation details. Gen helps users write hybrid algorithms that combine neural networks, variational inference, sequential Monte Carlo samplers, and Markov chain Monte Carlo. Gen features an easy-to-use modeling language...
    Downloads: 0 This Week
  • 12
    Uranie

    Uranie is CEA's uncertainty analysis platform, based on ROOT

    Uranie is a sensitivity and uncertainty analysis platform based on the ROOT framework (http://root.cern.ch). It is developed at CEA, the French Atomic Energy Commission (http://www.cea.fr). It provides various tools for data analysis, sampling, statistical modeling, optimisation, sensitivity analysis, uncertainty analysis, running code on high-performance computers, etc. Thanks to ROOT, it is easily scriptable in CINT (C++-like syntax) and Python. It is...
    Downloads: 10 This Week
  • 13
    Eventer

    Rapid, unbiased, reproducible analysis of synaptic events

    Eventer is a programme designed for the detection of spontaneous synaptic events measured by electrophysiology or imaging. The software combines deconvolution for detection, and variable length template matching approaches for screening out false positive events. Eventer also includes a machine learning-based approach allowing users to train a model to implement their ‘expert’ selection criteria across data sets without bias. Sharing models allows users to implement consistent analysis...
    Downloads: 0 This Week
  • 14
    Ubix Linux

    The Pocket Datalab

    Ubix stands for Universal Business Intelligence Computing System. Ubix Linux is an open-source, Debian-based Linux distribution geared towards data acquisition, transformation, analysis and presentation. Ubix Linux's purpose is to offer a tiny but versatile datalab. Ubix Linux is easily accessible, resource-efficient and completely portable on a simple USB key. Ubix Linux is a perfect toolset for learning data analysis and artificial intelligence basics on small to medium datasets. ...
    Downloads: 3 This Week
  • 15

    Faum

    Fast Autonomous Unsupervised Multidimensional Classification

    This is the proof-of-concept implementation of the FAUM clustering method. This implementation was used to produce the published results and is now released in the hope that it will be useful.
    Downloads: 0 This Week
  • 16

    EZStacking

    EZStacking is a Jupyter notebook generator for machine learning

    EZStacking is a Jupyter notebook generator for supervised learning problems using Scikit-Learn pipelines and stacked generalization. EZStacking handles classification and regression problems for structured data. It can also be viewed as a development tool, because a notebook generated with EZStacking contains: - an exploratory data analysis (EDA) used to assess data quality - a modelling step producing a reduced-size stacked estimator - a server returning a prediction, a measure of the quality...
    Downloads: 0 This Week
  • 17
    surpriver

    Find big moving stocks before they move using machine learning

    surpriver is a machine learning project designed to identify unusual stock market activity that may precede large price movements. The system analyzes historical stock price and volume data to detect anomalies that could indicate potential trading opportunities. By applying machine learning techniques to market indicators, the tool attempts to identify patterns in trading behavior that deviate significantly from normal market activity. These anomalies are interpreted as signals that a stock...
    Downloads: 0 This Week
  • 18
    Easy Machine Learning

    Easy Machine Learning is a general-purpose dataflow-based system

    ...Our platform Easy Machine Learning presents a general-purpose dataflow-based system for easing the process of applying machine learning algorithms to real-world tasks. In the system, a learning task is formulated as a directed acyclic graph (DAG) in which each node represents an operation (e.g. a machine learning algorithm), and each edge represents the flow of the data from one node to its descendants.
    Downloads: 0 This Week
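
The DAG formulation described for Easy Machine Learning can be illustrated with a minimal Python sketch. This is not the project's code or API: the operations (`load_data`, `normalize`, `mean`, `count`, `report`), the `TASK` table, and the `run_dag` helper are hypothetical names chosen for illustration, and the standard library's `graphlib` module (Python 3.9+) is assumed for ordering the nodes.

```python
from graphlib import TopologicalSorter


# Hypothetical operations; in a real system each node could be a full
# machine learning algorithm rather than a one-line function.
def load_data():
    return [3.0, 1.0, 2.0]


def normalize(values):
    largest = max(values)
    return [v / largest for v in values]


def mean(values):
    return sum(values) / len(values)


def count(values):
    return len(values)


def report(average, n):
    return f"{n} values, normalized mean {average:.2f}"


# Each node maps to (operation, upstream nodes). An edge from an upstream
# node to this node carries that node's output as an input argument.
TASK = {
    "load": (load_data, []),
    "normalize": (normalize, ["load"]),
    "mean": (mean, ["normalize"]),
    "count": (count, ["load"]),
    "report": (report, ["mean", "count"]),
}


def run_dag(task):
    """Run every node once, after all of its upstream nodes have run."""
    graph = {name: set(deps) for name, (_, deps) in task.items()}
    results = {}
    # static_order() yields predecessors first and raises CycleError
    # if the graph is not acyclic.
    for name in TopologicalSorter(graph).static_order():
        operation, deps = task[name]
        results[name] = operation(*(results[d] for d in deps))
    return results


if __name__ == "__main__":
    print(run_dag(TASK)["report"])
```

Keeping the graph as plain data separates the task description from its execution, so the same table could in principle be handed to a different scheduler without changing the operations.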
  • 19
    DSTK - DataScience ToolKit

    DSTK - DataScience ToolKit for All of Us

    DSTK - DataScience ToolKit is free, open-source software for statistical analysis, data visualization, text analysis, and predictive analytics. A newer version with a smaller file size can be found at: https://sourceforge.net/projects/dstk3/ It is designed to be straightforward and easy to use, and familiar to SPSS users. While JASP offers more statistical features, DSTK tends to be a broad solution workbench, including text analysis and predictive analytics features. Of course you may specify...
    Downloads: 0 This Week
  • 20

    SPAWNN

    SPatial Analysis With self-organizing Neural Networks

    The SPAWNN toolkit is an innovative toolkit for spatial analysis with self-organizing neural networks which is particularly useful for spatial analysis, visualization and geographical data mining. To run the toolkit, simply download and execute (double-click) the JAR file. Please cite: - Hagenauer, J., & Helbich, M. (2016). SPAWNN: A Toolkit for SPatial Analysis With Self-Organizing Neural Networks. Transactions in GIS, 20(5), 755-775. Other related publications: - Hagenauer, J....
    Downloads: 0 This Week
  • 21
    All future developments will be implemented in the new MATLAB toolbox SciXMiner; please visit https://sourceforge.net/projects/scixminer/ to download the newest version. The former MATLAB toolbox Gait-CAD was designed for the visualization and analysis of time series and features, with a special focus on data mining problems including classification, regression, and clustering.
    Downloads: 0 This Week
  • 22
    Mass-based dissimilarity

    A data dependent dissimilarity measure based on mass estimation.

    This software calculates the mass-based dissimilarity matrix for data mining algorithms relying on a distance measure. References: Overcoming Key Weaknesses of Distance-based Neighbourhood Methods using a Data Dependent Dissimilarity Measure. KDD 2016 http://dx.doi.org/10.1145/2939672.2939779 The source code, presentation slides and poster are attached under "Files". The presentation video from KDD 2016 is published at https://youtu.be/eotD_-SuEoo. Since this software is licensed...
    Downloads: 0 This Week
  • 23

    LAML: Linear Algebra and Machine Learning

    A stand-alone Java library for linear algebra and machine learning

    LAML is a stand-alone pure Java library for linear algebra and machine learning. The goal is to build efficient and easy-to-use linear algebra and machine learning libraries. Linear algebra and machine learning are built together because full control of the basic data structures for matrices and vectors is required to achieve fast implementations of machine learning methods. Additionally, LAML provides many commonly used matrix functions with the same signatures as in MATLAB, thus...
    Downloads: 0 This Week
  • 24
    BAIO

    Bioinformatics Artificial Intelligence Order

    A smart AI interface that will interrogate and complete your bioinformatics data analysis for you. Download and start your instance of BAIO to join the network of great bioinformatics minds.
    Downloads: 0 This Week
  • 25

    Chordalysis

    Log-linear analysis (data modelling) for high-dimensional data

    Project moved to https://github.com/fpetitjean/Chordalysis. Log-linear analysis is the statistical method used to capture multi-way relationships between variables. However, due to its exponential nature, previous approaches did not allow scale-up to more than a dozen variables. We present here Chordalysis, a log-linear analysis method for big data. Chordalysis exploits recent discoveries in graph theory by representing complex models as compositions of triangular structures,...
    Downloads: 0 This Week