Showing 26 open source projects for "statistics"

  • 1
    Panda-Helper

    Panda-Helper: Data profiling utility for Pandas DataFrames and Series

    Panda-Helper is a simple data-profiling utility for Pandas DataFrames and Series. Assess data quality and usefulness with minimal effort. Quickly perform initial data exploration, so you can move on to more in-depth analysis.
    Downloads: 7 This Week
  • 2
    Datumaro

    Dataset Management Framework, a Python library and a CLI tool to build

    ...Datumaro makes it easy to merge datasets, split them into training/validation/test subsets, filter or transform annotations, and validate annotation quality — all while preserving metadata and supporting detailed statistics. It’s especially useful when you’re dealing with heterogeneous data sources or need to prepare complex datasets for machine learning workflows, freeing you from writing custom scripts for every format conversion.
    Downloads: 15 This Week
  • 3
    whylogs

    The open standard for data logging

    ...With whylogs, users can generate summaries of their datasets (called whylogs profiles), which they can use to track changes in their dataset, create data constraints to know whether their data looks the way it should, and quickly visualize key summary statistics about their datasets. whylogs profiles are the core of the whylogs library. They capture key statistical properties of data, such as the distribution (far beyond simple mean, median, and standard deviation measures), the number of missing values, and a wide range of configurable custom metrics. By capturing these summary statistics, we are able to accurately represent the data and enable all of the use cases described in the introduction.
    Downloads: 8 This Week
  • 4
    pandas

    Fast, flexible and powerful Python data analysis toolkit

    pandas is a Python data analysis library that provides high-performance, user-friendly data structures and data analysis tools for the Python programming language. It enables you to carry out entire data analysis workflows in Python without having to switch to a more domain-specific language. With pandas, performance, productivity and collaboration in doing data analysis in Python can significantly increase. pandas is continuously being developed to be a fundamental high-level building...
    Downloads: 144 This Week
  • 5
    Pandas Profiling

    Create HTML profiling reports from pandas DataFrame objects

    pandas-profiling generates profile reports from a pandas DataFrame. The pandas df.describe() function is handy yet a little basic for exploratory data analysis. pandas-profiling extends pandas DataFrame with df.profile_report(), which automatically generates a standardized univariate and multivariate report for data understanding. The report includes high-correlation warnings based on different correlation metrics (Spearman, Pearson, Kendall, Cramér’s V, Phik). Most common categories (uppercase, lowercase,...
    Downloads: 3 This Week
  • 6
    ydata-profiling

    Create HTML profiling reports from pandas DataFrame objects

    ydata-profiling's primary goal is to provide a one-line Exploratory Data Analysis (EDA) experience in a consistent and fast solution. Like the handy pandas df.describe() function, ydata-profiling delivers an extended analysis of a DataFrame while allowing the analysis to be exported in different formats such as HTML and JSON.
    Downloads: 12 This Week
  • 7
    SDGym

    Benchmarking synthetic data generation methods

    The Synthetic Data Gym (SDGym) is a benchmarking framework for modeling and generating synthetic data. Measure performance and memory usage across different synthetic data modeling techniques – classical statistics, deep learning and more! The SDGym library integrates with the Synthetic Data Vault ecosystem. You can use any of its synthesizers, datasets or metrics for benchmarking. You can also customize the process to include your own work. Select any of the publicly available datasets from the SDV project, or input your own data. Choose from any of the SDV synthesizers and baselines. ...
    Downloads: 8 This Week
  • 8
    LabPlot

    Data Visualization and Analysis

    LabPlot is a FREE, open source and cross-platform Data Visualization and Analysis software accessible to everyone.
    Downloads: 39 This Week
  • 9
    seaborn

    Statistical data visualization in Python

    Seaborn is a Python data visualization library based on matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics. Seaborn helps you explore and understand your data. Its plotting functions operate on dataframes and arrays containing whole datasets and internally perform the necessary semantic mapping and statistical aggregation to produce informative plots. Its dataset-oriented, declarative API lets you focus on what the different elements of...
    Downloads: 13 This Week
  • 10
    MCPower

    MCPower — simple Monte Carlo power analysis for complex models

    MCPower-GUI is a desktop application that provides a graphical interface for the MCPower Monte Carlo power analysis library. It guides users through the full workflow across three tabs: Model setup (formula input with live parsing, CSV data upload with auto-detected variable types, effect size sliders, and correlation editing), Analysis configuration (find power for a given sample size or find the minimum sample size for a target power, with multiple testing correction and scenario...
    Downloads: 22 This Week
  • 11
    Uranie

    Uranie is CEA's uncertainty analysis platform, based on ROOT

    Uranie is a sensitivity and uncertainty analysis platform based on the ROOT framework (http://root.cern.ch). It is developed at CEA, the French Atomic Energy Commission (http://www.cea.fr). It provides various tools for data analysis, sampling, statistical modeling, optimisation, sensitivity analysis, uncertainty analysis, running code on high-performance computers, etc. Thanks to ROOT, it is easily scriptable in CINT (C++-like syntax) and Python. It is...
    Downloads: 4 This Week
  • 12

    DataPrep

    Python-based data preprocessing tool

    DataPrep v0.2 is a Tkinter-based GUI application designed to assist users in data preprocessing, multicollinearity removal, and feature selection for a wide range of applications in cheminformatics, bioinformatics, data analysis, feature selection, molecular modeling, machine learning, and quantitative structure-property relationship (QSPR) studies. It includes functionality to load, process, and save datasets with support for different preprocessing & multicollinearity removal...
    Downloads: 0 This Week
  • 13

    FreeSEM

    Free and open-source desktop application designed for SEM

    ...It allows users to visually build models using a drag-and-drop interface to create path diagrams and analyze relationships between observed and latent variables. The software supports methods such as exploratory factor analysis, covariance-based SEM, partial least squares SEM, and meta-SEM, and it provides model fit statistics like CFI, TLI, RMSEA, SRMR, and chi-square to evaluate models. It also enables exporting analysis results and reports to formats like Word, Excel, CSV, and PDF, making it useful for academic research and data analysis workflows.
    Downloads: 1 This Week
  • 14
    PipeRider

    Code review for data in dbt

    PipeRider automatically compares your data to highlight the difference in impacted downstream dbt models so you can merge your Pull Requests with confidence. PipeRider can profile your dbt models and obtain information such as basic data composition, quantiles, histograms, text length, top categories, and more. PipeRider can integrate with dbt metrics and present the time-series data of metrics in the report. PipeRider generates a static HTML report each time it runs, which can be viewed...
    Downloads: 0 This Week
  • 15
    PyNanoLab

    Data analysis and visualization with matplotlib

    PyNanoLab contains a variety of tools for data analysis, statistics, curve fitting, and basic machine learning applications. Visualization in PyNanoLab is based on matplotlib. The setup tools are designed to control and set up all the details of the figure with a GUI.
    Downloads: 0 This Week
  • 16
    Data Science Notes

    Curated collection of data science learning materials

    Data Science Notes is a large, curated collection of data science learning materials, with explanations, code snippets, and structured notes across the typical end-to-end workflow. It spans foundational math and statistics through data wrangling, visualization, machine learning, and practical project organization. The content emphasizes hands-on understanding by pairing narrative notes with runnable examples, making it useful for both self-study and classroom settings. Because it aggregates topics in one place, learners can move linearly or jump into specific areas as needed during projects. ...
    Downloads: 0 This Week
  • 17
    Vaex

    Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python

    ...We start at 100GB. Vaex is a high-performance Python library for lazy Out-of-Core data frames (similar to Pandas), to visualize and explore big tabular datasets. It calculates statistics such as mean, sum, count, standard deviation, etc., on an N-dimensional grid for more than a billion (10^9) samples/rows per second. Visualization is done using histograms, density plots and 3D volume rendering, allowing interactive exploration of big data. Vaex uses memory mapping, a zero memory copy policy and lazy computations for best performance (no memory wasted). ...
    Downloads: 1 This Week
  • 18

    SCaVis

    Scientific Computation and Visualization Environment

    SCaVis is an environment for scientific computation, data analysis and data visualization for scientists, engineers and students. The program is fully multiplatform (100% Java) and integrated with Java and a number of scripting languages: Jython (Python), Groovy, JRuby, BeanShell. SCaVis can be used to plot functions and data in 2D and 3D, perform statistical tests, data mining, numeric computations, function minimization, linear algebra, solving systems of linear and differential...
    Downloads: 0 This Week
  • 19

    slycat

    Web-based data science analysis and visualization platform.

    This is Slycat - a web-based data science analysis and visualization platform, created at Sandia National Laboratories. The goal of the Slycat project is to develop processes, tools and techniques to support data science, particularly analysis of large, high-dimensional data.
    Downloads: 0 This Week
  • 20
    allink

    Software for data analysis, image processing, simulations, solver.

    Collection of utilities based on two basic classes: Matematica and VarData. Matematica: performs math operations on vectors and matrices for smoothing, interpolation, convolution, image processing... VarData: manipulates a structure of points connected by links. Addraw: OpenGL engine. ElPoly: analyzes mechanical properties of polymer and membrane-like structures. Addyn: performs molecular dynamics and Monte Carlo simulations and has a solver for 4th-order PDEs. Avvis: performs all the...
    Downloads: 0 This Week
  • 21

    Electrophysiology & circular stats tools

    Data analysis and circular statistics with OpenElectrophy and R

    Set of tools for basic analysis of electrophysiological data. The Python classes show how to call OpenElectrophy functions and save data. The R library applies circular statistics to spike phase data and saves the best von Mises fit and the Rayleigh statistics to disk. The wavelet coherence analysis is done in R by the package "sowas". Check the module R_coherence to see how we solved that problem. These packages may be useful for people who are starting to use OpenElectrophy and circular statistics in R. ...
    Downloads: 0 This Week
  • 22
    This project examines techniques to model three-dimensional rigid body motion using the geometric algebra of Dual Quaternions and how such models compare to more traditional models when used in underconstrained filtering applications.
    Downloads: 0 This Week
  • 23
    Data mines the voting record and other actions of Members of the UK Parliament. Extracts information from the parliament website and stores it in a database. Provides tools to analyse the information, producing statistics and tables about the MPs.
    Downloads: 0 This Week
  • 24
    PGAF provides a framework for tuned, user-specific genetic algorithms by handling I/O, UI, and parallelism. It is designed for optimizing functions that take a "very long time" to evaluate.
    Downloads: 0 This Week
  • 25
    Graph and analyze Folding@Home team statistics over time.
    Downloads: 0 This Week