statistical free download

Showing 129 open source projects for "statistical"

Filter: Python

1

statsmodels

Statsmodels, statistical modeling and econometrics in Python

statsmodels is a Python module that provides classes and functions for the estimation of many different statistical models, as well as for conducting statistical tests and statistical data exploration. An extensive list of result statistics is available for each estimator. The results are tested against existing statistical packages to ensure that they are correct. The package is released under the open source Modified BSD (3-clause) license.

Downloads: 8 This Week

Last Update: 2025-12-05
See Project
2

Book5_Essentials-Probability-Statistics

The book 5 of statistics in simplicity

Book5_Essentials-of-Probability-and-Statistics is a Visualize-ML educational volume that introduces the statistical and probabilistic concepts underpinning modern data analysis and machine learning. The repository explains topics such as distributions, sampling, inference, and uncertainty using visual demonstrations and intuitive narratives. Its teaching philosophy prioritizes conceptual clarity over heavy formalism, making statistical thinking more approachable for beginners. ...

Downloads: 0 This Week

Last Update: 2026-02-24
See Project
3

pmdarima

Statistical library designed to fill the void in Python's time series

A statistical library designed to fill the void in Python's time series analysis capabilities, including the equivalent of R's auto.arima function.

Downloads: 0 This Week

Last Update: 2025-11-17
See Project
4

Synthetic Data Generator

SDG is a specialized framework

...The system supports multiple generation methods including statistical models, generative adversarial networks, and large language model–based synthesis. It also includes a data processing module capable of handling different data types, preprocessing columns, managing missing values, and converting formats automatically before model training.

Downloads: 11 This Week

Last Update: 2026-03-06
See Project
5

spaCy

Industrial-strength Natural Language Processing (NLP)

spaCy is a library built on the very latest research for advanced Natural Language Processing (NLP) in Python and Cython. Since its inception it has been designed for real-world applications: building real products and gathering real insights. It comes with pretrained statistical models and word vectors, convolutional neural network models, easy deep learning integration and much more. spaCy is the fastest syntactic parser in the world according to independent benchmarks, with an accuracy within 1% of the best available. It's blazing fast, easy to install and comes with a simple and productive API.

Downloads: 74 This Week

Last Update: 2026-03-29
See Project
6

StatsForecast

Fast forecasting with statistical and econometric models

StatsForecast is a Python library for time-series forecasting that delivers a suite of classical statistical and econometric forecasting models optimized for high performance and scalability. It is designed not just for academic experiments but for production-level time-series forecasting, meaning it handles forecasting for many series at once, efficiently, reliably, and with minimal overhead. The library implements a broad set of models, including AutoARIMA, ETS, CES, Theta, plus a battery of benchmarking and baseline methods, giving users flexibility in selecting forecasting approaches depending on data characteristics (trend, seasonality, intermittent demand, etc.). ...

Downloads: 6 This Week

Last Update: 2025-11-26
See Project
7

PaperBanana

Extension of Google Research’s PaperBanana

PaperBanana is an open-source agentic framework designed to automatically generate publication-quality academic diagrams and statistical plots directly from text descriptions. The project focuses on helping researchers, educators, and data scientists transform conceptual descriptions of figures into structured visual outputs suitable for research papers, presentations, and technical reports. Instead of manually designing charts or diagrams using traditional visualization tools, users can describe the desired figure in natural language and allow the system to generate the visual representation automatically. ...

Downloads: 4 This Week

Last Update: 2026-03-09
See Project
8

NVIDIA Earth2Studio

Open-source deep-learning framework

...The toolkit makes it easy to run deterministic and ensemble forecasts, swap models interchangeably, and process large geophysical datasets with Xarray structures, enabling experimentation with state-of-the-art deep learning models for climate and atmospheric prediction. Users can extend Earth2Studio with optional model packs, advanced data interfaces, statistical operators, and backend integrations that support flexible workflows from simple tests to large-scale operational inference.

Downloads: 3 This Week

Last Update: 2026-03-23
See Project
9

Natural Language Toolkit

NLTK Source

The Natural Language Toolkit (NLTK) is a widely used open-source Python library designed for working with human language data and building natural language processing (NLP) applications. It provides a comprehensive suite of modules, datasets, and tutorials that support both symbolic and statistical approaches to language processing. The toolkit includes implementations of many foundational NLP algorithms and utilities, enabling developers to perform tasks such as tokenization, stemming, parsing, classification, and semantic reasoning. NLTK was originally developed to support research and teaching in computational linguistics and artificial intelligence, and it has become one of the most influential educational platforms for learning NLP in Python. ...

Downloads: 0 This Week

Last Update: 2026-03-24
See Project
10

plotly.py

The interactive graphing library for Python

plotly.py is a browser-based, open source graphing library for Python that lets you create beautiful, interactive, publication-quality graphs. Built on top of plotly.js, it is a high-level, declarative charting library that ships with more than 30 chart types. Everything from statistical charts and scientific charts, through to maps, 3D graphs and animations, plotly.py lets you create them all. Graphs made with plotly.py can be viewed in Jupyter notebooks, standalone HTML files, or hosted online using Chart Studio Cloud.

Downloads: 11 This Week

Last Update: 2 days ago
See Project
11

TensorFlow Probability

Probabilistic reasoning and statistical analysis in TensorFlow

TensorFlow Probability is a library for probabilistic reasoning and statistical analysis. TensorFlow Probability (TFP) is a Python library built on TensorFlow that makes it easy to combine probabilistic models and deep learning on modern hardware (TPU, GPU). It's for data scientists, statisticians, ML researchers, and practitioners who want to encode domain knowledge to understand data and make predictions.

Downloads: 0 This Week

Last Update: 2024-11-08
See Project
12

Orange Data Mining

Orange: Interactive data analysis

Open source machine learning and data visualization. Build data analysis workflows visually, with a large, diverse toolbox. Perform simple data analysis with clever data visualization. Explore statistical distributions, box plots and scatter plots, or dive deeper with decision trees, hierarchical clustering, heatmaps, MDS and linear projections. Even your multidimensional data can become sensible in 2D, especially with clever attribute ranking and selections. Interactive data exploration for rapid qualitative analysis with clean visualizations. ...

Downloads: 48 This Week

Last Update: 2025-12-20
See Project
13

PyMC

Bayesian Modeling and Probabilistic Programming in Python

PyMC is a Python library for probabilistic programming focused on Bayesian statistical modeling and machine learning. Built on top of computational tools like Aesara and NumPy, PyMC allows users to define models using intuitive syntax and perform inference using MCMC, variational inference, and other advanced algorithms. It’s widely used in scientific research, data science, and decision modeling.

Downloads: 2 This Week

Last Update: 4 days ago
See Project
14

Potpie

Create custom engineering agents for your codebase

Potpie is an AI-powered data analysis tool that automates the exploration and visualization of datasets, assisting users in uncovering insights without extensive coding.

Downloads: 2 This Week

Last Update: 2026-02-23
See Project
15

NBA Sports Betting Machine Learning

NBA sports betting using machine learning

...Machine learning models are then trained to estimate the probability that a team will win a game as well as whether the total score will fall above or below the sportsbook’s predicted total. In addition to predicting outcomes, the project evaluates expected value to determine whether a potential bet offers a statistical advantage compared with sportsbook odds.
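The expected-value check described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's own code: the function names are hypothetical, the odds are assumed to be in American (moneyline) format, and the win probability is assumed to come from some already-trained model.

```python
def american_to_decimal(odds: int) -> float:
    """Convert American (moneyline) odds to decimal odds (total payout per unit staked)."""
    if odds == 0:
        raise ValueError("American odds cannot be zero")
    if odds > 0:
        return 1.0 + odds / 100.0
    return 1.0 + 100.0 / abs(odds)


def implied_probability(odds: int) -> float:
    """Break-even win probability implied by the sportsbook's odds."""
    return 1.0 / american_to_decimal(odds)


def expected_value(win_prob: float, odds: int, stake: float = 1.0) -> float:
    """Expected profit of a bet, given the model's estimated win probability."""
    if not 0.0 <= win_prob <= 1.0:
        raise ValueError("win_prob must be between 0 and 1")
    profit_if_win = stake * (american_to_decimal(odds) - 1.0)
    return win_prob * profit_if_win - (1.0 - win_prob) * stake


# Hypothetical inputs: the model estimates a 55% win chance at -110 odds.
ev = expected_value(0.55, -110)
has_edge = ev > 0  # a positive expected value means the model sees an advantage
```

A bet only shows a statistical advantage when the model's probability exceeds the implied probability of the odds; at -110 that break-even point is roughly 52.4%.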

Downloads: 16 This Week

Last Update: 2026-03-06
See Project
16

WeasyPrint

The awesome document factory

WeasyPrint is a smart solution helping people to create PDF documents. You can generate gorgeous statistical reports, invoices, tickets, and anything you want as long as you have some web design skills! Design your documents just as you design your websites! WeasyPrint follows the widely used HTML and CSS specifications from the W3C. You can use your usual web tools, languages and frameworks, but for print. Creating high-quality digital documents requires features that you love to use as a reader: tables of contents, links, annotations, optimized images, and attachments. WeasyPrint provides many features out of the box, and even gives you the possibility to add your own ways to customize your PDF files. ...

Downloads: 24 This Week

Last Update: 2026-02-06
See Project
17

Edit Banana

Edit Banana: A framework for converting statistical figures

Edit Banana is an innovative web application designed to simplify image editing by merging intuitive user interfaces with powerful generative AI capabilities, enabling users to quickly enhance, manipulate, or transform photos without needing advanced design skills. It provides a smooth, browser-based experience where users can upload images, make precise edits such as background removal or inpainting, and apply stylistic transformations or corrections through AI prompts. The tool focuses on...

Downloads: 10 This Week

Last Update: 4 days ago
See Project
18

AutoResearchClaw

Autonomous research from idea to paper. Chat an Idea. Get a Paper 🦞

...The system retrieves real academic references from sources such as arXiv and Semantic Scholar to ensure credible citations. It can automatically generate code for experiments, run them in a sandbox environment, and analyze the results with statistical methods. The platform also uses multi-agent debate and automated peer review processes to refine research findings and improve paper quality. By combining literature discovery, experimentation, and writing automation, AutoResearchClaw aims to turn research ideas into conference-ready papers with minimal human intervention.

Downloads: 31 This Week

Last Update: 2026-04-01
See Project
19

CodeChecker

CodeChecker is an analyzer tooling, defect database

CodeChecker is a static analysis infrastructure built on the LLVM/Clang Static Analyzer toolchain, replacing scan-build in a Linux or macOS (OS X) development environment. Executes Clang-Tidy and Clang Static Analyzer with Cross-Translation Unit analysis and Statistical Analysis (when checkers are available). Creates the JSON compilation database by wiretapping any build process (e.g., CodeChecker log -b "make"). Automatically analyzes GCC cross-compiled projects: detecting GCC or Clang compiler configuration and forming the corresponding clang analyzer invocations. Incremental analysis: Only the changed files and their dependencies need to be reanalyzed. ...

Downloads: 8 This Week

Last Update: 2026-02-12
See Project
20

MiniSom

MiniSom is a minimalistic implementation of the Self Organizing Maps

MiniSom is a minimalistic and NumPy-based implementation of the Self Organizing Maps (SOM). SOM is a type of Artificial Neural Network able to convert complex, nonlinear statistical relationships between high-dimensional data items into simple geometric relationships on a low-dimensional display. MiniSom is designed to allow researchers to easily build on top of it and to give students the ability to quickly grasp its details. The project initially aimed for a minimalistic implementation of the Self-Organizing Map (SOM) algorithm, focusing on simplicity in features, dependencies, and code style. ...
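To make the SOM idea concrete, here is a rough standard-library sketch of a single training step on a small grid. It is not MiniSom's API or implementation: the function names, the Gaussian neighborhood, and the default learning rate and sigma are all assumptions chosen for illustration.

```python
import math


def best_matching_unit(weights, sample):
    """Return the (row, col) of the grid node whose weight vector is closest to the sample."""
    best, best_dist = None, math.inf
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = math.dist(w, sample)
            if d < best_dist:
                best, best_dist = (r, c), d
    return best


def som_update(weights, sample, learning_rate=0.5, sigma=1.0):
    """Apply one SOM step in place and return the best matching unit.

    Every node is pulled toward the sample; the pull is scaled by a Gaussian
    of the node's grid distance to the best matching unit, so nearby nodes
    on the map move more than distant ones.
    """
    bmu = best_matching_unit(weights, sample)
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            grid_d2 = (r - bmu[0]) ** 2 + (c - bmu[1]) ** 2
            h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
            row[c] = [wi + learning_rate * h * (xi - wi) for wi, xi in zip(w, sample)]
    return bmu


# A 2x2 map of 2-dimensional weight vectors and one illustrative sample.
grid = [[[0.0, 0.0], [1.0, 0.0]],
        [[0.0, 1.0], [1.0, 1.0]]]
winner = som_update(grid, [0.9, 0.1])
```

Repeating this step over many samples, while shrinking the learning rate and sigma, is what lets neighboring nodes on the grid end up representing similar regions of the input space.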

Downloads: 3 This Week

Last Update: 2026-01-14
See Project
21

DataProfiler

Extract schema, statistics and entities from datasets

DataProfiler is an AI-powered tool for automatic data analysis and profiling, designed to detect patterns, anomalies, and schema inconsistencies in structured and unstructured datasets. The DataProfiler is a Python library designed to make data analysis, monitoring, and sensitive data detection easy. Loading data: with a single command, the library automatically formats and loads files into a DataFrame. Profiling the data: the library identifies the schema, statistics, entities (PII / NPI), and...

Downloads: 4 This Week

Last Update: 2025-07-30
See Project
22

ydata-profiling

Create HTML profiling reports from pandas DataFrame objects

ydata-profiling's primary goal is to provide a one-line Exploratory Data Analysis (EDA) experience in a consistent and fast solution. Like the handy pandas df.describe() function, ydata-profiling delivers an extended analysis of a DataFrame while allowing the data analysis to be exported in different formats such as HTML and JSON.

Downloads: 5 This Week

Last Update: 2026-01-13
See Project
23

Copulas

A library to model multivariate data using copulas

Copulas is a Python library for modeling multivariate distributions and sampling from them using copula functions. Given a table of numerical data, use Copulas to learn the distribution and generate new synthetic data following the same statistical properties. Choose from a variety of univariate distributions and copulas, including Archimedean Copulas, Gaussian Copulas and Vine Copulas. Compare real and synthetic data visually after building your model. Visualizations are available as 1D histograms, 2D scatterplots and 3D scatterplots. Access & manipulate learned parameters. ...

Downloads: 5 This Week

Last Update: 2026-02-05
See Project
24

whylogs

The open standard for data logging

...With whylogs, users are able to generate summaries of their datasets (called whylogs profiles), which they can use to track changes in their dataset, create data constraints to know whether their data looks the way it should, and quickly visualize key summary statistics about their datasets. whylogs profiles are the core of the whylogs library. They capture key statistical properties of data, such as the distribution (far beyond simple mean, median, and standard deviation measures), the number of missing values, and a wide range of configurable custom metrics. By capturing these summary statistics, we are able to accurately represent the data and enable all of the use cases described in the introduction.
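The idea of a profile, a compact summary of a column rather than the data itself, can be illustrated with the standard library alone. The helper below is hypothetical and much simpler than what the description attributes to whylogs; it is not the whylogs API and covers only a count, a missing-value count, and a few basic statistics.

```python
import math
import statistics


def profile_column(values):
    """Summarize one numeric column; None and NaN are counted as missing."""
    present = [
        v for v in values
        if v is not None and not (isinstance(v, float) and math.isnan(v))
    ]
    summary = {"count": len(values), "missing": len(values) - len(present)}
    if present:
        summary["min"] = min(present)
        summary["max"] = max(present)
        summary["mean"] = statistics.fmean(present)
        summary["median"] = statistics.median(present)
        # Sample standard deviation needs at least two observations.
        summary["stdev"] = statistics.stdev(present) if len(present) > 1 else 0.0
    return summary


# Illustrative data only: two of the five entries are missing.
profile = profile_column([1.0, 2.0, None, 4.0, float("nan")])
```

Comparing two such summaries taken at different times is one simple way to notice that a dataset has changed, for example when the missing count or the mean shifts.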

Downloads: 5 This Week

Last Update: 2024-12-03
See Project
25

NeuralForecast

Scalable and user friendly neural forecasting algorithms.

...Unfortunately, available implementations and published research are yet to realize neural networks' potential. They are hard to use and continuously fail to improve over statistical methods while being computationally prohibitive. For this reason, we created NeuralForecast, a library favoring proven accurate and efficient models focusing on their usability.

Downloads: 7 This Week

Last Update: 2 days ago
See Project