pandas-profiling generates profile reports from a pandas DataFrame. The pandas df.describe() function is handy yet a little basic for exploratory data analysis. pandas-profiling extends pandas DataFrame with df.profile_report(), which automatically generates a standardized univariate and multivariate report for data understanding. High correlation warnings, based on different correlation metrics (Spearman, Pearson, Kendall, Cramér’s V, Phik). Most common categories (uppercase, lowercase, separator), scripts (Latin, Cyrillic) and blocks (ASCII, Cyrilic). File sizes, creation dates, dimensions, indication of truncated images and existance of EXIF metadata. Mostly global details about the dataset (number of records, number of variables, overall missigness and duplicates, memory footprint). Comprehensive and automatic list of potential data quality issues (high correlation, skewness, uniformity, zeros, missing values, constant values, between others).

Features

  • Type inference
  • Quantile statistics
  • Descriptive statistics
  • Most frequent and extreme values
  • Correlations
  • File and Image analysis

Project Samples

Project Activity

See All Activity >

License

MIT License

Follow Pandas Profiling

Pandas Profiling Web Site

Other Useful Business Software
Go from Code to Production URL in Seconds Icon
Go from Code to Production URL in Seconds

Cloud Run deploys apps in any language instantly. Scales to zero. Pay only when code runs.

Skip the Kubernetes configs. Cloud Run handles HTTPS, scaling, and infrastructure automatically. Two million requests free per month.
Try it free
Rate This Project
Login To Rate This Project

User Reviews

Be the first to post a review of Pandas Profiling!

Additional Project Details

Programming Language

Python

Related Categories

Python HTML XHTML, Python Machine Learning Software, Python Data Analytics Tool, Python Data Quality Tool, Python LLM Inference Tool

Registered

2022-07-29