Benchmarking synthetic data generation methods
Create HTML profiling reports from pandas DataFrame objects
The open standard for data logging
Open-source DORA metrics platform for engineering teams
A Julia package for data clustering
Video stabilization using gyroscope data
Curated list of classic, high-quality computer science books
Scalable master data management and identity resolution
Library to encode and decode images in WebP format
A high-quality tool for converting PDF to Markdown and JSON
Panda-Helper: Data profiling utility for Pandas DataFrames and Series
Easily generate information-rich, publication-quality tables from R
Great Expectations Airflow operator
Log management solution that improves the performance of SIEM
Open-source data observability for analytics engineers
Data quality assessment and metadata reporting for data frames
Synthetic data curation for post-training and data extraction
Raspberry Pi config for all things Internet
DataCap is integrated software for data transformation
Light-weight, flexible, expressive statistical data testing library
A ranked list of awesome Python open-source libraries
Dataset Management Framework, a Python library and a CLI tool to build
Flexible Photo Recrafting While Preserving Your Identity
A Gem for creating partial, anonymized dumps of your database
The best JavaScript Data Table for building enterprise applications