Create HTML profiling reports from pandas DataFrame objects
Open-source data observability for analytics engineers
Dataset Management Framework, a Python library and a CLI tool to build
Python ETL framework for stream processing, real-time analytics, LLM
Monitor the stability of a Pandas or Spark dataframe
The open standard for data logging
BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)
Clean Jupyter notebooks of outputs, metadata, and empty cells
Scale your Pandas workflows by changing a single line of code
Making DAG construction easier
Light-weight, flexible, expressive statistical data testing library
Synthetic data generators for structured and unstructured text
The open-source tool for building high-quality datasets
Streamline your ML workflow
Efficiently diff rows across two different databases
A real-time visualisation of the CO2 emissions of electricity
Project structure for doing and sharing data science work
Easy integration with Athena, Glue, Redshift, Timestream, Neptune
airda (Air Data Agent)
A Python package for interactive geospatial analysis and visualization
Build, run, and manage data pipelines for integrating data
The standard data-centric AI package for data quality and ML
Detecting silent model failure. NannyML estimates performance
AutoGluon: AutoML for Image, Text, and Tabular Data
Data science on data without acquiring a copy