Pythonic tool for running machine-learning/high-performance workflows
Lightweight, flexible, expressive statistical data testing library
Production-ready data processing made easy and shareable
A more accurate representation of Jupyter notebooks
Open-source metadata collector based on ODD Specification
Make your own running home page
airda (Air Data Agent)
Clean Jupyter notebooks of outputs, metadata, and empty cells
Great Expectations Airflow operator
Automatically find issues in image datasets
Making DAG construction easier
BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)
Detecting silent model failure. NannyML estimates performance
Data science on data without acquiring a copy
Visualize and compare datasets, target values and associations
Pandas on AWS, easy integration with Athena, Glue, Redshift, etc.
Python implementation of global optimization with Gaussian processes
A dedicated app for collecting thousands of POIs for OpenStreetMap
Scale your Pandas workflows by changing a single line of code
re_data - fix data issues before your users & CEO would discover them
The toolkit to test, validate, and evaluate your models and surface
Code review for data in dbt
Open-source data observability for analytics engineers
The open standard for data logging
Library providing end-to-end GPU-accelerated recommender systems