Streamline your ML workflow
Docker image used to run data processing workloads
Automatically find issues in image datasets
Pandas on AWS, easy integration with Athena, Glue, Redshift, etc.
The toolkit to test, validate, and evaluate your models and surface
Synthetic data generators for structured and unstructured text
The open-source tool for building high-quality datasets
Data science on data without acquiring a copy
Pythonic tool for running machine-learning and high-performance workflows
BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)
Training data (data labeling, annotation, workflow) for all data types
Panda-Helper: Data profiling utility for Pandas DataFrames and Series
The standard data-centric AI package for data quality and ML
Create HTML profiling reports from pandas DataFrame objects
Make your own running home page
An orchestration platform for the development, production
Project structure for doing and sharing data science work
Easy integration with Athena, Glue, Redshift, Timestream, Neptune
airda (Air Data Agent)
Best practices on recommendation systems
A real-time visualisation of the CO2 emissions of electricity
Always know what to expect from your data
The open standard for data logging
A curated list of data mining papers about fraud detection
AutoGluon: AutoML for Image, Text, and Tabular Data