A tool for semi-automatic cell type classification, harmonization
High-Performance Symbolic Regression in Python and Julia
Detecting silent model failure. NannyML estimates performance
Pandas on AWS, easy integration with Athena, Glue, Redshift, etc.
A framework for real-life data science
A more accurate representation of Jupyter notebooks
Data integration platform for ELT pipelines from APIs, databases
Progress bars for threading and multiprocessing tasks in the terminal
Easy integration with Athena, Glue, Redshift, Timestream, Neptune
Pythonic tool for running machine-learning/high-performance workflows
WebGL-based viewer for volumetric data
Python implementation of global optimization with Gaussian processes
The toolkit to test, validate, and evaluate your models and surface
Graphical User Interface Toolkit for Python with minimal dependencies
Automatically find issues in image datasets
BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)
Docker image used to run data processing workloads
Synthetic data generators for structured and unstructured text
The open-source tool for building high-quality datasets
Data science on data without acquiring a copy
Experimental Julia implementation of the Amazon Braket SDK
CKAN is an open-source DMS for powering data hubs
Uncover insights, surface problems, monitor, and fine-tune your LLM
The standard data-centric AI package for data quality and ML
Training data (data labeling, annotation, workflow) for all data types