Easy integration with Athena, Glue, Redshift, Timestream, Neptune
Progress bars for threading and multiprocessing tasks on terminal
Python implementation of global optimization with Gaussian processes
Synthetic data generators for structured and unstructured text
Project structure for doing and sharing data science work
The toolkit to test, validate, and evaluate your models and surface
Make your own running home page
Training data (data labeling, annotation, workflow) for all data types
airda (Air Data Agent)
The open-source tool for building high-quality datasets
An open source multi-tool for exploring and publishing data
Python module that helps you build complex pipelines of batch jobs
Pythonic tool for running machine-learning/high-performance workflows
Main repository for Vispy
Train machine learning models within Docker containers
Integrate multiple high-dimensional datasets with fuzzy k-means
Always know what to expect from your data
Benchmarking synthetic data generation methods
Burp Suite extension for JavaScript static analysis
The standard data-centric AI package for data quality and ML
An AI-powered data science team of agents
Streamline your ML workflow
Collaborative forensic timeline analysis
A real-time visualisation of the CO2 emissions of electricity
Open-source data observability for analytics engineers