Dataset Management Framework, a Python library and a CLI tool to build
Python ETL framework for stream processing, real-time analytics, LLM
Python Stream Processing
Spatial data processing for geomodeling
The open standard for data logging
BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)
A Python toolbox for gaining geometric insights
Visualize and compare datasets, target values and associations
Burp Suite extension for JavaScript static analysis
An AI-powered data science team of agents
Clean Jupyter notebooks of outputs, metadata, and empty cells
High-Performance Symbolic Regression in Python and Julia
Making DAG construction easier
Lightweight, flexible, expressive statistical data testing library
Synthetic data generators for structured and unstructured text
3D plotting and mesh analysis through a streamlined interface
Always know what to expect from your data
Project structure for doing and sharing data science work
Easy integration with Athena, Glue, Redshift, Timestream, Neptune
Detecting silent model failure. NannyML estimates performance
Training data (data labeling, annotation, workflow) for all data types
AutoGluon: AutoML for Image, Text, and Tabular Data
Data science on data without acquiring a copy
Train machine learning models within Docker containers
Best practices on recommendation systems