Open-source data observability for analytics engineers
Python ETL framework for stream processing, real-time analytics, LLM
Python Stream Processing
Spatial data processing for geomodeling
Monitor the stability of a pandas or Spark DataFrame
The open standard for data logging
BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)
A Python toolbox for gaining geometric insights
Visualize and compare datasets, target values and associations
An AI-powered data science team of agents
Clean Jupyter notebooks of outputs, metadata, and empty cells
High-Performance Symbolic Regression in Python and Julia
Scale your pandas workflows by changing a single line of code
Making DAG construction easier
Lightweight, flexible, expressive statistical data testing library
Synthetic data generators for structured and unstructured text
The open-source tool for building high-quality datasets
Project structure for doing and sharing data science work
Positron, a next-generation data science IDE
Easy integration with Athena, Glue, Redshift, Timestream, Neptune
Build, run, and manage data pipelines for integrating data
Detecting silent model failure. NannyML estimates performance
AutoGluon: AutoML for Image, Text, and Tabular Data
Data science on data without acquiring a copy
Create HTML profiling reports from pandas DataFrame objects