Efficiently diff rows across two different databases
Orange: Interactive data analysis
Pandas on AWS, easy integration with Athena, Glue, Redshift, etc.
Project structure for doing and sharing data science work
An AI-powered data science team of agents
Fast, flexible and powerful Python data analysis toolkit
Data integration platform for ELT pipelines from APIs, databases
Machine learning in Python
Python data, Leaflet.js maps
Training data (data labeling, annotation, workflow) for all data types
Positron, a next-generation data science IDE
CKAN is an open-source DMS for powering data hubs
The open-source tool for building high-quality datasets
A reactive notebook for Python
WebGL-based viewer for volumetric data
Synthetic data generators for structured and unstructured text
Always know what to expect from your data
Create HTML profiling reports from pandas DataFrame objects
Uncover insights, surface problems, monitor, and fine-tune your LLM
matplotlib: plotting with Python
Repository for the Astropy core package
AutoGluon: AutoML for Image, Text, and Tabular Data
Benchmarking synthetic data generation methods
A high-quality tool for converting PDF to Markdown and JSON
Open-source data observability for analytics engineers