Data science on data without acquiring a copy
Synthetic data generators for structured and unstructured text
Data integration platform for ELT pipelines from APIs, databases
Build, run, and manage data pipelines for integrating data
The open standard for data logging
Uncover insights, surface problems, monitor, and fine-tune your LLM
AI-data warehouse to enrich, transform and analyze unstructured data
A reactive notebook for Python
High-Performance Symbolic Regression in Python and Julia
Project structure for doing and sharing data science work
A multi-cloud framework for big data analytics
The toolkit to test, validate, and evaluate your models and surface
Python implementation of global optimization with Gaussian processes
A Python package for interactive geospatial analysis and visualization
Train machine learning models within Docker containers
Panda-Helper: Data profiling utility for Pandas DataFrames and Series
A more accurate representation of Jupyter notebooks
An AI-powered data science team of agents
Concurrent Python made simple
Detecting silent model failure. NannyML estimates performance
Spatial data processing for geomodeling
Monitor the stability of a Pandas or Spark DataFrame
BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)
Benchmarking synthetic data generation methods
Light-weight, flexible, expressive statistical data testing library