CKAN is an open-source DMS for powering data hubs
Uncover insights, surface problems, monitor, and fine-tune your LLM
An orchestration platform for the development, production
Monitor the stability of a Pandas or Spark dataframe
BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)
A curated list of data mining papers about fraud detection
A real-time visualisation of the CO2 emissions of electricity
Clean Jupyter notebooks of outputs, metadata, and empty cells
Making DAG construction easier
Streamline your ML workflow
Panda-Helper: Data profiling utility for Pandas DataFrames and Series
Make your own running home page
The standard data-centric AI package for data quality and ML
Create HTML profiling reports from pandas DataFrame objects
airda (Air Data Agent)
Detecting silent model failure. NannyML estimates performance
Data science on data without acquiring a copy
A more accurate representation of Jupyter notebooks
The open-source tool for building high-quality datasets
An open source multi-tool for exploring and publishing data
Python implementation of global optimization with Gaussian processes
Training data (data labeling, annotation, workflow) for all data types
Production-ready data processing made easy and shareable
The toolkit to test, validate, and evaluate your models and surface
Open-source data observability for analytics engineers