A framework for real-life data science
re_data - fix data issues before your users & CEO discover them
An orchestration platform for the development, production
Training data (data labeling, annotation, workflow) for all data types
Train machine learning models within Docker containers
Panda-Helper: Data profiling utility for Pandas DataFrames and Series
Automatic extraction of relevant features from time series
Clean Jupyter notebooks of outputs, metadata, and empty cells
Algorithms from circuit theory to predict connectivity
A package for Counterfactual Explanations and Algorithmic Recourse
The open standard for data logging
TIGRE: Tomographic Iterative GPU-based Reconstruction Toolbox
An optimized graphs package for the Julia programming language
Collaborative forensic timeline analysis
Stream Processing and Complex Event Processing Engine
Scale your Pandas workflows by changing a single line of code
Julia extension for Visual Studio Code
Pandas on AWS, easy integration with Athena, Glue, Redshift, etc.
Production-ready data processing made easy and shareable
Open-source data observability for analytics engineers
Benchmarking synthetic data generation methods
Scalable and Flexible Gradient Boosting
Code review for data in dbt
Library providing end-to-end GPU-accelerated recommender systems
Uncover insights, surface problems, monitor, and fine-tune your LLM