benchm-ml
A benchmark of commonly used open source implementations
This repository provides a minimal benchmark framework that compares commonly used machine learning libraries on scalability, speed, and classification accuracy. It focuses on binary classification tasks without missing data, where inputs are numeric or categorical (the latter after one-hot encoding). To stress-test the implementations in large-scale settings, the number of observations (n) is varied up to millions, with the number of features (after expansion) reaching about a thousand. ...
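To make "number of features after expansion" concrete, here is a minimal sketch of one-hot encoding in plain Python. It is not the benchmark's own preprocessing code. The function `one_hot_encode`, the toy records, and the column names `amount`, `color`, and `size` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence, Tuple, Union

Value = Union[str, float, int]


def one_hot_encode(
    rows: Sequence[Dict[str, Value]],
    categorical: Sequence[str],
    numeric: Sequence[str],
) -> Tuple[List[str], List[List[float]]]:
    """Expand each categorical column into one 0/1 indicator per level.

    Numeric columns are passed through unchanged (hypothetical helper).
    """
    # Collect the sorted set of levels observed in each categorical column.
    levels = {col: sorted({str(row[col]) for row in rows}) for col in categorical}

    # Feature names: numeric columns first, then one indicator per level.
    names = list(numeric)
    for col in categorical:
        names.extend(f"{col}={level}" for level in levels[col])

    matrix = []
    for row in rows:
        encoded = [float(row[col]) for col in numeric]
        for col in categorical:
            encoded.extend(
                1.0 if str(row[col]) == level else 0.0 for level in levels[col]
            )
        matrix.append(encoded)
    return names, matrix


# Hypothetical toy records: one numeric and two categorical inputs.
toy_rows = [
    {"amount": 1.5, "color": "red", "size": "S"},
    {"amount": 2.0, "color": "blue", "size": "M"},
    {"amount": 0.5, "color": "red", "size": "L"},
]
feature_names, X = one_hot_encode(toy_rows, ["color", "size"], ["amount"])
print(len(feature_names), feature_names)
```

Three raw columns become six features here, because each categorical column contributes one indicator per distinct level. The same effect is how a modest number of raw inputs with many levels can expand to roughly a thousand features.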