benchm-ml
A benchmark of commonly used open source implementations
...It targets large scale settings by varying the number of observations (n) up to millions and the number of features (after expansion) to about a thousand, to stress test different implementations. The benchmarks cover algorithms like logistic regression, random forest, gradient boosting, and deep neural networks, and they compare across toolkits such as scikit-learn, R packages, xgboost, H2O, Spark MLlib, etc. The repository is structured in logical folders, each corresponding to algorithm categories.