A ranked list of awesome Python open-source libraries
Create HTML profiling reports from pandas DataFrame objects
Scalable and Flexible Gradient Boosting
Parallel computing with task scheduling
Python Stream Processing
Making DAG construction easier
Distributed messaging and streaming platform with low latency
Scale your Pandas workflows by changing a single line of code
Simple and distributed Machine Learning
A distributed and extensible workflow scheduler platform
https://github.com/JuliaPy/Conda.jl
Pandas on AWS, easy integration with Athena, Glue, Redshift, etc.
A lightweight opinionated ETL framework, halfway between plain scripts
Clean Jupyter notebooks of outputs, metadata, and empty cells
Stream Processing and Complex Event Processing Engine
3D plotting for Python in the Jupyter notebook
General Mission Analysis Tool
Unified metadata lake for data & AI assets
A curated list of data mining papers about fraud detection
Kubeflow’s superfood for Data Scientists
Distributed Stream Processing
PyMOL is an OpenGL-based molecular visualization system
http://www.sciencedirect.com/science/article/pii/S1047847711003492
Big Data tool
Algorithm Visualizer for IPython/Jupyter Notebook