A ranked list of awesome Python open-source libraries
Scalable and Flexible Gradient Boosting
Create HTML profiling reports from pandas DataFrame objects
Python Stream Processing
Parallel computing with task scheduling
Distributed messaging and streaming platform with low latency
Making DAG construction easier
https://github.com/JuliaPy/Conda.jl
Simple and distributed Machine Learning
A distributed and extensible workflow scheduler platform
Stream Processing and Complex Event Processing Engine
Scale your Pandas workflows by changing a single line of code
Pandas on AWS, easy integration with Athena, Glue, Redshift, etc.
A lightweight opinionated ETL framework, halfway between plain scripts
Clean Jupyter notebooks of outputs, metadata, and empty cells
3D plotting for Python in the Jupyter notebook
General Mission Analysis Tool
Unified metadata lake for data & AI assets
A curated list of data mining papers about fraud detection
Kubeflow’s superfood for Data Scientists
Distributed Stream Processing
PyMOL is an OpenGL-based molecular visualization system
http://www.sciencedirect.com/science/article/pii/S1047847711003492
Big Data tool
Algorithm Visualizer for IPython/Jupyter Notebook