Summingbird
Streaming MapReduce with Scalding and Storm
...Its aim is to let developers express a data aggregation pipeline once and run the same logic either in real time (streaming) or in batch mode, with the results of the two runs merged or reconciled. In effect, Summingbird abstracts over multiple execution engines, such as Storm and Scalding, so that a single high-level program composing transformations and aggregations can execute in different runtime contexts. It is particularly useful in analytics and metrics systems that update counters or aggregates continuously but also periodically recompute them from historical data. Summingbird manages consistency and merging between the real-time and batch paths to avoid double-counting or data loss.
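
The idea of one piece of logic feeding both a batch path and a streaming path can be pictured with a small, self-contained sketch. It is written in plain Python and does not use Summingbird's API or reproduce its actual merging mechanism. The event shape, the `boundary` timestamp, and every name below (`to_pairs`, `run_batch`, `StreamingCounter`, `merged_count`) are hypothetical and chosen only for illustration. The sketch assumes that each event carries a timestamp, that the batch path owns everything before the boundary, and that the streaming path owns everything at or after it, so no event is counted twice.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple

# Hypothetical event shape: (timestamp, sentence).
Event = Tuple[int, str]


def to_pairs(sentence: str) -> Iterator[Tuple[str, int]]:
    """Shared job logic: emit (word, 1) for every word in a sentence."""
    for word in sentence.lower().split():
        yield word, 1


def run_batch(events: Iterable[Event], upto: int) -> Counter:
    """Batch path: recompute totals from all events with timestamp < upto."""
    totals: Counter = Counter()
    for ts, sentence in events:
        if ts < upto:
            for key, value in to_pairs(sentence):
                totals[key] += value
    return totals


class StreamingCounter:
    """Streaming path: fold events in one at a time.

    Only events at or after `start` are counted, so this path never
    overlaps with a batch run that covers timestamps below `start`.
    """

    def __init__(self, start: int) -> None:
        self.start = start
        self.totals: Counter = Counter()

    def on_event(self, event: Event) -> None:
        ts, sentence = event
        if ts >= self.start:
            for key, value in to_pairs(sentence):
                self.totals[key] += value


def merged_count(batch: Counter, stream: Counter, key: str) -> int:
    """Read path: integer addition is associative, so partial sums combine."""
    return batch[key] + stream[key]


# Example run: the batch path covers timestamps 1 and 2, the stream covers 3.
events = [(1, "a b a"), (2, "b c"), (3, "a c c")]
boundary = 3

batch_totals = run_batch(events, upto=boundary)
stream = StreamingCounter(start=boundary)
for event in events:
    stream.on_event(event)

for word in ("a", "b", "c"):
    print(word, merged_count(batch_totals, stream.totals, word))
```

Both paths call the same `to_pairs` function, which stands in for "the same logic" in the description above. Because the per-key values are combined with an associative operation, a later batch run with a larger `upto` could replace the older streaming totals without changing the merged answer, provided the streaming path is restarted from the new boundary.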