From: David P G. <gr...@us...> - 2013-06-11 22:27:02
Hi Josh,

In general, I think the reason to have map/reduce/scan operations in the library (as opposed to having the user define them in their application via the normal iteration methods the library provides) is to give efficient, parallel implementations of those operations. In terms of semantics, I think that means the operations should be defined as may-parallel: the implementation may execute some operations in parallel, but may also execute some of them sequentially. This gives the implementation the most flexibility to heuristically pick an appropriate task size that dynamically matches the available resources.

Doing a good job of dividing the work into chunks and executing them in parallel takes some amount of clever code, so it is appealing to write that code once and share it across multiple data structures. The trick, however, is doing that in a way that the abstraction enabling a single map or reduce implementation to work on multiple collection types doesn't introduce so much overhead that the performance gained by clever structuring of the parallel work is lost to the sequential overhead of object allocation, indirection, and interface invocation. My intuition is that a fully general implementation would lose enough performance that we'd end up writing specialized ones for many of the data structures, but I could be wrong. It would be interesting to measure how large the performance gap actually is, and to understand how much of it is fundamental versus how much could be closed by better optimization of X10's sequential object-oriented language features.

For the specific case of Rail, we should bring back map/reduce functions for Rail before we release 2.4. Since Rail is a @NativeRep class, we will probably do that by putting static functions into x10.util.ArrayUtils (or similar) so that we can write them once in X10.

--dave
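The may-parallel semantics with heuristic task-sizing described above can be sketched roughly as follows. This is a Java illustration (X10's own constructs would differ); the class name `SumTask` and the fixed `THRESHOLD` are hypothetical, standing in for whatever adaptive heuristic a real implementation would use:

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Sketch of a may-parallel reduction: below a (hypothetical) threshold the
// work runs sequentially; above it, the range is split and the two halves
// may run on different worker threads -- or not, at the runtime's discretion.
class SumTask extends RecursiveTask<Long> {
    static final int THRESHOLD = 1024; // assumed cutoff; a real heuristic would adapt to resources
    final long[] data;
    final int lo, hi;

    SumTask(long[] data, int lo, int hi) {
        this.data = data; this.lo = lo; this.hi = hi;
    }

    @Override
    protected Long compute() {
        if (hi - lo <= THRESHOLD) {       // sequential base case
            long acc = 0;
            for (int i = lo; i < hi; i++) acc += data[i];
            return acc;
        }
        int mid = (lo + hi) >>> 1;
        SumTask left = new SumTask(data, lo, mid);
        left.fork();                      // left half may execute in parallel
        long right = new SumTask(data, mid, hi).compute();
        return left.join() + right;
    }
}

public class MayParallelReduce {
    public static void main(String[] args) {
        long[] data = new long[10_000];
        for (int i = 0; i < data.length; i++) data[i] = i;
        long sum = ForkJoinPool.commonPool().invoke(new SumTask(data, 0, data.length));
        System.out.println(sum); // 49995000
    }
}
```

Note that the caller cannot observe whether any particular chunk ran in parallel, which is exactly the freedom the may-parallel definition grants the implementation.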
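Static utility functions of the sort proposed for x10.util.ArrayUtils might look like the following sketch (again in Java for illustration; the class name `ArrayUtilsSketch` and these signatures are assumptions, not the actual X10 API):

```java
import java.util.Arrays;
import java.util.function.BinaryOperator;
import java.util.function.Function;

// Illustrative stand-in for the proposed static map/reduce helpers:
// free-standing generic functions rather than methods on the array type,
// mirroring the plan to keep them outside the @NativeRep class itself.
public class ArrayUtilsSketch {
    // Apply f to each element of a, writing results into out.
    static <T, U> U[] map(T[] a, Function<T, U> f, U[] out) {
        for (int i = 0; i < a.length; i++) out[i] = f.apply(a[i]);
        return out;
    }

    // Fold the elements of a with op, starting from unit.
    static <T> T reduce(T[] a, BinaryOperator<T> op, T unit) {
        T acc = unit;
        for (T x : a) acc = op.apply(acc, x);
        return acc;
    }

    public static void main(String[] args) {
        Integer[] xs = {1, 2, 3, 4};
        Integer[] doubled = map(xs, x -> x * 2, new Integer[xs.length]);
        System.out.println(Arrays.toString(doubled)); // [2, 4, 6, 8]
        System.out.println(reduce(xs, Integer::sum, 0)); // 10
    }
}
```

Writing the operations once as static generic functions is what lets a single X10 source definition serve every element type, at the cost of the indirection overhead discussed above.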