
Read Me

Introduction
	The Cell Connector libraries are designed to help stream large amounts of data from a head node to a cluster of worker nodes. They were designed around a many-core 64-bit x86 head node (specifically an 8-core Opteron machine) and a cluster of PlayStation 3s, which are based on the PowerPC Cell architecture.
	
The libraries and requirements
	Since two different architectures are involved, Cell Connector builds two libraries. The programmer still writes two programs, one for the head node and one for the cluster nodes; however, when each program includes its respective Cell Connector library, it is provided with functions that take care of all the data streaming. On the head node, the programmer only has to write the initial setup before streaming begins, the reduce function that combines each cluster node's output into a single result, and any finalization code needed once the whole operation has completed. On a cluster node the programmer likewise provides initialization and finalization code (typically less than on the head node) and specifies what work has to be done on each chunk of data received. In the examples in this repository all computational work is done by invoking MapReduce, which is a simple way to utilize all the SPEs in the Cell BE and offers reasonable performance. Each user-written program only has to include the correct Cell Connector header file (either cellConnector_ppc.h or cellConnector_x86.h) and specify which library to link in the Makefile.
	
Communication and data types
	Although this implementation of Cell Connector uses LAM/MPI, it hides this from the user, so any communication mechanism could be substituted with no change to programs written for Cell Connector. To support this, a few helper functions and data types are given to the programmer. Data types (such as unsigned int, long long, etc.) are given macros of the form CELL_CON_* (such as CELL_CON_INT and CELL_CON_LLONG) so the communication mechanism knows what it is transporting and can perform the appropriate endian conversions on the fly. Unsignedness is given as a flag called CELL_CON_UNSIGNED that can be OR'd with any other data type to make that type unsigned. These data types are simply one-byte chars defined in the common/cellConnector.h file.
	Cell Connector also gives the programmer the ability to define their own data types to support custom structs. To do this, the createType function is called and the new custom data type is returned. The function requires 5 parameters: the number of variables in the data type; an array specifying the block length of each variable; an array of indices specifying how many bytes into the struct each variable is located; an array of existing data types describing the type of each variable; and finally an unsigned int giving the size of the data type (a sizeof of the struct it is based on will work).
	Right now there is a limit of 10 custom data types, though there is no reason this can't be increased or made dynamic in the future. It is up to the user to store the generated data type in a char variable in order to use it later on. Also, any data types defined on the head node must currently be defined manually on the cluster node as well. This could be changed later, but for the moment it is a simple workaround, since the programmer can just copy and paste the createType line.
	
Using Cell Connector on the head node
	The basic Cell Connector function is startConnector. On the head node, startConnector takes 4 parameters. The first is a pointer to a buffer of data to stream through the cluster nodes. The second is the data type of this buffer, using the type codes described above. The third is the number of elements of that data type in the buffer. The last parameter specifies how much data each cluster node should get at a time. This function blocks while the data is split up and distributed to all cluster nodes, and returns once the whole buffer has been streamed.
	The programmer needs to define a reduce function with 3 parameters. The first is a pointer to the buffer received from a cluster node. The second is the data type of the buffer. The last parameter is the number of elements of that data type in the received buffer. This function is called from a thread on the head node; each cluster node gets its own thread on the head node that handles all output from that cluster node.

Using Cell Connector on a cluster node
	On a cluster node, the programmer also calls the startConnector function once any required initialization is done, but no parameters are needed; it receives all the information it needs from the head node. However, the programmer does have to provide an offloadBuffer function with three parameters: a void * buffer parameter, a char * data type parameter, and an int * size parameter. These are in/out parameters, so the programmer can update them after the data is transformed by the computational kernel on the cluster node but before the function returns.
	
Cluster-wide parameters
	As an extra feature, parameters can be distributed from the head node to all the cluster nodes before any data streaming occurs. Calling the addParameter function on the head node sends the parameter specified in the arguments to each cluster node when the startConnector function begins. The addParameter function takes three arguments: a pointer to the parameter; the data type of the parameter; and the number of elements of that data type that make up the parameter. For example, to pass the location of a file on disk as a parameter, you would create a char array holding the file location and give addParameter the array's starting address, the data type (CELL_CON_CHAR), and the number of characters in the string. Each cluster node can access the parameters through the params array; this file location would be stored in params[0] if it is the first parameter added.