[C-MPI-commits] SF.net SVN: c-mpi:[167] docs/manual/manual.txt

SourceForge Headquarters 1320 Columbia Street Suite 310 San Diego, CA 92101 +1 (858) 422-6466

Revision: 167
          http://c-mpi.svn.sourceforge.net/c-mpi/?rev=167&view=rev
Author:   jmwozniak
Date:     2011-04-14 23:53:21 +0000 (Thu, 14 Apr 2011)

Log Message:
-----------
New manual in asciidoc

Added Paths:
-----------
    docs/manual/manual.txt

Added: docs/manual/manual.txt
===================================================================

--- docs/manual/manual.txt	                        (rev 0)
+++ docs/manual/manual.txt	2011-04-14 23:53:21 UTC (rev 167)
@@ -0,0 +1,528 @@
+
+C-MPI User Guide
+================
+Justin M. Wozniak <wo...@mc...>
+v0.1, April 2011
+
+
+Overview
+--------
+
+This manual provides reference material for the Content-MPI (C-MPI) DHT.
+
+C-MPI provides a key/value store for distributed computing over MPI.
+
+Quick Start
+-----------
+
+The fastest way to get a quick overview of provided features is to just run:
+
+--------------------------------------------------------
+./setup.sh
+./configure --config-cache --enable-table-dense-1 \
+                           --enable-tests \
+                           --with-mpi=${HOME}/sfw/mpich2-1.2.1p1
+make D=1 test_results
+--------------------------------------------------------
+
+Then, just take a look at the test code and output to see how things work.
+
+
+Use Cases
+---------
+
+MPI Library
+~~~~~~~~~~~
+
+C-MPI can be used as an MPI library. In this mode, the user allocates
+some number of DHT nodes and DHT clients. The nodes start up and begin
+listening for requests. The clients call into +cmpi_client_code()+, do
+work, and make C-MPI calls.
+
+More advanced user programs can link to +libcmpi.a+ directly as well
+by imitating +cmpi/client.c+.
+
+--------------------------------------------------------
+#include <cmpi.h>
+
+cmpi_client_code()
+{
+  sprintf(key,   "key_%i",   mpi_rank);
+  sprintf(value, "value_%i", mpi_rank);
+  cmpi_put(key, value, strlen(value)+1);
+
+  rank = (mpi_rank+1)%mpi_size;
+  sprintf(key, "key_%i", rank);
+  cmpi_get(key, &result, &length);
+  printf("result(%i): %s\n", length, result);
+}
+--------------------------------------------------------
+
+Cluster database
+~~~~~~~~~~~~~~~~
+
+In this mode, the DHT runs as a distributed background process
+(+cmpi-db+) and the user connects to it via a cp-like tool
+(+cmpi-cp+).
+
+Commands executed on submit host:
+--------------------------------------------------------
+#!/bin/sh
+
+# Allocate 6 compute nodes
+qsub -t 1-6 ...
+
+# List node host names to file for mpiexec
+qstat | something > hosts
+
+# Create some initial conditions
+create_initial > input
+
+# Launch the DHT (12 processes on 6 nodes)
+mpiexec.hydra -f hosts -n 12 bin/cmpi-db -n 6 &
+
+# Spawn client scripts
+# (Could use Falkon here?)
+id=0
+total=$( wc -l hosts )
+for host in $( cat hosts )
+do
+  ssh ${host} client_script.sh $(( id++ )) ${total} &
+done
+
+wait
+--------------------------------------------------------
+
+client_script.sh:
+
+--------------------------------------------------------
+#!/bin/sh
+
+id=$1
+nodes=$2
+
+for (( i=0 ; i<10 ; i++ ))
+do
+  # Do some useful work
+  do_work ${id} ${nodes} < input > output
+  # Post the results to the DHT
+  key=output_${id}_${i}
+  cmpi-cp output dht://${key}
+  # Read a neighbor's result
+  key=output_$(( id-1 % nodes ))_${i}
+  cmpi-cp dht://${key} input
+done
+--------------------------------------------------------
+
+MPI-IO implementation
+~~~~~~~~~~~~~~~~~~~~~
+
+The CMPI-IO module is really only an sketch. Here's the idea:
+
+. C-MPI provides a patch for MPICH. The user applies the patch and recompiles MPICH.
+
+. The user writes a normal MPI/MPI-IO program but provides cmpi:/ pathnames to trigger the CMPI-IO implementation.
+
+. The user launches the program with mpiexec, allocating extra processes for the DHT.
+
+. Calls like MPI_FILE_WRITE_AT() are translated by ROMIO and
+implemented ultimately by calls like ADIOI_CMPI_WriteContig(). (The
+full list of what CMPI-IO needs to implement is in +ad_cmpi.h+; it's
+actually not that bad.) This method would be implemented via calls to
++cmpi_put()+/+cmpi_get()+.
+
+FUSE module
+~~~~~~~~~~~
+
+A FUSE adapter could be built using functionality similar to the
+techniques above. The user would first instantiate the cmpi-db. Then,
+the FUSE implementation would make calls using the driver interface
+similar to the way that cmpi-cp does.
+
+MPI-RPC
+~~~~~~~
+
+Overview
+^^^^^^^^
+
+The MPIRPC component is used to allow the user to issue multiple
+asynchronous requests in one thread from a higher-level, RPC,
+event-driven model.
+
+Definitions
+^^^^^^^^^^^
+
+MPIRPC object::
+
+Created by an MPIRPC call. Can be waited on. On call completion,
+this is passed to the proceed-function. Contains the result of the
+call on completion. Must be freed by the user with MPIRPC_Free().
+
+proceed-function::
+
+A user function pointer passed into an MPIRPC call. Called by MPIRPC
+on call completion with the MPIRPC object.  This allows the caller
+to make progress after the call completed, and obtain the result.
+
+handler::
+
+A user function that is the target of an MPIRPC call. Called by MPIRPC
+on an incoming request.
+
+MPIRPC_Call::
+
++MPIRPC_Call()+ creates an MPIRPC object and starts performing the RPC
+asynchronously. The MPIRPC object will contain the results of the call
+when complete. The arguments are:
+
++MPIRPC_Node target+;;
+
+The target +MPIRPC_Node+, which is an MPI communicator and rank.
+
++char* name+;;
+
+Remote function name Copied into the MPIRPC object. Limited to
+MPIRPC_MAX_NAME (128) characters. May not be +NULL+.
+
++char* args+;;
+
+Short +NULL+-terminated string for user control data arguments. Copied
+into the MPIRPC object. Limited to +MPIRPC_MAX_ARGS+ (256)
+characters. May be +NULL+.
+
++void* extras+;;
+
+Extra user state accessible by the proceed-function.
+
++void (*proceed)(MPIRPC*)+;;
+
+The proceed-function.
+
+The MPIRPC Object::
+
+The MPIRPC object contains:
+
++MPIRPC_Node target+;;
+
+    The target MPIRPC_Node
+
++status+;;
+
+The status of the call: +MPIRPC_STATUS_PROTO+, +MPIRPC_STATUS_CALLED+,
+or +MPIRPC_STATUS_RETURNED+.
+
++char[] name+;;
+
+Copy of the remote procedure name.
+
++char[] args+;;
+
+Copy of the user argument string.
+
++char* blob+;;
+
+Pointer to the user data blob.
+
++int blob_length+;;
+
+Length of the user data blob.
+
++void* result+;;
+
+Pointer to result data returned by remote procedure in fresh storage.
+
++int result_length+;;
+
+Length of result data.
+
++void* extras+;;
+
+Extra user pointer useful for proceed-function.
+
++int unique+;;
+
+Internal uniquifier.
+
++bool cancelled+;;
+
+Not yet used.
+
++MPIRPC_Node target+;;
+
+The target MPIRPC_Node
+
++MPI_Request request[]+;;
+
+Internal MPI objects. Released by +MPIRPC_Free()+.
+
+Example
+
+sample code
+
+Usage notes
+~~~~~~~~~~~
+
+* Handler routines: The handler must copy the incoming args if it
+  wants to save them. The handler must return by calling
+  +MPIRPC_Return()+ or +MPIRPC_Null()+. Handlers can call into
+  +MPIRPC_Call()+ but the flow eventually return to the original
+  caller.
+
+C-MPI
+-----
+
+Overview
+~~~~~~~~
+
+
+C-MPI is intended to be an easy to use MPI-based distributed key/value store.
+
+API
+~~~
+
+The C-MPI API:
+
++cmpi_init()+::
+
+Initialize C-MPI. The user must first call +MPI_Init()+ and
++MPIRPC_Init()+.
+
++cmpi_put()+::
+
+Post a key/value pair in C-MPI.
+
+Arguments:
+
++char* key+;;
+
++NULL+-terminated string key. Passed as the args argument to
++MPIRPC_Call_blob()+.
+
++void* value+;;
+
+Variable-length user data. Passed as the blob argument to
++MPIRPC_Call_blob()+.
+
++int length+;;
+
+Byte-length of value.
+
++cmpi_update()+::
+
+Update the value of a key/value pair in C-MPI.
+
+Arguments:
+
++char* key+;;
+
++NULL+-terminated string key. Passed as the args argument to MPIRPC_Call_blob().
+
++void* value+;;
+
+Variable-length user data. Passed as the blob argument to MPIRPC_Call_blob().
+
++int length+;;
+
+Byte-length of value.
+
++int offset+;;
+
+At which point to begin overwrite. Offset+length may exceed the
+original value length. The key/value pair need not originally exist,
+but if it does not, the offset must be 0.
+
++cmpi_get()+::
+
+Extract the value of a key/value pair into fresh storage.
+
+Arguments:
+
++char* key+;;
+
++NULL+-terminated string key. Passed as the args argument to MPIRPC_Call_blob().
+
++void** result+;;
+
+Will be set to the location of the extracted data in fresh storage.
+
++int* length+;;
+
+Will be set to the length of the extracted data.
+
+Configure, build, test
+----------------------
+
+Configure
+~~~~~~~~~
+
+. Run ./setup.sh.
+
+. Run ./configure.
+
++--with-mpi+::
+
+Mandatory. The location of your MPICH installation. E.g., ${HOME}/sfw/mpich2-1.1.1p1-x86_64
+
+--enable-table-*::
+
+Mandatory. The underlying DHT algorithm. Current options:
+
+--enable-table-dense-1;;
+
+Simple dense table defined in src/dense-1. Uses monolithic
+communicator. Hashes keys and assigns to nodes by modulus. Not really
+a DHT; doesn't need neighbor tables. Great for debugging, simple and
+fast.
+
+--enable-table-kda-2A;;
+
+Kademlia implementation defined in +src/kda-2+, linked with
++src/kda-2/conn-A.c+. Uses monolithic communicator.
+
+--enable-table-kda-2B;;
+
+Kademlia implementation defined in +src/kda-2+, linked with
++src/kda-2/conn-B.c+. Uses MPI-2 dynamic processes.
+
+--enable-mode-*;;
+
+The node layout scheme. (see <<cmpi-modes>>).
+
+--enable-mode-mono;;
+
+The default.
+
+--enable-mode-pairs
+
+--enable-tests
+
+Turn on support for tests. When compiling tests, be sure to use make D=1.
+
+--enable-cmpi-io
+
+Turn on support for the skeletal CMPI-IO component.
+
+Build
+~~~~~
+
+Each component defines its build behavior in a +module.mk.in+
+file. These are converted to +module.mk+ by +configure+ (or
++config.status+). These components typically append to variables or
+define additional targes. +Makefile+ includes all +module.mk+s and
+manages the whole build.
+
+Functions like +cmpi_get()+ are defined in multiple places
+(+cmpi_dense.c+, +cmpi-kademlia.c+). The choice of which to compile is
+made at configure time.
+
+Useful arguments to make:
+
+D=1::
+
+    Turn on debugging output. Mandatory for many tests.
+V=1::
+
+    Turn on verbose build output. Useful when debugging build process.
+
+-j::
+
+You can do make -j tests but you cannot running the tests
+concurrently is not recommended.
+
+mpirpc::
+
+Build a stand-alone MPI-RPC lib.
+
+cmpi::
+
+Build a stand-alone C-MPI lib.
+
+tests::
+
+Build (but do not run) the test programs.
+
+test/<module>/test-success.out::
+
+Run all the tests for module after all of the tests for its dependencies.
+
+tags::
+
+Make an etags file based on the configuration choices.
+
+Test
+~~~~
+
+The tests are defined for each component in +module.mk+. For each
++test-*.c+, a +test-*.x+ executable is produced and launched.  The
+launcher is ; +assert()+s and output parsing are used to confirm
+correctness. Output is collected in +test-*.out+. If the test is run
+from a +test-*.zsh+, debugging output is collected and post-processed
+by the ZSH script. If the test fails, the output is moved to
++test-*.out.failed+ (so make does not consider it).
+
+make D=1 test_results
+
+Build and run all the tests. Requires +./configure --enable-tests+.
+
+Components
+----------
+
+cmpi::
+
+The C-MPI interface. Some reusable functionality is defined.
+
++cmpi_*.c+::
+
+The translation layer. Translates C-MPI calls into calls to a DHT
+implementation.
+
+[[cmpi-modes]]
++mode_*.c+::
+
+C-MPI mode selection made at configure time.
+
+mono;;
+
+Given +mpiexec -n 6 cmpi-db -n 4+, creates 4 DHT nodes (ranks 0-3) and
+2 clients (ranks 4-5). The clients will use all nodes as
+contacts. Ideal for the SMP case.
+
+pairs;;
+
+Given +mpiexec -n 6 cmpi-db -n 3+, creates 3 DHT nodes (ranks 0-2) and
+3 clients (ranks 3-5). The clients connect to a single contact (0:3,
+1:4, 2:5). Ideal for the cluster case: each physical node will have
+one node process and one client process, the client will only contact
+the local node.
+
+node.c
+
+    Reusable C-MPI node.
+
+    Generally, a C-MPI node is anything that calls into cmpi_init().
+client.c
+
+    Reusable C-MPI client.
+
+    Generally, a C-MPI client is anything that calls into cmpi_init_client(). client.c calls into cmpi_client_code(), which is a convenient way to reuse the setup routines in client.c. See the tests for use of cmpi_client_code().
+
+    Other clients could be built that do different things, such as provide filesystem interfaces.
+driver
+
+    Issue commands to a client over a stream interface.
+cmpi-db
+
+    Instantiates several nodes and clients. The clients may be manipulated by a driver.
+cmpi-cp
+
+    Acts as a "user" in the diagram above.
+adts
+
+    Abstract data types: lists, hash tables, etc.
+gossip
+
+    A logging library from Phil Carns.
+
+8. Appendix
+
+Updated: Fri May 21 11:02:44 CDT 2010


This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site.