
SFramePROOF

Attila Krasznahorkay Jr.

Introduction

The SFrame code tries to take full advantage of the PROOF functionality of ROOT. Using PROOF can lead to dramatic speed increases even on a single multi-core machine, but especially when using a large dedicated PROOF cluster. Since PROOF uses parallel execution to speed up processing, it's best to be aware of the basic features of the framework.

This page explains how SFrame uses PROOF internally, and how the SFrame cycles are really executed on a PROOF farm.

Note: Since SFrame-03-00-00 the code from the "old" SFrame-PROOF-branch has been merged back into the main development line (trunk). The old code was moved to a separate branch, called SFrame-classic-branch. The latter is only kept so that bug fixes can still be applied to the old code; the development of the classic branch is now finished.

PROOF integration

In order to run some code on a PROOF cluster, one has to implement a class inheriting from TSelector. See the following pages for instructions on how to actually do this:

  • The TSelector "reference" page: (http://root.cern.ch/root/html534/TSelector.html)
  • The description in the PROOF manual: (http://root.cern.ch/drupal/content/developing-tselector)
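
SFrame's SCycleBaseExec takes care of this for you, but for orientation, here is a minimal, hypothetical sketch of what such a TSelector subclass looks like. The class name and the function bodies are placeholders for illustration only; the pages linked above describe the real interface in detail.

#include <TSelector.h>
#include <TTree.h>

class MySelector : public TSelector {

public:
   MySelector() : m_tree( 0 ) {}

   // Tell PROOF to use the "new style" Process( Long64_t ) interface:
   virtual Int_t  Version() const { return 2; }

   virtual void   Begin( TTree* ) { /* executed on the client */ }
   virtual void   SlaveBegin( TTree* ) { /* executed on each worker: book outputs */ }
   virtual void   Init( TTree* tree ) { m_tree = tree; /* set branch addresses */ }
   virtual Bool_t Notify() { return kTRUE; /* called when a new input file is opened */ }

   virtual Bool_t Process( Long64_t entry ) {
      m_tree->GetEntry( entry ); // read and analyse one event
      return kTRUE;
   }

   virtual void   SlaveTerminate() { /* executed on each worker: save outputs */ }
   virtual void   Terminate() { /* executed on the client: final actions */ }

private:
   TTree* m_tree; // cached pointer to the input tree

   ClassDef( MySelector, 0 ); // a dictionary is needed to run the class on PROOF

}; // class MySelector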

SCycleBaseExec implementation

Here I summarise how the different TSelector functions are implemented in SCycleBaseExec.

SCycleBaseExec::Begin( TTree* )

Here the code "just" makes sure that all the components (the various classes building up SCycleBase) are configured correctly, then it calls the BeginMasterInputData( const SInputData& ) function.

SCycleBaseExec::SlaveBegin( TTree* )

This function also starts by reading in the cycle configuration (which is distributed to the worker nodes over the network), and setting up the components of SCycleBase. Then it checks whether any output TTree-s are defined in the configuration. If at least one is defined, it checks whether the code is being executed on a PROOF cluster or in LOCAL mode. When running on a PROOF cluster, it opens a new output file using TProofOutputFile, and gives it to SCycleBaseNTuple::CreateOutputTrees(...). If it finds that the cycle is running locally, it opens an output file "by hand", and puts an instance of SOutputFile into the output object list, which tells SCycleController where to find the file.

When done with all this, it calls the BeginInputData( const SInputData& ) function.

SCycleBaseExec::Init( TTree* )

Here the code just caches the pointer to the "main tree" of the input file. (This is the first "event level" TTree that's defined in the configuration XML.)

SCycleBaseExec::Notify()

The code loads all the input trees that are listed in the configuration using SCycleBaseNTuple::LoadInputTrees(...), and then calls BeginInputFile( const SInputData& ) so that the user's code can connect to all the input TTree branches.
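
For illustration, here is a minimal sketch of a user-side BeginInputFile(...) implementation using SFrame's ConnectVariable(...) helper. The cycle, tree, branch, and member variable names are hypothetical; use the names from your own input files.

void MyCycle::BeginInputFile( const SInputData& ) throw( SError ) {

   // Re-connect the branch pointers every time a new input file is opened.
   // "PhysicsTree" and "el_pt" are example names only:
   ConnectVariable( "PhysicsTree", "el_pt", m_el_pt );

   return;
}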

SCycleBaseExec::Process( Long64_t )

The code loads the current entry from all the input TTree-s, calculates the weight of the current event, then calls ExecuteEvent(...). If the user code signaled that the event should be thrown away (by throwing an SError exception with the SError::SkipEvent "level"), the code just logs that the event was skipped; otherwise it writes a new entry into the output trees.
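
In practice the user signals the skipping from ExecuteEvent(...). A minimal, hypothetical sketch (the cut value and the variable name are made up for the example):

void MyCycle::ExecuteEvent( const SInputData&, Double_t weight ) throw( SError ) {

   // Discard events failing a (made up) selection; SFrame will then not
   // write an entry to the output trees for this event:
   if( m_el_pt < 20000.0 ) {
      throw SError( SError::SkipEvent );
   }

   // Fill output variables / histograms for accepted events here...
   return;
}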

SCycleBaseExec::SlaveTerminate()

This function starts by calling EndInputData(...). It then creates an SCycleStatistics object with the information collected during the data processing, and puts it into the cycle's output list. Finally it makes sure that all the output TTree-s are saved into the temporary output file, and closes this file.

SCycleBaseExec::Terminate()

This function only calls EndMasterInputData(...) without doing anything in addition.

Each SFrame package should have a directory called proof/. This directory holds some files that are used to create a .par package from the SFrame package. When you use the [SFrameScripts#sframe_new_package.sh] script, a default version of these files is created automatically. Normally you only have to modify these when doing something "fancy".

BUILD.sh

This file has to be an executable script. In principle you never need to change its contents; the default file should always be usable.

When PROOF compiles a package, it does so by executing this script. This makes it possible to use build systems other than GNU Make. The default implementation just executes make in the package. For an example see SFrame/core/proof/BUILD.sh.

SETUP.C

This is a simple ROOT script that takes care of loading all the libraries of the package. The script is basically allowed to do anything, but usually it's just a list of gSystem->Load(...) function calls. Note that if the library compiled by the package needs some extra ROOT libraries, those should also be loaded by the script. For an example see SFrame/core/proof/SETUP.C.
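
As an illustration, a minimal SETUP.C could look like the following. The library names are examples only; list the extra ROOT libraries and the package library that your own package actually needs.

int SETUP() {

   // Extra ROOT libraries needed by the package's own library:
   if( gSystem->Load( "libTree" ) == -1 ) return -1;

   // The library built from this package (example name):
   if( gSystem->Load( "libMyAnalysis" ) == -1 ) return -1;

   return 0;
}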

Role of the TSelector and SCycleBase functions

Since the functions of SCycleBaseExec are executed on various machines when running the code on a PROOF cluster, let's review how the TSelector and SCycleBaseExec functions are called by PROOF. First of all, here is a drawing of how the various functions of TSelector are called when running on a PROOF cluster:

[[File:SFrame-PROOF-schema01.png|500px|thumb|center|PROOF function call schema]]

For the detailed description of how PROOF interacts with TSelector objects, see the PROOF manual. The simplified call tree is the following:

Begin()
SlaveBegin()
Init()
__Notify()
____Process()
____...
____Process()
__...
__Notify()
____Process()
____...
____Process()
SlaveTerminate()
Terminate()

Note: TSelector::Notify() is not shown in the graphical schema. In any case, knowing about that function is not essential to understanding how SFrame works.

As written in the previous section, SCycleBaseExec is written such that it implements (most of) the TSelector functions. The resulting SFrame function call schema is shown in the following:

[[File:SFrame-PROOF-schema02.png|500px|thumb|center|SFrame function call schema]]

It can be translated as:

BeginCycle()
-------------------------------------
query 1

__BeginMasterInputData()
__BeginInputData()
____BeginInputFile()
______ExecuteEvent()
______...
______ExecuteEvent()
____...
____BeginInputFile()
______ExecuteEvent()
______...
______ExecuteEvent()
__EndInputData()
__EndMasterInputData()

-------------------------------------
query 2

__BeginMasterInputData()
__BeginInputData()
____BeginInputFile()
______ExecuteEvent()
______...
______ExecuteEvent()
____...
____BeginInputFile()
______ExecuteEvent()
______...
______ExecuteEvent()
__EndInputData()
__EndMasterInputData()

-------------------------------------
...
EndCycle()

What you must understand is that if you modify a member variable of your cycle in, let's say, BeginCycle(), that will have no effect on the state of that variable in '''BeginInputData()'''. This is also the reason why you shouldn't "cache" pointers to output histograms, unless you absolutely know what you're doing. However, the '''Retrieve<...>(...)''' and '''Hist(...)''' functions work in all the functions of a cycle. So to first order you should always access your histograms with one of them, and only do something more sophisticated if there is a performance reason for it.
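
As an illustration of this recommended access pattern, here is a minimal sketch (the histogram, cycle, and variable names are hypothetical): the histogram is booked in '''BeginInputData(...)''', and then looked up by name with '''Hist(...)''' for every event instead of caching a pointer.

void MyCycle::BeginInputData( const SInputData& ) throw( SError ) {

   // Book the histogram in the output file; only its name is used later:
   Book( TH1F( "el_pt", "Electron p_{T} [MeV]", 100, 0.0, 200000.0 ) );

   return;
}

void MyCycle::ExecuteEvent( const SInputData&, Double_t weight ) throw( SError ) {

   // Look the histogram up by its name for every event:
   Hist( "el_pt" )->Fill( m_el_pt, weight );

   return;
}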

== Example setup of a small PROOF cluster ==

The purpose of this section is to give an example of how to set up a small number of PCs to act as a PROOF cluster. The configuration files discussed below can be found under: [[NYU-cluster-config]]

=== System configuration ===

I chose to run the PROOF service as a dedicated user, "xrootd". So on each machine I created a user called "xrootd", which has no home directory, and which is not a "login user". (So the '''xrootd''' user can't log into the PCs in the cluster.) The xrootd daemon is started under this user's authentication.

Then I created a directory called '''/home/proof''' on all the machines, and gave the xrootd user write permissions to it. I also created a directory called '''/home/proof/workdir''' on pcnyu01.cern.ch, which can be used as the "ProofWorkDir" directory of the SFrame-PROOF jobs. (This is a directory which is writable by the PROOF cluster, and readable by the user's code.)

Finally, I created a directory for the PROOF daemon's log files under '''/var/log/xrootd/'''. (The directory has to be writable by the xrootd user.)

=== PROOF cluster configuration files ===

The configuration files of the NYU CERN PROOF cluster are found under '''/afs/cern.ch/user/k/krasznaa/public/nyu-cluster/'''. Here I summarise what the different parts of the configuration files do.

The configuration is made up of two files:

  • '''proof.conf''': This is the simpler of the two. It lists all the worker nodes that should be started on the PCs of the cluster. You can define any number of worker processes for each machine, but it usually makes sense to define as many workers as a given machine has processor cores.
  • '''xrootd.cf''': This file sets up everything else. It defines how the various machines in the cluster should behave.

==== proof.conf ====

The role of this file is explained at http://root.cern.ch/drupal/content/proof-configuration-file-proofconf. The actual file can be found here: [[NYU-cluster-config#proof.conf]]

Each line in the file describes a different master or worker node. The configuration of the NYU cluster is currently:

master pcnyu01.cern.ch workdir=/home/proof

slave pcnyu01.cern.ch workdir=/home/proof
slave pcnyu01.cern.ch workdir=/home/proof
slave pcnyu01.cern.ch workdir=/home/proof
slave pcnyu01.cern.ch workdir=/home/proof

slave pcnyu02.cern.ch workdir=/home/proof
slave pcnyu02.cern.ch workdir=/home/proof
slave pcnyu02.cern.ch workdir=/home/proof
slave pcnyu02.cern.ch workdir=/home/proof

This means that the master machine is '''pcnyu01.cern.ch''', which also runs 4 worker processes. A second, similar machine, '''pcnyu02.cern.ch''', also runs 4 worker processes.

==== xrootd.cf ====

This file was put together from the information found at http://root.cern.ch/drupal/content/some-useful-configuration-use-cases and from private discussions with the PROOF developer. The actual file can be found here: [[NYU-cluster-config#xrootd.cf]]

The first two lines just load the library needed to serve files with xrootd, and set the port on which it will listen:

xrootd.fslib /usr/local/root/lib/root/libXrdOfs.so
xrd.port 1094

'''Note:''' You have to open port 1094 on each PC of the cluster, otherwise they won't be able to see each other's exported file systems.

In the next lines I specify which directories are to be served by xrootd:

all.export /home/proof r/w
if pcnyu01.cern.ch
all.export /pool0 r/o nomig
fi

This means that all PCs should export their '''/home/proof''' directory, which should be writable, and that the master PC should export its '''/pool0''' directory as a read-only space.

The next few lines set up the basic behavior of the PROOF server.

if exec xrootd
xrd.protocol xproofd:1093 /usr/local/root/lib/root/libXrdProofd.so
fi
xpd.rootsys /usr/local/root
xpd.workdir /home/proof/
xpd.intwait 20
xpd.resource static /afs/cern.ch/user/k/krasznaa/public/nyu-cluster/proof.conf wmx:-1 selopt:roundrobin
xpd.maxoldlogs 5
xpd.maxsessions 3

The lines set up the following things:

  • The local ROOT installation is under '''/usr/local/root'''. (All the nodes of the cluster have to be set up this way.)
  • The directory where PROOF should store files for each query is under '''/home/proof'''. You don't need to create a "proof" user for this; it just needs to be an empty directory.
  • The layout of the cluster is described in the '''proof.conf''' file discussed previously. The users are allowed to use all worker nodes for a single query. If the user doesn't need all the workers, the active workers are selected in a round-robin way.
  • The nodes should keep the log files of the last 5 queries of a given user.
  • Only 3 users should be connected at any given time.

The next little part sets the allowed roles of the PCs:

if pcnyu01.cern.ch
else
xpd.role worker
fi

This tells PROOF that pcnyu01 can be either a master or a worker (since it's running both kinds of processes), but all the other machines can only be workers.

The next line adds some small security.

xpd.allow pcnyu01.cern.ch

This line tells PROOF that only pcnyu01 should be allowed to act as a master for the nodes of the cluster.

=== Managing the xrootd daemon ===

Since the cluster is composed of machines running Scientific Linux 5, I use a separate script under '''/etc/init.d/''' to start/stop the daemon on all the nodes. The actual file can be found here: [[NYU-cluster-config#.2Fetc.2Finit.d.2Fxrootd]]

I just took this file from another Wiki page, and adapted it for my own setup. You should just set the location of your local ROOT installation, and the location of the configuration files. Also, you could set up a different location for the log files of the daemon.