nettee (NETwork TEE)
Version  0.3.5
25 NOV 2019
David Mathog  <mathog@caltech.edu>
License:  GPL 2

[ nettee is derived from dolly 0.58C by Felix Rauch <rauch@inf.ethz.ch>]

nettee is a network version of the Unix "tee" program.  It is normally used to
push data out through a distribution tree to network nodes.  If helper programs
are provided it may also be used to pull data in from a distribution tree, 
merging/processing it along the way.

See the man pages nettee.1 and nettee_cmd.3 for more info.

To compile, use cmake.  Inspect CMakeLists.txt and verify that it will install
to the directory you want.  If you want to build and install the example "stub"
programs, uncomment them (lines beginning with "###").  Then do:
  mkdir build
  cd build
  cmake ..
  make
  #
  make install
  
Uncomment the stub_child_process and stub_parent_process programs in CMakeLists.txt
if you want to build and install them.  (These are not generally useful - they are 
just example code.) 
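
The uncommenting step can be scripted.  The following is a minimal sketch,
assuming that stripping the leading "###" is all that is needed to enable
those lines.  The sample file contents are made-up placeholders, not the real
CMakeLists.txt.

```shell
#!/bin/sh
# Sketch only: the sample lines below are placeholders, not the real
# CMakeLists.txt shipped with nettee.
sample=$(mktemp)
cat >"$sample" <<'EOF'
project(example)
###add_executable(stub_child_process stub_child_process.c)
###add_executable(stub_parent_process stub_parent_process.c)
EOF

# Strip a leading "###" from every line that starts with it.
uncommented=$(sed 's/^###//' "$sample")
printf '%s\n' "$uncommented"
rm -f "$sample"
```

To try this on the real file, write the sed output to a new file and review it
before replacing CMakeLists.txt.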

The return status is "EXIT_SUCCESS" if there are no errors and "EXIT_FAILURE" if
any node fails in a manner that cannot be handled.
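
A calling script can branch on that status.  A minimal sketch, in which
report_status is a hypothetical helper and "true" and "false" stand in for a
real nettee command line:

```shell
#!/bin/sh
# report_status is a hypothetical helper: it runs any command and reports
# whether it exited with status 0 (EXIT_SUCCESS) or nonzero (failure).
report_status() {
    if "$@"; then
        echo "distribution succeeded"
    else
        echo "distribution failed"
        return 1
    fi
}

# "true" and "false" stand in for a real nettee invocation.
ok_msg=$(report_status true)
bad_status=0
bad_msg=$(report_status false) || bad_status=$?
echo "$ok_msg"
echo "$bad_msg (status $bad_status)"
```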

With version 0.2.0 a helper script "topology_info" is distributed.  It extracts
data from a topology.txt file that describes the local cluster's distribution tree.
This method is used by some of the other scripts here to set up a distribution tree.
The topology.txt file is placed in a directory readable on all nodes.  A different
topology file may be specified with the shell variable NTOPOLOGY.  The output
of topology_info may be fed into cut or extract to pull out the relevant information.
Examples:

% topology_info slave03
slave03.cluster internal slave04
% topology_info slave20
slave20.cluster tip none

To point to a different file, set NTOPOLOGY to its path before running topology_info.
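
Pulling one field out of that output might look like the sketch below.  It
assumes the fields are separated by single spaces and that the third field
names the next node in the chain ("none" at a tip).  Both points are inferred
from the two examples above, not from the topology_info documentation.
next_node is a hypothetical helper.

```shell
#!/bin/sh
# next_node is a hypothetical helper: it takes one line of topology_info
# output and prints the third space-separated field.
next_node() {
    printf '%s\n' "$1" | cut -d' ' -f3
}

# The two sample lines are copied from the examples above.
next_internal=$(next_node "slave03.cluster internal slave04")
next_tip=$(next_node "slave20.cluster tip none")
echo "$next_internal"
echo "$next_tip"
```

With topology_info on the PATH, the same cut command could be applied to its
output directly through a pipe.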

The pdist*.sh scripts are example wrappers that use nettee.  For instance,
with pdist_file.sh:

% pdist_file.sh pdist_store.sh  /tmp/foobar.txt <big_input_file
  
would copy big_input_file to every node in the distribution chain
contained in topology.txt and store it as /tmp/foobar.txt.

% pdist_shell.sh  /tmp/cmdfifo
  
sets up a fifo (cmdfifo) to which commands may be written.  From
there they will be passed to all other nodes on the chain and executed
there more or less in parallel.  Terminate by sending "EOS" to the fifo.
The environment variable NEXTNODE, if included in such a command, will
be translated correctly for each node.  Be sure to escape $ as needed
in the command strings!  On a Linux 2.6.8 test system with one master, 20
slaves in a chain, and a 100baseT switched network, this command

accudate ; echo "accudate" >>/tmp/cmdfifo

indicates that there is about a 0.007 s delay before the first node executes
and a 0.009 s delay before the last node executes.  The test nodes were synced
with ntp to the master node, which issued the above command.  The fifo is a more
efficient way to use nettee to distribute small files.  For example:

echo "nettee -next \$NEXTNODE -out wherever" >>/tmp/cmdfifo
nettee -root -in smallfile -next $FIRSTNODEINCHAIN

only suffers about a 0.01 s delay in setting up the chain.  Conversely, using
pdist_file.sh to distribute small files (one at a time) is slow due to
the rsh setup overhead.
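
The escaping and the "EOS" terminator can be tried locally without a cluster.
The sketch below is a simulation only: a background loop stands in for
pdist_shell.sh, and the temporary paths and the use of file descriptor 3 are
choices made for this example.  It shows that \$NEXTNODE inside double quotes,
or $NEXTNODE inside single quotes, reaches the fifo unexpanded.

```shell
#!/bin/sh
# Local simulation only: the background reader stands in for pdist_shell.sh.
workdir=$(mktemp -d)
fifo="$workdir/cmdfifo"
log="$workdir/received.log"
mkfifo "$fifo"

# Stand-in reader: record each line until the "EOS" terminator arrives.
(
    while IFS= read -r line; do
        [ "$line" = "EOS" ] && break
        printf '%s\n' "$line" >>"$log"
    done <"$fifo"
) &
reader_pid=$!

# Writer: keep the fifo open on fd 3 so the stand-in reader does not see
# end-of-file between commands.
exec 3>"$fifo"
echo "nettee -next \$NEXTNODE -out wherever" >&3   # backslash keeps $ literal
echo 'echo $NEXTNODE' >&3                           # single quotes also work
echo "EOS" >&3
exec 3>&-
wait "$reader_pid"

received=$(cat "$log")
printf '%s\n' "$received"
rm -rf "$workdir"
```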

execinput, accudate, and extract are referred to in the nettee
documentation and/or used by these scripts.  Their source code is
available here:

  https://sourceforge.net/projects/drmtools/


Instructions for using nettee with SystemImager are
in the files SI_METHOD1.TXT and SI_METHOD2.TXT. 

WARNING:  On AMD Athlon processors (the 32 bit ones, not the Athlon64)
the "athcool" program, if active, will degrade throughput
significantly wherever nettee must read from the net and then write
to disk. When transferring large files on these machines it is
best to first do "athcool off", then transfer, then "athcool on".
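
The off/transfer/on sequence can be wrapped so that athcool is turned back on
even when the transfer fails.  A minimal sketch: with_athcool_off and the
ATHCOOL override variable are hypothetical names introduced here, and the
demonstration substitutes echo for athcool so that nothing is changed on the
machine.

```shell
#!/bin/sh
# with_athcool_off is a hypothetical wrapper.  ATHCOOL is a hypothetical
# override that names the program to call; it defaults to athcool.
with_athcool_off() {
    "${ATHCOOL:-athcool}" off || return 1
    "$@"                        # the transfer command
    rc=$?
    "${ATHCOOL:-athcool}" on    # always switch it back on
    return "$rc"
}

# Dry run: echo stands in for athcool and for the transfer itself.
ATHCOOL=echo
with_athcool_off echo "transfer would run here"
```

For real use, leave ATHCOOL unset and pass the actual transfer command as the
arguments.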


Please send comments to the email address above.

Files in distribution:

beowulf.master           Cluster imaging script for use with SystemImager and nettee
LICENSE                  GPL 2 license
CMakeLists.txt           CMake build file
nettee.1                 nettee man page
nettee.html              nettee html documentation
nettee.c                 nettee source code
nettee_cmd.3             nettee -process interface man page
nettee_cmd.html          nettee -process interface html documentation
nio.c                    nettee cmd library source
nio.h                    nettee cmd library header
pdist_copy.sh            example script for copying files with nettee
pdist_file.sh            example script for copying files with nettee
pdist_gunzip_detar.sh    helper script for pdist_file.sh
pdist_shell.sh           example script for distributing commands with nettee
pdist_store.sh           helper script for pdist_file.sh
rb.c                     ring buffer library source
rb.h                     ring buffer header
README.TXT               This file
stub_child_process.c     Example child program
stub_parent_process.c    Example parent program (runs child)
SI_METHOD1.TXT           Instructions for using nettee with SystemImager
SI_METHOD2.TXT           Another way of using nettee with SystemImager
topology_info            example topology script
topology.txt             example data for topology script
test_stub_child_process.sh
                         Runs stub_child_process.


Change log:

0.3.5  Modified for SourceForge distribution, fixed minor documentation
and code issues, added cmake build instructions, added example parent/child/script
which were previously only in the man page.  Updated beowulf.master to the one
used with a 64 bit SystemImager to load CentOS 7.

0.3.4  Added -branch.  Fixed a bug in error reporting from child nodes,
and another triggered when multiple host lists were used and the first failed
to connect to a child node.

0.3.2  fixed two bugs (one in nio.c, one in nio.h) which broke the -process
method.

0.3.1  changed all shutdown() to close().  Added -burst, -sync, and
time stamp options.

0.2.0  Numerous changes.  Added ring buffers, -process option, more
continue on options, -flow (push or pull) and others.

0.1.8  Added -connf flag and failovers for -next.  Modified formatting
of messages slightly.

0.1.7  Minor change to get a clean compilation on solaris.

0.1.6  Added nettee.spec file from Dag Wieers.  The RPM page for nettee, for
the OSes maintained there, is:

  http://dag.wieers.com/packages/nettee/
  
Modified error handling on bad count.  Previously if node K+1 returned
a bad byte count node K would return the bad byte count, and the error
condition would be propagated upward in that manner.  In this version
node K returns its own byte count but sets an error flag that propagates
upward instead.  Also the CONWF, COLWF, etc error announcements now include
the name of the preceding node if that node generated the error.
The general idea is to allow the distribution chain to continue
functioning when certain types of errors occur on individual nodes.
Theoretically these error messages could be piped into something like
"socket" or "logger" and then logged to a central server.  Without
that they would need to be stored some place like /var/run/nettee_messages.txt
on each node and processed in a post mortem.  If -v 1 is used and no
errors occur this would generate no traffic, so it shouldn't interfere
with throughput if everything is working.

Added changes to handle compilation warnings noticed by Tru Huynh on
an X86_64 machine.

Fixed a bug in beowulf.master which sometimes resulted in the end node
not being able to set its HOSTNAME.

Source: README.TXT, updated 2019-11-25