From: Anthony S. <sc...@gm...> - 2012-05-14 19:51:37
Hi Johann,

Thanks for bringing this up. I believe I have determined that this is not a
PyTables / pthreads issue. Profiling with npoints=1000000, I found that most
of the time (97%) was being spent in the sum() call (see below), and this
ratio doesn't change much with other values of npoints. Since there is no
implicit parallelism here, I would recommend using numpy.sum() instead of
Python's built-in sum() (a sketch of this change is appended at the very
bottom of this message, below your quoted code). I hope this helps. If you
need other tips on speeding up the sum operation, please let us know.

Be well,
Anthony

Timer unit: 1e-06 s

File: pytables_expr_test.py
Function: fn at line 66
Total time: 1.63254 s

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    66                                           def fn(p, h5table):
    67                                               '''
    68                                               actual function we are going to minimize. It consists of
    69                                               the pytables Table object and a list of parameters.
    70                                               '''
    71         1           14     14.0      0.0      uv = h5table.colinstances
    72
    73                                               # store parameters in a dict object with names
    74                                               # like p0, p1, p2, etc. so they can be used in
    75                                               # the Expr object.
    76         4           21      5.2      0.0      for i in xrange(len(p)):
    77         3           19      6.3      0.0          k = 'p'+str(i)
    78         3           14      4.7      0.0          uv[k] = p[i]
    79
    80                                               # systematic shift on b is a polynomial in a
    81         1            4      4.0      0.0      db = 'p0 * a*a + p1 * a + p2'
    82
    83                                               # the element-wise function
    84         1            6      6.0      0.0      fn_str = '(a - (b + %s))**2' % db
    85
    86         1        16427  16427.0      1.0      expr = Expr(fn_str,uservars=uv)
    87         1        21438  21438.0      1.3      expr.eval()
    88
    89                                               # returning the "sum of squares"
    90         1      1594600 1594600.0     97.7      return sum(expr)

On Mon, May 14, 2012 at 1:59 PM, Johann Goetz <jg...@uc...> wrote:
> SHORT VERSION:
>
> Please take a look at the fn() function in the attached file (pasted
> below). When I run this with 10M events or more, I notice that the total
> CPU usage never goes above the percentage I get using single-threaded
> eval(). Am I at some other limit, or can I improve performance by doing
> something else?
>
> LONG VERSION:
>
> I have been trying to use the tables.Expr object to speed up a
> sophisticated calculation over an entire dataset (a pytables Table
> object). The calculation took so long that I had to create a simple
> example to make sure I knew what I was doing. I apologize in advance for
> the lengthy code below, but I wanted the example to mimic exactly what
> I'm trying to do and to be totally self-contained.
>
> I have attached a file (and pasted it below) in which I create an hdf5
> file with a single large Table of two columns. As you can see, I'm not
> worried about writing speed at all - I'm concerned about read speed.
>
> I would like to draw your attention to the fn() function. This is where I
> evaluate a "chi-squared" value on the dataset. My strategy is to populate
> the "h5table.colinstances" dict object with several parameters, which I
> call p0, p1, etc., and then create the Expr object using these and the
> column names from the Table.
>
> If I create 10M rows (a 77 MB file) in the Table (with the commands
> below), the evaluation seems to be CPU bound (one of my cores is at 100%
> - the others are idle) and it takes about 7 seconds (about 10 MB/s).
> Similarly, I get about 70 seconds for 100M events.
>
> python pytables_expr_test.py 10000000
> python pytables_expr_test.py 100000000
>
> So my question: it seems to me that I am not fully using the CPU power
> available on my computer (see next paragraph). Am I missing something or
> doing something wrong in the fn() function below?
>
> A few side notes: my hard disk is capable of over 200 MB/s in sequential
> reading (sustained, and tested with large files using the iozone
> program). I have two 4-core CPUs on this machine, but the total CPU usage
> during eval() never goes above the percentage I get using single-threaded
> mode with "numexpr.set_num_threads(1)".
>
> I am using pytables 2.3.1 and numexpr 2.0.1.
>
> --
> Johann T. Goetz, PhD. <http://sites.google.com/site/theodoregoetz/>
> jg...@uc...
> Nefkens Group, UCLA Dept. of Physics & Astronomy
> Hall-B, Jefferson Lab, Newport News, VA
>
>
> ### BEGIN file: pytables_expr_test.py
>
> from tables import openFile, Expr
>
> ### Control of the number of threads used when issuing the
> ### Expr::eval() command
> #import numexpr
> #numexpr.set_num_threads(2)
>
> def create_ntuple_file(filename, npoints, pmodel):
>     '''
>     create an hdf5 file with a single table which contains
>     npoints number of rows of type row_t (defined below)
>     '''
>     from numpy import random, poly1d
>     from tables import IsDescription, Float32Col
>
>     class row_t(IsDescription):
>         '''
>         the rows of the table to be created
>         '''
>         a = Float32Col()
>         b = Float32Col()
>
>     def append_row(h5row, pmodel):
>         '''
>         consider this a single "event" being appended
>         to the dataset (table)
>         '''
>         h5row['a'] = random.uniform(0,10)
>
>         h5row['b'] = h5row['a']                               # reality (or model)
>         h5row['b'] = h5row['b'] - poly1d(pmodel)(h5row['a'])  # systematics
>         h5row['b'] = h5row['b'] + random.normal(0,0.1)        # noise
>
>         h5row.append()
>
>     h5file = openFile(filename, 'w')
>     h5table = h5file.createTable('/', 'table', row_t, "Data")
>     h5row = h5table.row
>
>     # recording data to file...
>     for n in xrange(npoints):
>         append_row(h5row, pmodel)
>
>     h5file.close()
>
> def create_ntuple_file_if_needed(filename, npoints, pmodel):
>     '''
>     looks to see if the file is already there and if so,
>     it makes sure its the right size. Otherwise, it
>     removes the existing file and creates a new one.
>     '''
>     from os import path, remove
>
>     print 'model parameters:', pmodel
>
>     if path.exists(filename):
>         h5file = openFile(filename, 'r')
>         h5table = h5file.root.table
>         if len(h5table) != npoints:
>             h5file.close()
>             remove(filename)
>
>     if not path.exists(filename):
>         create_ntuple_file(filename, npoints, pmodel)
>
> def fn(p, h5table):
>     '''
>     actual function we are going to minimize. It consists of
>     the pytables Table object and a list of parameters.
>     '''
>     uv = h5table.colinstances
>
>     # store parameters in a dict object with names
>     # like p0, p1, p2, etc. so they can be used in
>     # the Expr object.
>     for i in xrange(len(p)):
>         k = 'p'+str(i)
>         uv[k] = p[i]
>
>     # systematic shift on b is a polynomial in a
>     db = 'p0 * a*a + p1 * a + p2'
>
>     # the element-wise function
>     fn_str = '(a - (b + %s))**2' % db
>
>     expr = Expr(fn_str,uservars=uv)
>     expr.eval()
>
>     # returning the "sum of squares"
>     return sum(expr)
>
> if __name__ == '__main__':
>     '''
>     usage:
>         python pytables_expr_test.py [npoints]
>
>     Hint: try this with 10M points
>     '''
>     from sys import argv
>     from time import time
>
>     npoints = 1000000
>     if len(argv) > 1:
>         npoints = int(argv[1])
>
>     filename = 'tmp.'+str(npoints)+'.hdf5'
>
>     pmodel = [-0.04,0.002,0.001]
>
>     print 'creating file (if it doesn\'t exist)...'
>     create_ntuple_file_if_needed(filename, npoints, pmodel)
>
>     h5file = openFile(filename, 'r')
>     h5table = h5file.root.table
>
>     print 'evaluating function'
>     starttime = time()
>     print fn([0.,0.,0.], h5table)
>     print 'evaluated file in',time()-starttime,'seconds.'
>
> #EOF
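
P.S. Here is a rough, untested sketch of the numpy.sum() change I have in
mind. It is just your fn() with the last line changed: since your script
does not set an output for the expression, expr.eval() returns the evaluated
result as an ordinary in-memory numpy array, so you can hand that array
straight to numpy.sum() rather than looping over the Expr object row by row
with Python's built-in sum():

import numpy
from tables import Expr

def fn(p, h5table):
    '''
    same chi-squared function as before, but the reduction is done
    with numpy.sum() on the array returned by Expr.eval() rather than
    by iterating the Expr object with Python's built-in sum()
    '''
    uv = h5table.colinstances

    # store parameters as p0, p1, p2, ... so they can be used
    # in the Expr object
    for i in xrange(len(p)):
        uv['p'+str(i)] = p[i]

    # systematic shift on b is a polynomial in a
    db = 'p0 * a*a + p1 * a + p2'

    # the element-wise function
    fn_str = '(a - (b + %s))**2' % db

    expr = Expr(fn_str, uservars=uv)

    # eval() materializes the result as a numpy array;
    # summing that array avoids the per-row Python loop
    return numpy.sum(expr.eval())

The key point is simply that the reduction happens inside numpy rather than
in a Python-level loop over the Expr object.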