From: Anthony S. <sc...@gm...> - 2012-05-14 19:51:37
Hi Johann,

Thanks for bringing this up. I believe I have determined that this is not a
PyTables / pthreads issue. Profiling with npoints=1000000, I found that most
of the time (97%) was being spent in the sum() call (see below), and this
ratio doesn't change much with other values of npoints. Since there is no
implicit parallelism here, I would recommend using numpy.sum() instead of
Python's built-in sum() (a sketch of this change is appended at the very
bottom of this message, below your quoted code). I hope this helps. If you
need other tips on speeding up the sum operation, please let us know.

Be well,
Anthony

Timer unit: 1e-06 s

File: pytables_expr_test.py
Function: fn at line 66
Total time: 1.63254 s

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    66                                           def fn(p, h5table):
    67                                               '''
    68                                               actual function we are going to minimize. It consists of
    69                                               the pytables Table object and a list of parameters.
    70                                               '''
    71         1           14     14.0      0.0      uv = h5table.colinstances
    72
    73                                               # store parameters in a dict object with names
    74                                               # like p0, p1, p2, etc. so they can be used in
    75                                               # the Expr object.
    76         4           21      5.2      0.0      for i in xrange(len(p)):
    77         3           19      6.3      0.0          k = 'p'+str(i)
    78         3           14      4.7      0.0          uv[k] = p[i]
    79
    80                                               # systematic shift on b is a polynomial in a
    81         1            4      4.0      0.0      db = 'p0 * a*a + p1 * a + p2'
    82
    83                                               # the element-wise function
    84         1            6      6.0      0.0      fn_str = '(a - (b + %s))**2' % db
    85
    86         1        16427  16427.0      1.0      expr = Expr(fn_str,uservars=uv)
    87         1        21438  21438.0      1.3      expr.eval()
    88
    89                                               # returning the "sum of squares"
    90         1      1594600 1594600.0     97.7      return sum(expr)

On Mon, May 14, 2012 at 1:59 PM, Johann Goetz <jg...@uc...> wrote:
> SHORT VERSION:
>
> Please take a look at the fn() function in the attached file (pasted
> below). When I run this with 10M events or more, I notice that the total
> CPU usage never goes above the percentage I get using single-threaded
> eval(). Am I at some other limit, or can I improve performance by doing
> something else?
>
> LONG VERSION:
>
> I have been trying to use the tables.Expr object to speed up a
> sophisticated calculation over an entire dataset (a pytables Table
> object). The calculation took so long that I had to create a simple
> example to make sure I knew what I was doing. I apologize in advance for
> the lengthy code below, but I wanted the example to mimic exactly what
> I'm trying to do and to be totally self-contained.
>
> I have attached a file (and pasted it below) in which I create an hdf5
> file with a single large Table of two columns. As you can see, I'm not
> worried about writing speed at all - I'm concerned about read speed.
>
> I would like to draw your attention to the fn() function. This is where I
> evaluate a "chi-squared" value on the dataset. My strategy is to populate
> the "h5table.colinstances" dict object with several parameters, which I
> call p0, p1, etc., and then create the Expr object using these and the
> column names from the Table.
>
> If I create 10M rows (a 77 MB file) in the Table (with the commands
> below), the evaluation seems to be CPU bound (one of my cores is at 100%
> - the others are idle) and it takes about 7 seconds (about 10 MB/s).
> Similarly, I get about 70 seconds for 100M events.
>
> python pytables_expr_test.py 10000000
> python pytables_expr_test.py 100000000
>
> So my question: it seems to me that I am not fully using the CPU power
> available on my computer (see next paragraph). Am I missing something or
> doing something wrong in the fn() function below?
>
> A few side notes: my hard disk is capable of over 200 MB/s in sequential
> reading (sustained, and tested with large files using the iozone
> program). I have two 4-core CPUs on this machine, but the total CPU usage
> during eval() never goes above the percentage I get using single-threaded
> mode with "numexpr.set_num_threads(1)".
>
> I am using pytables 2.3.1 and numexpr 2.0.1.
>
> --
> Johann T. Goetz, PhD. <http://sites.google.com/site/theodoregoetz/>
> jg...@uc...
> Nefkens Group, UCLA Dept. of Physics & Astronomy
> Hall-B, Jefferson Lab, Newport News, VA
>
>
> ### BEGIN file: pytables_expr_test.py
>
> from tables import openFile, Expr
>
> ### Control of the number of threads used when issuing the
> ### Expr::eval() command
> #import numexpr
> #numexpr.set_num_threads(2)
>
> def create_ntuple_file(filename, npoints, pmodel):
>     '''
>     create an hdf5 file with a single table which contains
>     npoints number of rows of type row_t (defined below)
>     '''
>     from numpy import random, poly1d
>     from tables import IsDescription, Float32Col
>
>     class row_t(IsDescription):
>         '''
>         the rows of the table to be created
>         '''
>         a = Float32Col()
>         b = Float32Col()
>
>     def append_row(h5row, pmodel):
>         '''
>         consider this a single "event" being appended
>         to the dataset (table)
>         '''
>         h5row['a'] = random.uniform(0,10)
>
>         h5row['b'] = h5row['a']                               # reality (or model)
>         h5row['b'] = h5row['b'] - poly1d(pmodel)(h5row['a'])  # systematics
>         h5row['b'] = h5row['b'] + random.normal(0,0.1)        # noise
>
>         h5row.append()
>
>     h5file = openFile(filename, 'w')
>     h5table = h5file.createTable('/', 'table', row_t, "Data")
>     h5row = h5table.row
>
>     # recording data to file...
>     for n in xrange(npoints):
>         append_row(h5row, pmodel)
>
>     h5file.close()
>
> def create_ntuple_file_if_needed(filename, npoints, pmodel):
>     '''
>     looks to see if the file is already there and if so,
>     it makes sure its the right size. Otherwise, it
>     removes the existing file and creates a new one.
>     '''
>     from os import path, remove
>
>     print 'model parameters:', pmodel
>
>     if path.exists(filename):
>         h5file = openFile(filename, 'r')
>         h5table = h5file.root.table
>         if len(h5table) != npoints:
>             h5file.close()
>             remove(filename)
>
>     if not path.exists(filename):
>         create_ntuple_file(filename, npoints, pmodel)
>
> def fn(p, h5table):
>     '''
>     actual function we are going to minimize. It consists of
>     the pytables Table object and a list of parameters.
>     '''
>     uv = h5table.colinstances
>
>     # store parameters in a dict object with names
>     # like p0, p1, p2, etc. so they can be used in
>     # the Expr object.
>     for i in xrange(len(p)):
>         k = 'p'+str(i)
>         uv[k] = p[i]
>
>     # systematic shift on b is a polynomial in a
>     db = 'p0 * a*a + p1 * a + p2'
>
>     # the element-wise function
>     fn_str = '(a - (b + %s))**2' % db
>
>     expr = Expr(fn_str,uservars=uv)
>     expr.eval()
>
>     # returning the "sum of squares"
>     return sum(expr)
>
> if __name__ == '__main__':
>     '''
>     usage:
>         python pytables_expr_test.py [npoints]
>
>     Hint: try this with 10M points
>     '''
>     from sys import argv
>     from time import time
>
>     npoints = 1000000
>     if len(argv) > 1:
>         npoints = int(argv[1])
>
>     filename = 'tmp.'+str(npoints)+'.hdf5'
>
>     pmodel = [-0.04,0.002,0.001]
>
>     print 'creating file (if it doesn\'t exist)...'
>     create_ntuple_file_if_needed(filename, npoints, pmodel)
>
>     h5file = openFile(filename, 'r')
>     h5table = h5file.root.table
>
>     print 'evaluating function'
>     starttime = time()
>     print fn([0.,0.,0.], h5table)
>     print 'evaluated file in',time()-starttime,'seconds.'
>
> #EOF
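
P.S. Here is a rough, untested sketch of the numpy.sum() change I have in
mind. It is just your fn() with the last line changed: since your script
does not set an output for the expression, expr.eval() returns the evaluated
result as an ordinary in-memory numpy array, so you can hand that array
straight to numpy.sum() rather than looping over the Expr object row by row
with Python's built-in sum():

import numpy
from tables import Expr

def fn(p, h5table):
    '''
    same chi-squared function as before, but the reduction is done
    with numpy.sum() on the array returned by Expr.eval() rather than
    by iterating the Expr object with Python's built-in sum()
    '''
    uv = h5table.colinstances

    # store parameters as p0, p1, p2, ... so they can be used
    # in the Expr object
    for i in xrange(len(p)):
        uv['p'+str(i)] = p[i]

    # systematic shift on b is a polynomial in a
    db = 'p0 * a*a + p1 * a + p2'

    # the element-wise function
    fn_str = '(a - (b + %s))**2' % db

    expr = Expr(fn_str, uservars=uv)

    # eval() materializes the result as a numpy array;
    # summing that array avoids the per-row Python loop
    return numpy.sum(expr.eval())

The key point is simply that the reduction happens inside numpy rather than
in a Python-level loop over the Expr object.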