From: Francesco D. D. <ke...@li...> - 2005-09-02 19:39:18
Francesc Altet wrote:

>Hi Francesco,
>
>This problem is related to the slowness of element-by-element assignment
>in numarray objects. If you want to achieve high performance when writing
>to PyTables, it is better to use the Table.append method (instead of
>Row.append).
>
>I normally use the following code:
>
>    def fill_arrays(self, start, stop):
>        "Some generic filling function"
>        arr_f8 = numarray.arange(start, stop, type=numarray.Float64)
>        arr_i4 = numarray.arange(start, stop, type=numarray.Int32)
>        if self.userandom:
>            arr_f8 += random_array.normal(0, stop*self.scale,
>                                          shape=[stop-start])
>            arr_i4 = numarray.array(arr_f8, type=numarray.Int32)
>        return arr_i4, arr_f8
>
>    def fill_table(self, con):
>        "Fills the table"
>        table = con.root.table
>        j = 0
>        for i in xrange(0, self.nrows, self.step):
>            stop = (j+1)*self.step
>            if stop > self.nrows:
>                stop = self.nrows
>            arr_i4, arr_f8 = self.fill_arrays(i, stop)
>            recarr = records.fromarrays([arr_i4, arr_f8])
>            table.append(recarr)
>            j += 1
>        table.flush()
>
>in order to fill a table with two columns (Int32 and Float64).
>
>If you try this, I'm sure you will get much better results.
>

Yes, if I build a numarray array for each column, then build a recarray
and append it all at once, performance increases by a factor of 10, but
only when I use numeric values. For a table composed of 2 columns, an
integer and a float, I've reached 367 Krows/s... very good!

But performance on CharArrays is very poor compared to numeric ones.
Adding a CharArray of 10^6 elements of 1 byte each drops the performance
to 100 Krows/s, and adding two 1-byte CharArrays drops it to 30 Krows/s.
It also seems to be independent of the maximum string length.

I'm building the arrays from lists with:

    numarray.array(list, shape=n of Rows, type=type of Row)

for numeric values, and

    numarray.strings.array(list of strings, shape = n of Rows, itemsize = maxLength of row)

for strings.

Is it a memory move issue?

Thanks,
FrancescoDD
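
For readers without numarray at hand, the column-wise, chunked filling pattern discussed in this message can be sketched with the Python standard library alone. This is only an illustration under stated assumptions: array.array and a bytes buffer stand in for numarray numeric arrays and a fixed-itemsize CharArray, the helper names (chunk_bounds, pack_fixed_width, build_chunk) are hypothetical, and nothing here calls PyTables or reproduces the timings quoted above.

```python
from array import array


def chunk_bounds(nrows, step):
    """Yield (start, stop) pairs that cover range(nrows) in blocks of at most `step` rows."""
    for start in range(0, nrows, step):
        yield start, min(start + step, nrows)


def pack_fixed_width(strings, itemsize):
    """Pack ASCII strings into one contiguous buffer.

    Each string is truncated or NUL-padded to exactly `itemsize` bytes,
    which is roughly what a fixed-itemsize character column looks like.
    """
    return b"".join(
        s.encode("ascii")[:itemsize].ljust(itemsize, b"\x00") for s in strings
    )


def build_chunk(start, stop, itemsize):
    """Build one block of three columns: integer, float, fixed-width string."""
    ints = array("i", range(start, stop))                         # stand-in for an Int32 column
    floats = array("d", (float(v) for v in range(start, stop)))   # stand-in for a Float64 column
    chars = pack_fixed_width(["x"] * (stop - start), itemsize)    # stand-in for a CharArray column
    return ints, floats, chars


if __name__ == "__main__":
    # In the real code, each chunk would be turned into a record array
    # and passed to a single Table.append call instead of being printed.
    for start, stop in chunk_bounds(10, 4):
        ints, floats, chars = build_chunk(start, stop, itemsize=1)
        print(start, stop, len(ints), len(floats), len(chars))
```

The numeric columns are filled in one call each, while the string column has to handle every string individually (encode, truncate, pad). The sketch makes that difference visible but does not establish that it explains the slowdown reported above.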