From: Andrew S. <str...@as...> - 2004-06-22 00:04:25
Francesc Alted wrote:

> On Saturday 19 June 2004 02:49, Andrew Straw wrote:
>
>> I am trying to save realtime video (640x480x100 fps uint8 grayscale
>> data) using PyTables for scientific purposes (lossy compression is
>> bad). So, first off, is using PyTables for this task a reasonable
>> idea? Although I'm no expert, it seems the compression algorithms that
>> PyTables offers may be ideal. It also may be nice to use HDF5 to
>> incorporate some data.
>
> I've never thought of such an application for PyTables, but I think that
> for your case (provided that you can't afford losing information) it may
> be just fine.
>
>> Using this code, I get approximately 4 MB/sec with no compression, and
>> MB/sec with complevel=1 UCL. This is with an XFS filesystem on Linux.
>
> Mmm... How much using UCL? Anyway, you may want to try LZO and ZLIB
> (with different compression levels) as well in order to see if this
> improves the speed.

Sorry, the process never completed while I was writing that email. Playing
around with hdparm, I can now get ~6.5 MB/sec with no compression; UCL,
LZO, and zlib all reduce that rate.

>> So, are there any suggestions for getting this to run faster?
>
> A couple:
>
> 1.- Ensure that your bottleneck is really the call to the .append()
> method by commenting it out and doing the timings again.

Actually, I'm timing purely the call to .append(), which often takes
seconds.

> 2.- The EArray.append() method does many checks to ensure that you pass
> an object compatible with the EArray being saved. If you are going to
> pass a *NumArray* object that you are sure is compliant with the
> underlying EArray shape, you can save quite a bit of time by calling
> ._append(numarrayObject) instead of .append(numarrayObject).
>
> If suggestion 2 is not enough (although I doubt it), things can be
> further sped up by optimizing the number of calls to the underlying HDF5
> library. However, this must be regarded as a commercial service only
> (but you can always do it yourself, of course!).

That does help a little... Anyhow, I think using PyTables/HDF5 is too slow
for this task -- I can easily save at ~50 MB/sec using .write() on plain
Python file objects, so I'll use that for now.

Finally, as a suggestion, you may want to incorporate the following code
into the C source for PyTables, which will allow other Python threads to
continue running while long-running HDF5 tasks are being performed. See
http://docs.python.org/api/threads.html for more information.

    PyThreadState *_save;

    _save = PyEval_SaveThread();
    /* Do work accessing the HDF5 library that does not touch the
       Python API */
    PyEval_RestoreThread(_save);

(The file_write function in Objects/fileobject.c in the Python source code
uses the Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS macros to achieve
the same thing, but due to the error handling in PyTables you'll probably
need two copies of "PyEval_RestoreThread(_save)": one in the normal return
path and one in the error handling path.)
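
P.S. For anyone trying the same thing, here is a stripped-down sketch of
the kind of EArray setup I'm describing. It is illustrative only: the
file name and the zeroed frame are placeholders, and the exact Atom /
createEArray signatures vary between PyTables versions.

    import numarray
    import tables

    # One extendible array holding all frames; the 0 in the shape marks
    # the dimension that grows with each append.  complib can be "zlib",
    # "lzo" or "ucl", as discussed above.
    h5file = tables.openFile("frames.h5", mode="w")
    filters = tables.Filters(complevel=1, complib="lzo")
    atom = tables.UInt8Atom(shape=(0, 480, 640))
    frames = h5file.createEArray(h5file.root, "frames", atom,
                                 "raw video frames", filters=filters)

    # Stand-in for a grabbed frame.
    frame = numarray.zeros((1, 480, 640), type=numarray.UInt8)

    frames.append(frame)     # the call being timed above
    # frames._append(frame)  # skips the consistency checks (suggestion 2)

    h5file.close()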
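The timings I mention above isolate just the .append() call, roughly like
this (a sketch; earray and frame come from whatever acquisition loop is
being used):

    import time

    def timed_append(earray, frame):
        # Time only the PyTables .append() call, not frame acquisition.
        t0 = time.time()
        earray.append(frame)
        return time.time() - t0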
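And the plain-file fallback I'm switching to amounts to little more than
this (again a sketch; grab_frame() is a stand-in for the real acquisition
call, and the frame shape has to be recorded separately to read the data
back):

    import numarray

    def grab_frame():
        # Stand-in for the real frame-acquisition call.
        return numarray.zeros((480, 640), type=numarray.UInt8)

    out = open("frames.raw", "wb")
    for i in range(100):
        frame = grab_frame()
        out.write(frame.tostring())   # raw bytes, no HDF5 overhead
    out.close()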