From: Francesco D. D. <ke...@li...> - 2005-10-08 08:08:39
|
Hi, I have a question: under certain circumstances I can run into corruption of an .h5 file, for instance when the machine crashes, the power fails, etc., causing the loss of all data (I am working with multi-gigabyte files).

Perhaps this is an HDF5 question, but I would like to know how you manage this in practice. Some considerations:

1) Copying/rsyncing the file before writing is possible, but it is very slow on big files.
2) Truncating the file back to its previous size before writing does not solve the problem; the HDF5 file remains corrupted (the header is changed: HDF5 is not a pure append-only format).

Is it possible to implement a recovery/checkpoint system? I have noticed that HDF5 uses some headers in the file (http://hdf.ncsa.uiuc.edu/HDF5/doc/H5.format.html). If I save those headers before writing (I would need to study the HDF5 file format), for instance in a master register/file, could I achieve recovery?

I would like to write an hdf5/pytables extension to recover corrupted files; in your opinion, is it wasted work? Obviously this is only possible if I append data, because if I change rows I also need to track row changes in the file (a more difficult task).

Thanks for your work,
Francesco |
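[Editor's note: the header-backup idea above can be sketched in plain Python. This is only an illustration: how much of the front of the file is worth saving, and whether restoring it actually yields a readable file, depend on HDF5 internals that the replies below discuss. The 4096-byte figure and the file names are arbitrary assumptions, not HDF5 constants.]

```python
import os

HEADER_BYTES = 4096  # arbitrary guess at "enough of the front of the file"; NOT an HDF5 constant


def backup_header(h5_path, backup_path):
    """Save the first HEADER_BYTES of the file before a write session."""
    with open(h5_path, "rb") as f:
        header = f.read(HEADER_BYTES)
    with open(backup_path, "wb") as f:
        f.write(header)
    return len(header)


def restore_header(h5_path, backup_path):
    """Overwrite the front of the (possibly corrupted) file with the saved bytes."""
    with open(backup_path, "rb") as f:
        header = f.read()
    with open(h5_path, "r+b") as f:
        f.write(header)
```

Note that after restoring, data appended past the saved region may simply be invisible to the library, and a crash that damaged metadata blocks elsewhere in the file is not repaired at all; that is precisely the difficulty raised in the replies.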
From: Norbert N. <Nor...@gm...> - 2005-10-08 09:15:44
|
Hi there, I did some web research on this topic a little while ago, but I could not find any details about it. As far as I can tell, the HDF5 library does not take any precautions against power failures. Modern database systems usually have concepts such as "atomic operations" and "journalling" that protect the data against corruption. HDF5 does not provide this, so PyTables is out of luck: there is no perfectly safe way of handling an HDF5 file.

What I usually do is limit the chance of breakage by doing all write operations at once at the end of a big calculation loop and flushing the data immediately. Of course, this only makes sense for programs (like mine) where outputting data takes a negligible fraction of the total computation time; the chance that the program dies in that short interval is then negligible.

Alternatively, it might be possible to select the data structures carefully, in such a way that a write operation never changes the control data in the file (i.e. no dynamic data structures, no growing tables, etc.). That way it should be possible to write the data so that the file is in a sane condition at any point in time, no matter where an operation breaks off half-way. Data compression should probably be avoided in such a setup.

Of course, either way you throw away much of the power of PyTables, but unfortunately that is the price you have to pay for safety.

Greetings,
Norbert |
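[Editor's note: the "write all at once, then flush immediately" pattern Norbert describes might look roughly like this in PyTables. This is a sketch using the modern PyTables API spelling (not the 2005-era `openFile` names); the file path, table name, and `Result` description are made up for illustration.]

```python
import tables


class Result(tables.IsDescription):
    step = tables.Int64Col()
    value = tables.Float64Col()


def run_simulation(path, n_steps):
    # Accumulate results in memory during the long computation...
    results = [(i, i * 0.5) for i in range(n_steps)]  # stand-in for the real work

    # ...then open, append everything in one burst, flush, and close right away,
    # so the window in which a crash can corrupt the file stays small.
    with tables.open_file(path, mode="a") as h5:
        if "/results" not in h5:
            table = h5.create_table("/", "results", Result)
        else:
            table = h5.root.results
        table.append(results)
        table.flush()  # push PyTables row buffers down to the HDF5 layer
        h5.flush()     # push HDF5 caches out to the operating system
```

The point is only to shrink the vulnerable window, not to eliminate it; a crash during the short open/append/flush burst can still leave the file corrupted.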
From: Francesc A. <fa...@ca...> - 2005-10-10 18:54:33
|
Hi Francesco,

On Saturday 08 October 2005 10:06, Francesco Del Degan wrote:
> I would like to write some hdf5/pytables extension to recovery corrupted
> files, in your opinion it's wasted work?

Well, I can't tell you exactly whether this is hard or not, because I simply don't know much about the intricacies of HDF5 in this regard. You should look at:

https://sourceforge.net/mailarchive/forum.php?forum_id=13760&max_rows=25&style=nested&viewmonth=200409

for some messages exchanged about this issue back in September 2004. If you read them, you will see that the main problem is the existence of caches in the HDF5 library for both data and metadata. Those caches can be disabled to improve the chances that the file does not get corrupted, but at the cost of speed. In any case, Norbert Nemec's recommendation of flushing frequently is good, though of course you shouldn't expect too much from it.

> It's obvious that this is possible only if i append data, because if i
> change rows, i need to track also rows changes in file (more difficult
> task).

The other suggestion by Norbert is intriguing and perhaps worth trying. If you book a big space for your data and then save rows until the dataset is full, I'd bet that you will minimize the chances of corrupting the metadata headers. CArray objects already let you book a large chunk of space in the file for your data. So what Norbert proposes amounts to doing that (i.e. setting the metadata first) and filling the space afterwards; if something goes wrong later on, perhaps some data would be lost, but the file might still be readable.

Although Table objects do not (yet) support booking large amounts of space without feeding them with data, I suggest you try CArrays first, to see whether what I said above actually minimizes the probability of corruption. If it does, then it's just a matter of allowing Table objects to support that capability as well.
> Thanks for your job,

You are welcome!

--
>0,0<   Francesc Altet     http://www.carabos.com/
 V V    Cárabos Coop. V.   Enjoy Data
  "-"
|
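[Editor's note: Francesc's CArray suggestion might be sketched as follows. The file path, node name, and shapes are made up, and whether this actually reduces the chance of corruption after a crash is exactly what he proposes testing, not something guaranteed here.]

```python
import numpy as np
import tables


def preallocate(path, nrows, ncols):
    # Book the full space up front: after this, saving rows never needs to
    # grow the dataset, so the control metadata should change as little
    # as possible on later writes.
    with tables.open_file(path, mode="w") as h5:
        h5.create_carray("/", "data", tables.Float64Atom(), shape=(nrows, ncols))


def fill_rows(path, start, rows):
    # Later sessions only overwrite inside the booked region, then flush.
    rows = np.asarray(rows, dtype="float64")
    with tables.open_file(path, mode="a") as h5:
        h5.root.data[start:start + len(rows)] = rows
        h5.flush()
```

If a crash interrupts `fill_rows`, the hope (to be verified experimentally, as Francesc says) is that at worst the rows being written are lost while the file itself stays readable.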