From: Francesc A. <fa...@py...> - 2004-09-30 15:02:22
|
Someone on the HDF5 list has answered. It looks like the danger of corruption is real. Matt suggests writing a number of separate dumps so that not everything is lost if the worst happens.

Cheers,

--
Francesc Alted

---------- Forwarded message ----------

Subject: RE: Re: [Pytables-users] Danger of file corruption?
Date: Thursday 30 September 2004 15:57
From: "Matthew Street" <Mat...@aw...>
To: fa...@py..., hd...@nc...

Hi,

My experience has been that the file does indeed become corrupted, and the data in it inaccessible, if the code is terminated for some reason. We have found this to happen if the code crashes, is killed by a scheduler, and also if the filesystem it is writing to becomes temporarily unavailable. These kinds of issues would affect most file operations, though - not just HDF5 ones.

A lot of the users I support write timestep dumps, and I always advise writing a number of dump files containing N timesteps each, rather than putting them all in one, so that not all data is lost if the worst happens. I also get them to close and re-open the file between dumps, as leaving it open can be dangerous - I think the H5Fclose routine needs to complete properly for the file to be non-corrupt, and if that is the case, 'atomic'-type operations probably won't help.

Hope this helps,

Matt

--
_______________________________________________________________________
Matt Street
Parallel Technology Support       Tel: +44 (0) 118 982 4528
Applied Computer Science          AWE, Aldermaston, Reading, RG7 4PR. UK.

> -----Original Message-----
> From: Francesc Alted
> Sent: 30 September 2004 12:42
>
> Hi,
>
> A PyTables user has sent me a couple of questions about HDF5
> behaviour in case of a sudden shutdown (or crash) of the
> program that is writing a file.
>
> I'm forwarding the questions most related to HDF5,
> together with my answers. Can anybody shed a bit of light
> on these issues?
>
> Thanks,
>
> ---------- Forwarded message ----------
>
> Subject: Re: [Pytables-users] Danger of file corruption?
> (was: Bug? flush() ignores new tables)
> Date: Thursday 30 September 2004 11:00
> From: Francesc Alted <fa...@py...>
> To: pyt...@li...
>
> On Thursday 30 September 2004 10:27, Norbert Nemec wrote:
> > * Are the elementary operations of the underlying hdf5 library atomic?
> > I.e.: if I shut down the program in the middle of a PyTables write
> > action, might the resulting file be fundamentally corrupted at the
> > hdf5 level, or would the corruption be well localized and repairable?
>
> I don't really know. I do know that HDF5 has quite a sophisticated
> cache system (both for data and metadata), but I know nothing about
> possible corruption scenarios, nor whether they would be repairable.
> That might be a good question for the HDF5 list. Do you want me to
> ask this? Anyway, if you want to ask yourself, the HDF5 list is
> hd...@nc....
>
> > All leading to the third question:
> >
> > * Might there be some simple way to make a set of changes atomic?
> > Even if that means restricting the use of pytables?
>
> By atomic, do you mean in a way that always leaves the file
> uncorrupted? I just don't know; it depends on how HDF5 deals with
> these situations, but a priori my feeling is negative.
>
> Cheers,
>
> --
> Francesc Alted
|
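A minimal sketch of the dump-splitting pattern Matt describes above, assuming the camelCase PyTables API of that era (tables.openFile / createArray); the file-naming scheme, the TIMESTEPS_PER_FILE constant, and the stand-in data are hypothetical:

    import tables  # PyTables 0.8/0.9-era camelCase API assumed

    TIMESTEPS_PER_FILE = 10  # N timesteps per dump file

    def write_dumps(total_steps):
        fileh = None
        for step in range(total_steps):
            # Start a fresh dump file every N timesteps, so a crash can
            # only lose the file currently being written.
            if step % TIMESTEPS_PER_FILE == 0:
                if fileh is not None:
                    fileh.close()  # H5Fclose must complete for a sane file
                fname = "dump_%04d.h5" % (step // TIMESTEPS_PER_FILE)
                fileh = tables.openFile(fname, mode="w")
            data = [float(step)] * 100  # stand-in for real timestep data
            fileh.createArray(fileh.root, "step_%04d" % step, data)
        if fileh is not None:
            fileh.close()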
From: Francesc A. <fa...@py...> - 2004-09-30 16:53:01
|
Another contribution, this time from Quincey Koziol, one of the HDF5 developers. In the upcoming PyTables 0.9, H5Fflush() will be called automatically when the File.flush() method is invoked. For 0.8.1 there is a patch for that (see patch #947006 at the pytables project: http://sourceforge.net/tracker/?group_id=63486&atid=504146).

Cheers,

---------- Forwarded message ----------

Subject: Re: Re: [Pytables-users] Danger of file corruption?
Date: Thursday 30 September 2004 18:30
From: Quincey Koziol <ko...@nc...>
To: Matthew Street <Mat...@aw...>
Cc: fa...@py..., hd...@nc...

Hi Matt, Francesc, et al.

Yes, you are correct - due to the caching that the HDF5 library performs, it is highly likely that a file opened for write access (and actually modified) will be corrupt if the application terminates abruptly.

The preferred solution in this case is not to close and re-open the file repeatedly, which will produce poor performance, but to call H5Fflush(), which will synchronize the information on disk with the information cached in memory. It is also possible to turn off the various caches in the library with API calls, but I don't recommend doing this due to the loss in performance.

    Quincey

> My experience has been that the file does indeed become corrupted, and
> the data in it inaccessible, if the code is terminated for some reason.
> We have found this to happen if the code crashes, is killed by a
> scheduler, and also if the filesystem it is writing to becomes
> temporarily unavailable. These kinds of issues would affect most file
> operations, though - not just HDF5 ones.
>
> A lot of the users I support write timestep dumps, and I always advise
> writing a number of dump files containing N timesteps each, rather than
> putting them all in one, so that not all data is lost if the worst
> happens. I also get them to close and re-open the file between dumps,
> as leaving it open can be dangerous - I think the H5Fclose routine
> needs to complete properly for the file to be non-corrupt, and if that
> is the case, 'atomic'-type operations probably won't help.
>
> Hope this helps,
>
> Matt
--
Francesc Alted
|
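A minimal sketch of the flush-instead-of-reopen pattern Quincey recommends, assuming File.flush() maps to H5Fflush() as in PyTables 0.9 and the patch mentioned above; the record layout and the flush interval are illustrative only:

    import tables

    class Sample(tables.IsDescription):  # illustrative record layout
        step  = tables.IntCol()
        value = tables.FloatCol()

    fileh = tables.openFile("simulation.h5", mode="w")
    table = fileh.createTable(fileh.root, "samples", Sample, "timestep samples")

    row = table.row
    for step in range(1000):
        row["step"] = step
        row["value"] = step * 0.5  # stand-in for real results
        row.append()
        if step % 50 == 0:
            # Push cached data and metadata to disk without the cost of
            # closing and re-opening the file (File.flush -> H5Fflush).
            fileh.flush()

    fileh.close()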
From: Norbert N. <Nor...@gm...> - 2004-09-30 16:44:42
|
Sounds like a fundamental flaw in the design of HDF5 - a high-end data-storage system like HDF5 without protection against file corruption sounds unbelievable to me.

The argument that this is the same for any kind of file storage does not hold here: if you use a journalled filesystem, you know that basic write operations are atomic. Based on that, it is possible, with some care, to design a file format and writing protocol that leaves the file in a well-defined state at any moment. If, in HDF5, the basic operations are not atomic in the same sense, it will never be possible to build a secure system on top of it.

Writing additional copies of the file every now and then is, of course, a solution, but it really destroys all the benefits of HDF5 as a high-performance file format...

On Thursday 30 September 2004 17:02, Francesc Alted wrote:
> Someone on the HDF5 list has answered. It looks like the danger of
> corruption is real. Matt suggests writing a number of separate dumps so
> that not everything is lost if the worst happens.
>
> Cheers,

--
_________________________________________
Norbert Nemec
Altdorfer Str. 9a ... D-93049 Regensburg
Tel: 0941 / 2009638 ... Mobile: 0179 / 7475199
eMail: <No...@Ne...>
|
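One way to get the well-defined-state guarantee Norbert is asking for from the filesystem rather than from HDF5 itself is to write each dump under a temporary name and atomically rename it into place once it has been closed. This is not a PyTables or HDF5 feature, just a sketch of the general POSIX technique; the helper name safe_write and the example payload are hypothetical:

    import os
    import tables

    def safe_write(final_path, fill):
        """Write an HDF5 file under a temporary name, then atomically
        rename it into place, so readers only ever see a complete file."""
        tmp_path = final_path + ".tmp"
        fileh = tables.openFile(tmp_path, mode="w")
        try:
            fill(fileh)      # caller populates the file
            fileh.flush()
        finally:
            fileh.close()
        # os.rename() is atomic on POSIX when source and target are on the
        # same filesystem: either the old dump or the new, complete one is
        # visible, never a half-written file.
        os.rename(tmp_path, final_path)

    # usage: write a trivial array dump
    safe_write("dump.h5", lambda f: f.createArray(f.root, "data", list(range(10))))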
From: Francesc A. <fa...@py...> - 2004-09-30 17:50:49
|
On Thursday 30 September 2004 18:44, Norbert Nemec wrote:
> Sounds like a fundamental flaw in the design of HDF5 - a high-end
> data-storage system like HDF5 without protection against file
> corruption sounds unbelievable to me.
>
> The argument that this is the same for any kind of file storage does
> not hold here: if you use a journalled filesystem, you know that basic
> write operations are atomic. Based on that, it is possible, with some
> care, to design a file format and writing protocol that leaves the file
> in a well-defined state at any moment.

I agree. The problem is that HDF5 is not journaled, so you cannot be certain that an ongoing operation will be atomic. Making HDF5 support journaled files would be a nice idea, but the problem is who is willing to do that.

Anyway, I think it might be better to debate this on the HDF5 list rather than here.

Cheers,

--
Francesc Alted
|