From: Geer, L. (NIH/NLM/NCBI) <le...@nc...> - 2005-06-21 16:37:42
Hi,

I've only recently read the mzData standard and don't know what state it is in, so I hope you will excuse the following comments. From an implementation viewpoint, the use of the IEEE-754 floating point representation to specify the spectra creates several issues:

1. Uneven implementation of IEEE-754 on various platforms: For example, IEEE-754 allows "denormalized" floating point values. It is my understanding that this subset of the standard is implemented on some platforms/compilers (e.g. Windows) and not on others (e.g. Mac). So unless an extensive conversion API is written, a floating point exception or other problems could arise when files are moved across platforms.

2. The IEEE-754 encoded spectral data cannot be validated by an XML parser and is not directly accessible from XML parsing APIs. As far as an XML parser knows, the spectral data is an opaque blob. If the data were stored as XML floats or doubles, the XML parser could validate the data and provide quick access to the individual peaks. Otherwise the validation and access are pushed back into a custom API that may not be readily available and may have to be written from scratch.

3. A human can't read the IEEE-754 blob. This may seem a trivial issue, but it *really* helps in debugging, particularly with end users, to be able to read the file.

Obviously, putting the data in XML float or double format will address these issues at the expense of file size. Gzipping XML can be a practical way to minimize the file size.

Regards,
Lewis
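To make points 1 to 3 above concrete, here is a minimal Python sketch using only the standard library. It is an illustration under stated assumptions, not the mzData encoding itself: the peak values are made up, and the choice of base64 over little-endian 32-bit floats is assumed for the example. It shows that the binary route needs custom decoding code, that the text route can be read with an ordinary float parser, and that the text can be gzipped and restored unchanged.

```python
import base64
import gzip
import struct

# Hypothetical m/z values, invented for this sketch (not from a real spectrum).
mz_values = [100.0, 200.5, 300.25, 400.125]

# Binary route: pack as little-endian 32-bit IEEE-754 floats, then base64.
# Byte order and precision are assumptions made for this example.
packed = struct.pack("<%df" % len(mz_values), *mz_values)
blob = base64.b64encode(packed).decode("ascii")

# Reading the blob back needs custom code; an XML parser sees only a string.
raw = base64.b64decode(blob)
decoded = list(struct.unpack("<%df" % (len(raw) // 4), raw))

# Text route: the same values as whitespace-separated decimal text,
# readable by a human and by any float parser.
text = " ".join(repr(v) for v in mz_values)
parsed = [float(token) for token in text.split()]

# Gzip the text form and restore it to check nothing is lost.
compressed = gzip.compress(text.encode("ascii"))
roundtrip = gzip.decompress(compressed).decode("ascii")

# The smallest positive 32-bit denormalized value (bit pattern 0x00000001).
# A reader that does not handle denormals would have to special-case this.
tiny = struct.unpack("<f", b"\x01\x00\x00\x00")[0]

print("blob:", blob)
print("text:", text)
print("sizes (packed, text, gzipped text):",
      len(packed), len(text), len(compressed))
```

On a list this short the gzip header overhead dominates, so the printed sizes say nothing about real spectra; the trade-off would have to be measured on actual files.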