From: Darren K. <Dar...@cs...> - 2008-11-25 17:43:46
Yes, in MSDataCache I've traded flexibility for simple access to common spectrum metadata, where 'common' == stuff I need to report regularly in the various command line tools. Since I'm using it all the time, it's much easier to have it all in a structure (SpectrumInfo) rather than drilling down to find the right CVParam. I agree that there is not a significant performance gain -- the advantage is in the speed with which I can add new functionality for our lab. Another benefit is that the code is *much* simpler to read, and if there is a change in parsing a given field, the change can be made in a single place. I'm not opposed to having multiple simple interfaces like MSDataCache, to be used in different circumstances.

SpectrumLists are dependent on an MSData object to hold not just InstrumentConfiguration but also ParamGroups and DataProcessing (anything that a Spectrum can refer to). We could have a convention about passing the inner SpectrumList explicitly, but we shouldn't restrict what other sorts of references the SpectrumList keeps, whether it is some configuration structure, some sort of database, or an MSData object.

What problem is this causing for SeeMS? Clients of SpectrumList shouldn't care what internal references it has.

Darren

On Nov 25, 2008, at 7:47 AM, Matthew Chambers wrote:

Hi Darren,

I'm trying to put the recalculator into SeeMS but I don't think it'll work with the current design of the SeeMS data processing layer, which works exclusively with SpectrumLists (or ChromatogramLists) and ignores the MSData component. It's very convenient to be able to rely on a nested sequence of SpectrumList_ wrappers acting as a single interface.
So, this actually brings up several issues:

- SpectrumList_PrecursorRecalculator constructs with an MSData object because it uses an MSDataCache
  * it uses MSDataCache because it uses the same MS1 for multiple MS2 adjustments
  * it also needs instrument configuration data, BUT it can get that from the SpectrumPtrs' instrumentConfigurationRef
- MSDataCache is rather inflexible because it eliminates the parameter searching which allows msdata::Spectrums to contain variable metadata; i.e. non-MS spectra won't have msLevel or any mz fields, non-Thermo spectra won't have thermoMonoisotopicMZ or filterString, etc.
  * it works great on data from RAW files or mzML from RAW files, but it has a lot of n/a fields for other data
  * you may remember I suggested a similar concept way back where such a structure could be autogenerated from the mapping file and CV, but I neglected to follow up by saying I abandoned the idea because of the variable parameter expression problem :)
  * I hypothesize that caching the binary data base64 decoding results in far more CPU savings than the parameter searching
- We still need cache control on the underlying file-based SpectrumList implementations
  * previously we discussed "on" and "off" where "on" is metadata only caching, but I now propose three levels: on for metadata, on for metadata and binary data, and off
  * won't this eliminate the need for MSDataCache entirely (assuming we can give up the parameter caching due to the variable expression problem)?
  * we can implement it with the same LRU convention and provide cache configuration and clearing methods
- Should we agree on and document a convention that SpectrumListWrappers must construct with a SpectrumList-based constructor (it may use additional arguments, but the inner_ is always set by the SpectrumList)? I wish we could enforce it with an interface, but I don't know if that's possible.
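One possible way to make the wrapper convention from the last point hard to violate is sketched below, purely as an illustration. It assumes a base class whose only constructor takes the inner list, so a derived wrapper cannot be built without one. Every type here (Spectrum, SpectrumList, SimpleList, WrapperBase, PrecursorShift) is a simplified, hypothetical stand-in written for this sketch, not the actual library declaration.

```cpp
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-ins for the real types.
struct Spectrum { std::string id; double precursorMz; };
using SpectrumPtr = std::shared_ptr<Spectrum>;

struct SpectrumList {
    virtual ~SpectrumList() = default;
    virtual std::size_t size() const = 0;
    virtual SpectrumPtr spectrum(std::size_t index) const = 0;
};
using SpectrumListPtr = std::shared_ptr<SpectrumList>;

// In-memory list standing in for a file-based implementation.
struct SimpleList : SpectrumList {
    std::vector<SpectrumPtr> spectra;
    std::size_t size() const override { return spectra.size(); }
    SpectrumPtr spectrum(std::size_t index) const override { return spectra.at(index); }
};

// There is no default constructor, so every derived wrapper has to
// hand an inner list to this base; inner_ is always set from it.
class WrapperBase : public SpectrumList {
public:
    std::size_t size() const override { return inner_->size(); }
    SpectrumPtr spectrum(std::size_t index) const override { return inner_->spectrum(index); }
protected:
    explicit WrapperBase(SpectrumListPtr inner) : inner_(std::move(inner)) {
        if (!inner_) throw std::invalid_argument("inner SpectrumList must not be null");
    }
    SpectrumListPtr inner_;
};

// Example wrapper: the inner list comes first, extra arguments follow.
class PrecursorShift : public WrapperBase {
public:
    PrecursorShift(SpectrumListPtr inner, double delta)
        : WrapperBase(std::move(inner)), delta_(delta) {}

    SpectrumPtr spectrum(std::size_t index) const override {
        // Copy so the inner list's spectrum is left untouched.
        auto copy = std::make_shared<Spectrum>(*inner_->spectrum(index));
        copy->precursorMz += delta_;
        return copy;
    }
private:
    double delta_;
};
```

With this shape, wrappers nest as `std::make_shared<PrecursorShift>(std::make_shared<PrecursorShift>(base, 0.5), 0.25)` and clients only ever see a SpectrumListPtr. A wrapper can still keep other references (a configuration structure, a database handle, an MSData object) as additional constructor arguments; the base class only fixes where inner_ comes from.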
-Matt

IMPORTANT WARNING: This message is intended for the use of the person or entity to which it is addressed and may contain information that is privileged and confidential, the disclosure of which is governed by applicable law. If the reader of this message is not the intended recipient, or the employee or agent responsible for delivering it to the intended recipient, you are hereby notified that any dissemination, distribution or copying of this information is STRICTLY PROHIBITED. If you have received this message in error, please notify us immediately by calling (310) 423-6428 and destroy the related message. Thank You for your cooperation.
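As a rough illustration of the three cache levels proposed in the quoted message (off, metadata only, metadata plus binary data) combined with LRU eviction and configuration/clearing methods, here is a minimal sketch. The names CacheMode, CachedSpectrum and SpectrumCache are invented for this example and are assumptions, not existing library classes; a real implementation would cache full spectrum objects rather than this two-field struct.

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical names; the three levels mirror the proposal in the thread.
enum class CacheMode { Off, MetaDataOnly, MetaDataAndBinaryData };

struct CachedSpectrum {
    std::string id;          // stands in for spectrum metadata
    std::vector<double> mz;  // stands in for decoded binary data
};

class SpectrumCache {
public:
    SpectrumCache(CacheMode mode, std::size_t capacity)
        : mode_(mode), capacity_(capacity) {}

    // Returns the cached entry or nullptr on a miss.
    // A hit moves the entry to the front (most recently used).
    const CachedSpectrum* find(std::size_t index) {
        auto it = lookup_.find(index);
        if (it == lookup_.end()) return nullptr;
        entries_.splice(entries_.begin(), entries_, it->second);
        return &it->second->second;
    }

    void insert(std::size_t index, CachedSpectrum s) {
        if (mode_ == CacheMode::Off || capacity_ == 0) return;
        if (mode_ == CacheMode::MetaDataOnly) s.mz.clear();  // drop binary data

        auto it = lookup_.find(index);
        if (it != lookup_.end()) {  // replace an existing entry
            entries_.erase(it->second);
            lookup_.erase(it);
        }
        entries_.emplace_front(index, std::move(s));
        lookup_[index] = entries_.begin();

        if (entries_.size() > capacity_) {  // evict the least recently used
            lookup_.erase(entries_.back().first);
            entries_.pop_back();
        }
    }

    // Cache configuration and clearing methods.
    void clear() { entries_.clear(); lookup_.clear(); }
    void setMode(CacheMode mode) { mode_ = mode; clear(); }
    std::size_t size() const { return entries_.size(); }

private:
    using Entry = std::pair<std::size_t, CachedSpectrum>;
    CacheMode mode_;
    std::size_t capacity_;
    std::list<Entry> entries_;  // front = most recently used
    std::unordered_map<std::size_t, std::list<Entry>::iterator> lookup_;
};
```

A file-based list could consult `find(index)` before reading and decoding, and call `insert` afterwards. In MetaDataOnly mode a hit returns an entry with empty binary data, so the caller would still have to read and decode the arrays when it needs them.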