From: Peter M. <Pet...@re...> - 2006-04-19 20:18:44
Prof. Robert John Lancashire wrote:
> Hi all,
>
> Please excuse me while I try to come to grips with the design of
> AnIML documents. I need some pointers.....

Sometimes the dumb questions are the smart ones.

> I have been looking at a method of parsing info out of AnIML
> files using StAX (the Streaming API for XML that uses pull-based
> methods).
> This allows the user to read large documents with minimal overhead
> which I figured was going to be an essential ingredient when we have
> nD spectra or hyphenated spectral data.

Nobody has responded to my plea for suggestions on the issue of nD mixed
data. Does this mean we should make something up and submit it as a new
technique specification? Our case is FTIR and dispersive Raman on the same
reaction, plus temperature and pressure data, at one datapoint per minute
for three days. On top of that comes post-processing: absorbance
trendlines, deconvoluted peak area trendlines, and synthetic spectra with
their loading vs. time plots from PCA/ITTR analysis. This is a real case
that we have to deal with; it is not hypothetical.

> I originally started with DOM routines (courtesy of Stuart) for reading
> AnIML files for the JSpecView project but now find that there are issues
> when using the Applet on a web page that seem to stop the document being
> parsed even for small files. I am still not totally clear as to what
> the issue is but think it may be due to having to hold the whole file
> in memory at once.
> Anyway this led me to look at StAX as an alternative. (We already use
> streaming methods for reading JCAMP-DX files)
>
> When attempting to process the document to find all the essential
> variables and information needed to be able to plot the data, it seems
> that the first thing that needs to be known is NOT what the SAMPLE
> is but rather what the TECHNIQUE is. This then determines which
> critical technique dependent parameters we need to look for.
>
> StAX parses through in one direction only and does not allow you to move
> backwards so it would suggest that if the MEASUREMENTDATA and in
> particular the EXPERIMENTSTEP came first then it would make it easier
> to parse through the file to find things like NMR Frequencies etc.
> Otherwise you need to store all the info somewhere and read it again.
>
> What this implies is that the SAMPLESET node needs to come after
> the MEASUREMENTDATA node.
>
> This would solve the problem I have with SAMPLE info as well in that
> at present there is no way of knowing which is SAMPLE/BLANK/REFERENCE
> etc when processing the SAMPLESET since that information is in the
> MEASUREMENTDATA under the SAMPLESUSED/SAMPLESET. It still needs
> to be well defined and I am not sure whether
> role="SampleMeasurement" is a predefined STRING or free TEXT.
>
> Any help appreciated
>
> Robert
>

--
Dr. Peter J. Melling PhD
Remspec Corporation
Charlton, MA 01508
Ph. +1 508 248-1462
Fax +1 508 248-1463
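
For anyone following the thread, here is a rough Java sketch of the ordering problem Robert describes. It is only an illustration under stated assumptions: the element and attribute names (Sample, SampleReference, sampleID, name, role) and the class StaxSampleRoles are hypothetical simplifications, not taken from the AnIML schema. Because StAX only moves forward, the sample entries met first have to be buffered until the role information turns up later inside the measurement data.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxSampleRoles {

    /**
     * Reads the document once, front to back, and returns sample name -> role.
     * Element and attribute names are hypothetical, for illustration only.
     */
    public static Map<String, String> resolveRoles(String xml) throws XMLStreamException {
        // Samples are seen before their roles, so they are buffered by ID.
        Map<String, String> namesById = new LinkedHashMap<>();
        Map<String, String> rolesByName = new LinkedHashMap<>();

        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                // Pull the next event; only element starts matter here.
                if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                    continue;
                }
                String element = reader.getLocalName();
                if (element.equals("Sample")) {
                    // Role is not known yet: store the sample for later.
                    namesById.put(reader.getAttributeValue(null, "sampleID"),
                                  reader.getAttributeValue(null, "name"));
                } else if (element.equals("SampleReference")) {
                    // The role appears only here, so join it with the buffered sample.
                    String id = reader.getAttributeValue(null, "sampleID");
                    String role = reader.getAttributeValue(null, "role");
                    rolesByName.put(namesById.getOrDefault(id, id), role);
                }
            }
        } finally {
            reader.close();
        }
        return rolesByName;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml =
              "<AnIML>"
            + "<SampleSet>"
            + "<Sample sampleID='s1' name='Reaction mixture'/>"
            + "<Sample sampleID='s2' name='Solvent blank'/>"
            + "</SampleSet>"
            + "<MeasurementData><ExperimentStep>"
            + "<SampleReference sampleID='s1' role='SampleMeasurement'/>"
            + "<SampleReference sampleID='s2' role='Blank'/>"
            + "</ExperimentStep></MeasurementData>"
            + "</AnIML>";
        System.out.println(resolveRoles(xml));
    }
}
```

The buffer here is small because only an ID and a name are kept per sample. If the measurement data came first, as Robert suggests, the map would not be needed for this lookup at all, but any technique parameters required to interpret later nodes would still have to be held the same way.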