From: Egon W. <ego...@gm...> - 2008-07-30 08:19:57
On Wed, Jul 30, 2008 at 10:15 AM, Ola Spjuth <ola...@fa...> wrote:
> In Bioclipse (http://www.bioclipse.net) we need a fast way of inspecting a
> file and find out if it contains 2D or 3D coordinates, hence no parsing but
> only looking at a few early elements. The method does not have to be
> foolproof, but must be fast.
>
> What are the ways of deducing if the file contains a molecule with 2D or 3D?
> We only care about files with 1-N molecules currently, no mixed content
> (like spectra as well in same file).
>
> I assume the top element is <molecule> with
> namespace=http://www.xml-cml.org/schema.
>
> Looking at a file with 2D coordinates, it starts like this:
>
> <atomArray>
> <atom id="a1" elementType="C" x2="1.73" y2="-2.5"/>
> ...
>
> Looking at a CML file with 3D coordinates it starts like this:
>
> <atomArray>
> <atom id="a1" elementType="C" x3="3.249900" y3="-0.070900"
> z3="-0.279200"/>

Yes, at the attribute level the difference is x2,y2 versus x3,y3,z3.

> Would you agree with me that looking at the first <atom> and see if it
> contains x2 and/or x3 is sufficient to make a good guess if the file
> contains 2D and/or 3D coordinates?

Yes, that should do. A quick scan of the CML2.5b1 schema seems to
indicate that x2 and x3 are only used for atomic coordinates, which
would mean you would not even have to check that they appear inside
<atom>...

Egon

--
----
http://chem-bla-ics.blogspot.com/
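
For illustration, the quick check discussed in this thread could be sketched in Python roughly as follows. This is only a sketch under assumptions: the function names and the 4096-byte read limit are hypothetical and are not taken from Bioclipse or any CML library. It relies on the observation above that x2 and x3 appear to be used only for atomic coordinates, so it looks for those attribute names in the first chunk of the file without parsing the XML.

```python
import re

# Hypothetical helpers for illustration; not part of Bioclipse or any CML API.
# Match the attribute names x2 / x3 followed by '=', as in x2="1.73".
_X2_ATTR = re.compile(r'\bx2\s*=')
_X3_ATTR = re.compile(r'\bx3\s*=')


def guess_cml_dimensions(head_text):
    """Return (has_2d, has_3d) for the leading text of a CML file.

    No XML parsing is done; this is a heuristic, not a validator.
    """
    has_2d = bool(_X2_ATTR.search(head_text))
    has_3d = bool(_X3_ATTR.search(head_text))
    return has_2d, has_3d


def guess_cml_dimensions_from_file(path, max_bytes=4096):
    """Read only the first max_bytes of the file (assumed limit) and guess."""
    with open(path, 'rb') as handle:
        head = handle.read(max_bytes).decode('utf-8', errors='replace')
    return guess_cml_dimensions(head)
```

Both values can be true if the early atoms carry 2D and 3D coordinates at the same time. If the first <atom> lies beyond the bytes read, both come back false, so a caller may want to treat that result as "unknown" rather than "no coordinates".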