[Psidev-ms-dev] Using mzData in Software Applications


Dear all,

As the project manager of Proteios (http://www.proteios.org/), a
bioinformatics software project in proteomics, I strongly welcome the
standardization work for mzData. We are currently working on integrating
mzData into our (Proteios) data model.

Has anyone done this before, that is, actually used mzData in a software
application (apart from validation and the like with standard XML tools)?
Putting mzData into practical use should be very much in the interest of
PSI. Unfortunately there are, in my opinion, a few things which cause
(unnecessary?) difficulties in an implementation.

First, I would suggest not using common programming language keywords
(e.g. "float") as element/attribute names. This is a possible source of
confusion and makes a simple mapping onto source code data structures more
difficult.
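
To illustrate the mapping problem, here is a minimal sketch in Python. The
helper safe_identifier is hypothetical and not part of mzData or Proteios.
In Python "float" is a built-in name rather than a reserved keyword (in
Java or C it is reserved), but either way the element name cannot be used
directly as a field name without some renaming rule:

```python
import builtins
import keyword


def safe_identifier(element_name):
    """Return a name usable as a Python attribute for an XML element name.

    Hypothetical helper: appends an underscore when the element name is a
    Python keyword or would shadow a built-in such as float.
    """
    if keyword.iskeyword(element_name) or hasattr(builtins, element_name):
        return element_name + "_"
    return element_name


for name in ("float", "mzArray", "intenArray"):
    print(name, "->", safe_identifier(name))
```

Every application that maps the schema onto its own data structures needs
some rule of this kind, which is the extra work the suggestion is meant to
avoid.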

Secondly, I wonder whether there is any pressing need for keeping two
separate arrays in cases where the apparent meaning is rather one single
array of pairs (e.g. "intenArray", "mzArray"). The schema does not enforce
the same length on these two arrays. I also wonder what the purpose of the
attribute "length" is. Can't it be removed, since the length is implicitly
given by the number of subelements? Consider the following excerpt from a
valid(!) mzData XML file:

        <mzArray length="15">
          <float>100</float>
          <float>500</float>
        </mzArray>
        <intenArray length="7">
          <float>10</float>
          <float>100</float>
          <float>20</float>
        </intenArray>
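
Since the schema accepts the excerpt above, an application has to perform
these consistency checks itself. A rough sketch of what that could look
like in Python, using the standard xml.etree.ElementTree module; the
<spectrum> wrapper element and the function check_arrays are assumptions
made only so the excerpt can be parsed on its own:

```python
import xml.etree.ElementTree as ET

# The excerpt from this message, wrapped in a hypothetical <spectrum> root
# so that it parses as a standalone document.
EXCERPT = """
<spectrum>
  <mzArray length="15">
    <float>100</float>
    <float>500</float>
  </mzArray>
  <intenArray length="7">
    <float>10</float>
    <float>100</float>
    <float>20</float>
  </intenArray>
</spectrum>
"""


def check_arrays(xml_text):
    """Return a list of inconsistencies the schema itself does not catch."""
    root = ET.fromstring(xml_text)
    problems = []
    counts = {}
    for name in ("mzArray", "intenArray"):
        array = root.find(name)
        counts[name] = len(array.findall("float"))
        declared = int(array.get("length"))
        # The length attribute duplicates information the children already give.
        if declared != counts[name]:
            problems.append(
                f"{name}: length={declared} but {counts[name]} values")
    # The two arrays are meant to be read as pairs, so their sizes must agree.
    if counts["mzArray"] != counts["intenArray"]:
        problems.append(
            f"mzArray has {counts['mzArray']} values, "
            f"intenArray has {counts['intenArray']}")
    return problems


for problem in check_arrays(EXCERPT):
    print(problem)
```

With a single array of (m/z, intensity) pairs and no length attribute,
none of these checks would be needed.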

Kind Regards,
Per Gärdén

-- 
Per Gärdén
Lund Swegene Bioinformatics Platform
Complex Systems Division
Lund University
Sweden

phone: +46 46 2229229
fax:
e-mail: pe...@th...