From: Darren K. <Dar...@cs...> - 2008-07-15 18:16:42
Hey all,

I meant to mention this last week, but I've been busy... Anyway, we have an Application Note about ProteoWizard in Bioinformatics:

http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btn323

Darren

IMPORTANT WARNING: This message is intended for the use of the person or entity to which it is addressed and may contain information that is privileged and confidential, the disclosure of which is governed by applicable law. If the reader of this message is not the intended recipient, or the employee or agent responsible for delivering it to the intended recipient, you are hereby notified that any dissemination, distribution or copying of this information is STRICTLY PROHIBITED. If you have received this message in error, please notify us immediately by calling (310) 423-6428 and destroy the related message. Thank You for your cooperation.
From: Darren K. <dke...@ya...> - 2008-07-09 18:44:40
Alex,

I've moved this discussion to the proteowizard-support list. Please see the other message to you.

Darren

On Jul 9, 2008, at 11:41 AM, Georgiou, A. wrote:

> Hi Darren,
>
> Apologies for starting a new thread for the same issue (I had my
> settings in digest mode and couldn't reply directly to your email!).
> I tried the new version of msconvert.exe and I still get the same
> problems. After some more testing I have reached some conclusions:
>
> The problem has nothing to do with 32-bit vs 64-bit encodings but
> only appears with zlib files. Big chunks of bytes might get
> decompressed correctly and then other chunks become garbage. This
> behavior appears to be random: with some arrays it may work ok, and
> then it garbles parts of other arrays. Like I said before, if it
> affects an m/z array then it can also affect the intensity array at
> the same locations, so I am not sure it is a problem with zlib,
> might be another kind of bug in the program's logic. Just in case
> perhaps it would be useful to make sure that the standard jre
> implementation of zlib can handle these files. Are you using any
> special settings/tables with the deflater by any chance?
>
> Perhaps tomorrow I can post a java snippet that reproduces the
> problem with some example base64 strings from the files. Would
> anyone be interested in looking at this?
>
> Still, I think it's much more likely that the error has to do with
> the way msconvert handles the buffers to or from the deflater. Some
> kind of memory allocation problem?
>
> regards,
> Alex
>
> -------------------------------------
> Alexandros Georgiou
> Department of Biomolecular Mass Spectrometry
> Utrecht University
> Sorbonnelaan 16 (room N319)
> 3584CA Utrecht
> The Netherlands
>
> http://www.pharm.uu.nl/ffwuk.htm?/bioms/
> http://proteomics.chem.uu.nl/
>
> -------------------------------------------------------------------------
> Sponsored by: SourceForge.net Community Choice Awards: VOTE NOW!
> Studies have shown that voting for your favorite open source project,
> along with a healthy diet, reduces your potential for chronic lameness
> and boredom. Vote Now at http://www.sourceforge.net/community/cca08
> _______________________________________________
> Psidev-ms-dev mailing list
> Psi...@li...
> https://lists.sourceforge.net/lists/listinfo/psidev-ms-dev
From: Georgiou, A. <A.G...@uu...> - 2008-07-09 16:39:05
Hi Darren,

Apologies for starting a new thread for the same issue (I had my settings in digest mode and couldn't reply directly to your email!). I tried the new version of msconvert.exe and I still get the same problems. After some more testing I have reached some conclusions:

The problem has nothing to do with 32-bit vs 64-bit encodings but only appears with zlib files. Big chunks of bytes might get decompressed correctly and then other chunks become garbage. This behavior appears to be random: with some arrays it may work ok, and then it garbles parts of other arrays. Like I said before, if it affects an m/z array then it can also affect the intensity array at the same locations, so I am not sure it is a problem with zlib, might be another kind of bug in the program's logic. Just in case perhaps it would be useful to make sure that the standard jre implementation of zlib can handle these files. Are you using any special settings/tables with the deflater by any chance?

Perhaps tomorrow I can post a java snippet that reproduces the problem with some example base64 strings from the files. Would anyone be interested in looking at this?

Still, I think it's much more likely that the error has to do with the way msconvert handles the buffers to or from the deflater. Some kind of memory allocation problem?

regards,
Alex

-------------------------------------
Alexandros Georgiou
Department of Biomolecular Mass Spectrometry
Utrecht University
Sorbonnelaan 16 (room N319)
3584CA Utrecht
The Netherlands

http://www.pharm.uu.nl/ffwuk.htm?/bioms/
http://proteomics.chem.uu.nl/
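One way to pin down which chunks are garbled is to decode the same array from the zlib and non-zlib conversions and list the index ranges where the two disagree. The following is a minimal Python sketch of that comparison, not the Java snippet mentioned above; the function name `mismatch_runs` and the toy data are hypothetical, and the arrays are assumed to be already decoded into lists of numbers.

```python
def mismatch_runs(reference, candidate, tolerance=0.0):
    """Return (start, end) index pairs, end exclusive, where the arrays differ."""
    if len(reference) != len(candidate):
        raise ValueError(
            "arrays differ in length: %d vs %d" % (len(reference), len(candidate))
        )
    runs = []
    start = None
    for i, (a, b) in enumerate(zip(reference, candidate)):
        # Written as "not <=" so that NaN values count as a mismatch.
        differs = not (abs(a - b) <= tolerance)
        if differs and start is None:
            start = i
        elif not differs and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(reference)))
    return runs


# Toy data only: the first and last three values are swapped, loosely
# mimicking the "copied from the end of the array" symptom in this thread.
reference = [float(i) for i in range(10)]
candidate = reference[7:] + reference[3:7] + reference[:3]
runs = mismatch_runs(reference, candidate)
print(runs)
```

Running the same comparison on the m/z and intensity arrays of one spectrum would show whether the damaged ranges really line up, as suspected in the message.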
From: Darren K. <dke...@ya...> - 2008-07-09 14:37:30
Hi Alex,

We had a problem with the zlib library we were using, and I fixed this about a month ago. However, I may have forgotten to update the example converted files.

Also, are you using the msconvert in the 1.2 pwiz release? You can obtain the latest msconvert.exe here:

http://proteowizard.svn.sourceforge.net/viewvc/*checkout*/proteowizard/trunk/pwiz/bin/windows/i386/msconvert.exe

Please let me know if this fixes your problem.

Thanks,
Darren

Georgiou, A. wrote:

> Hello everyone, this is my first post on this list and I'd like to
> share my experiences trying to write java code that will decode binary
> arrays. I believe I have found a couple of errors that need correcting.
>
> First of all, I would like to say that during the last few days I have
> tried as hard as I could to eliminate the possibility of bugs in my
> code, but there is always a possibility that the mistake is on my
> part. Having said this, I believe that the data in the
> small_zlib.pwiz.mzML file is invalid. After I decompress the data with
> ZLIB (the java.util.zip.Inflater class does not complain so it seems
> the compression format is OK), there is no way I can unpack the bytes
> to sensible (always positive) 32-bit floats, no matter which byte
> ordering I try (according to the docs, it should be little-endian). My
> decompressor works OK with ZLIB'd 64-bit values so I don't think ZLIB
> is causing the problem.
>
> So I used msconvert to convert small.raw to 4 files:
> small_32bit_nozlib.mzML, small_32bit_zlib.mzML,
> small_64bit_nozlib.mzML, and small_64bit_zlib.mzML. I have no problem
> decoding the two 64-bit files, and also the uncompressed 32-bit file
> now works. As for 32-bit with ZLIB, *most* of the values I get are the
> same as those I get from the other files. The exception is that the
> first 68 32-bit values in the mz array are copied from the end of the
> array (last 68 values from the array are also copied to the beginning
> of the array) and the corresponding first 68 32-bit intensities in the
> intensity array appear to be garbage. I have been debugging my code
> for a couple of days now and I think there is a good chance that the
> error is in protein wizard and not my code.
>
> In any case, whether there is a bug in protein wizard or not, I feel
> pretty sure that there is something wrong with the example file
> "small_zlib.pwiz.mzML" so maybe someone might want to check it again.
> I believe that the two problems are actually one: There must be a bug
> in protein wizard when it writes to ZLIB compressed 32-bit data.
> Possibly something to do with buffer/array indices. Any help/comments
> on this will be appreciated.
>
> thanks,
> Alex
>
> -------------------------------------
> Alexandros Georgiou
> Department of Biomolecular Mass Spectrometry
> Utrecht University
> Sorbonnelaan 16 (room N319)
> 3584CA Utrecht
> The Netherlands
>
> http://www.pharm.uu.nl/ffwuk.htm?/bioms/
> http://proteomics.chem.uu.nl/
From: Matt C. <mat...@va...> - 2008-07-09 13:16:06
Yes, that is allowed. I recommend it in fact, because as you say it saves lots of space. :)

-Matt

Thorsten Schramm wrote:

> Hello everyone!
>
> The mass spectra in my mzML file have all the same data structure:
>
> <binaryDataArray encodedLength="160" dataProcessingRef="XcaliburProcessing" >
>   <cvParam cvLabel="MS" accession="MS:1000521" name="64-bit float" value="" />
>   <cvParam cvLabel="MS" accession="MS:1000576" name="no compression" value="" />
>   <cvParam cvLabel="MS" accession="MS:1000514" name="m/z array" value="" />
>   <binary>AAA...</binary>
> </binaryDataArray>
> <binaryDataArray encodedLength="160" dataProcessingRef="XcaliburProcessing" >
>   <cvParam cvLabel="MS" accession="MS:1000521" name="32-bit integer" value="" />
>   <cvParam cvLabel="MS" accession="MS:1000576" name="no compression" value="" />
>   <cvParam cvLabel="MS" accession="MS:1000515" name="intensity array" value="" />
>   <binary>AAA...</binary>
> </binaryDataArray>
>
> Is it possible to define two referenceableParamGroup which contain the
> cvParams of both of the binaryDataArrays?
>
> An example:
>
> <referenceableParamGroup id="mzParamGroup">
>   <cvParam cvLabel="MS" accession="MS:1000521" name="64-bit float" value="" />
>   <cvParam cvLabel="MS" accession="MS:1000576" name="no compression" value="" />
>   <cvParam cvLabel="MS" accession="MS:1000514" name="m/z array" value="" />
> </referenceableParamGroup>
> <referenceableParamGroup id="intensityParamGroup">
>   <cvParam cvLabel="MS" accession="MS:1000521" name="32-bit integer" value="" />
>   <cvParam cvLabel="MS" accession="MS:1000576" name="no compression" value="" />
>   <cvParam cvLabel="MS" accession="MS:1000515" name="intensity array" value="" />
> </referenceableParamGroup>
> .
> .
> .
> <binaryDataArray encodedLength="160" dataProcessingRef="XcaliburProcessing" >
>   <referenceableParamGroupRef ref="mzParamGroup"/>
>   <binary>AAA...</binary>
> </binaryDataArray>
> <binaryDataArray encodedLength="160" dataProcessingRef="XcaliburProcessing" >
>   <referenceableParamGroupRef ref="intensityParamGroup"/>
>   <binary>AAA...</binary>
> </binaryDataArray>
>
> In rather huge files this will save some disc space. But I do not know,
> if this is allowed.
> Is it?
>
> Thanks in advance
> Best wishes
>
> Thorsten
From: Thorsten S. <Tho...@an...> - 2008-07-09 13:10:41
Hello everyone!

The mass spectra in my mzML file have all the same data structure:

  <binaryDataArray encodedLength="160" dataProcessingRef="XcaliburProcessing" >
    <cvParam cvLabel="MS" accession="MS:1000521" name="64-bit float" value="" />
    <cvParam cvLabel="MS" accession="MS:1000576" name="no compression" value="" />
    <cvParam cvLabel="MS" accession="MS:1000514" name="m/z array" value="" />
    <binary>AAA...</binary>
  </binaryDataArray>
  <binaryDataArray encodedLength="160" dataProcessingRef="XcaliburProcessing" >
    <cvParam cvLabel="MS" accession="MS:1000521" name="32-bit integer" value="" />
    <cvParam cvLabel="MS" accession="MS:1000576" name="no compression" value="" />
    <cvParam cvLabel="MS" accession="MS:1000515" name="intensity array" value="" />
    <binary>AAA...</binary>
  </binaryDataArray>

Is it possible to define two referenceableParamGroup which contain the cvParams of both of the binaryDataArrays?

An example:

  <referenceableParamGroup id="mzParamGroup">
    <cvParam cvLabel="MS" accession="MS:1000521" name="64-bit float" value="" />
    <cvParam cvLabel="MS" accession="MS:1000576" name="no compression" value="" />
    <cvParam cvLabel="MS" accession="MS:1000514" name="m/z array" value="" />
  </referenceableParamGroup>
  <referenceableParamGroup id="intensityParamGroup">
    <cvParam cvLabel="MS" accession="MS:1000521" name="32-bit integer" value="" />
    <cvParam cvLabel="MS" accession="MS:1000576" name="no compression" value="" />
    <cvParam cvLabel="MS" accession="MS:1000515" name="intensity array" value="" />
  </referenceableParamGroup>
  .
  .
  .
  <binaryDataArray encodedLength="160" dataProcessingRef="XcaliburProcessing" >
    <referenceableParamGroupRef ref="mzParamGroup"/>
    <binary>AAA...</binary>
  </binaryDataArray>
  <binaryDataArray encodedLength="160" dataProcessingRef="XcaliburProcessing" >
    <referenceableParamGroupRef ref="intensityParamGroup"/>
    <binary>AAA...</binary>
  </binaryDataArray>

In rather huge files this will save some disc space. But I do not know, if this is allowed.
Is it?

Thanks in advance
Best wishes

Thorsten

--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Dipl. Geologe Thorsten Schramm
Institut für Anorganische und Analytische Chemie
Justus-Liebig-Universität Giessen
Schubertstrasse 60, Haus 16
D-35390 Giessen
Fon: +49 (0)641 - 99 34 821
Fax: +49 (0)641 - 99 34 809
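For readers on the consuming side, a reader has to expand each referenceableParamGroupRef back into the cvParams of the group it points to. Below is a minimal Python sketch of that lookup, under several assumptions: the `fragment` wrapper element, the function name `effective_cv_params`, and the absence of an XML namespace are all hypothetical, and the sample is abbreviated (the data-type cvParam from the message is left out to keep it short).

```python
import xml.etree.ElementTree as ET

# Abbreviated sample modelled on the message; <fragment> is a hypothetical
# wrapper so the snippet parses as a single document.
SAMPLE = """
<fragment>
  <referenceableParamGroup id="mzParamGroup">
    <cvParam cvLabel="MS" accession="MS:1000576" name="no compression" value="" />
    <cvParam cvLabel="MS" accession="MS:1000514" name="m/z array" value="" />
  </referenceableParamGroup>
  <referenceableParamGroup id="intensityParamGroup">
    <cvParam cvLabel="MS" accession="MS:1000576" name="no compression" value="" />
    <cvParam cvLabel="MS" accession="MS:1000515" name="intensity array" value="" />
  </referenceableParamGroup>
  <binaryDataArray encodedLength="160" dataProcessingRef="XcaliburProcessing">
    <referenceableParamGroupRef ref="mzParamGroup"/>
    <binary>AAA...</binary>
  </binaryDataArray>
  <binaryDataArray encodedLength="160" dataProcessingRef="XcaliburProcessing">
    <referenceableParamGroupRef ref="intensityParamGroup"/>
    <binary>AAA...</binary>
  </binaryDataArray>
</fragment>
"""


def effective_cv_params(root, array):
    """Return (accession, name) pairs for an array, expanding group references."""
    groups = {g.get("id"): g for g in root.iter("referenceableParamGroup")}
    params = []
    for child in array:
        if child.tag == "referenceableParamGroupRef":
            group = groups[child.get("ref")]  # KeyError if the ref is dangling
            params.extend(
                (p.get("accession"), p.get("name")) for p in group.findall("cvParam")
            )
        elif child.tag == "cvParam":
            params.append((child.get("accession"), child.get("name")))
    return params


root = ET.fromstring(SAMPLE)
resolved = [effective_cv_params(root, a) for a in root.findall("binaryDataArray")]
for params in resolved:
    print(params)
```

The lookup handles both styles, so arrays that still carry inline cvParams and arrays that use a group reference resolve to the same kind of list.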
From: Georgiou, A. <A.G...@uu...> - 2008-07-09 12:14:11
Hello everyone,

this is my first post on this list and I'd like to share my experiences trying to write java code that will decode binary arrays. I believe I have found a couple of errors that need correcting.

First of all, I would like to say that during the last few days I have tried as hard as I could to eliminate the possibility of bugs in my code, but there is always a possibility that the mistake is on my part. Having said this, I believe that the data in the small_zlib.pwiz.mzML file is invalid. After I decompress the data with ZLIB (the java.util.zip.Inflater class does not complain so it seems the compression format is OK), there is no way I can unpack the bytes to sensible (always positive) 32-bit floats, no matter which byte ordering I try (according to the docs, it should be little-endian). My decompressor works OK with ZLIB'd 64-bit values so I don't think ZLIB is causing the problem.

So I used msconvert to convert small.raw to 4 files: small_32bit_nozlib.mzML, small_32bit_zlib.mzML, small_64bit_nozlib.mzML, and small_64bit_zlib.mzML. I have no problem decoding the two 64-bit files, and also the uncompressed 32-bit file now works. As for 32-bit with ZLIB, *most* of the values I get are the same as those I get from the other files. The exception is that the first 68 32-bit values in the mz array are copied from the end of the array (last 68 values from the array are also copied to the beginning of the array) and the corresponding first 68 32-bit intensities in the intensity array appear to be garbage. I have been debugging my code for a couple of days now and I think there is a good chance that the error is in protein wizard and not my code.

In any case, whether there is a bug in protein wizard or not, I feel pretty sure that there is something wrong with the example file "small_zlib.pwiz.mzML" so maybe someone might want to check it again. I believe that the two problems are actually one: There must be a bug in protein wizard when it writes to ZLIB compressed 32-bit data. Possibly something to do with buffer/array indices. Any help/comments on this will be appreciated.

thanks,
Alex

-------------------------------------
Alexandros Georgiou
Department of Biomolecular Mass Spectrometry
Utrecht University
Sorbonnelaan 16 (room N319)
3584CA Utrecht
The Netherlands

http://www.pharm.uu.nl/ffwuk.htm?/bioms/
http://proteomics.chem.uu.nl/
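The decoding chain described in this message (base64 text, optional zlib inflation, then little-endian 32-bit or 64-bit floats) can be sketched in a few lines. This is a Python illustration rather than the Java code under discussion, and it says nothing about where the reported bug lies; the function names are hypothetical, and the encoder exists only to produce round-trip test input.

```python
import base64
import struct
import zlib

FORMAT_CODES = {32: "f", 64: "d"}  # struct codes for 32-bit and 64-bit floats


def decode_binary_array(text, bits=32, compressed=True):
    """Decode the text of a <binary> element into a list of floats."""
    raw = base64.b64decode(text)
    if compressed:
        raw = zlib.decompress(raw)
    width = bits // 8
    if len(raw) % width != 0:
        raise ValueError("byte count %d is not a multiple of %d" % (len(raw), width))
    # "<" selects little-endian, the byte order the message cites from the docs.
    fmt = "<%d%s" % (len(raw) // width, FORMAT_CODES[bits])
    return list(struct.unpack(fmt, raw))


def encode_binary_array(values, bits=32, compressed=True):
    """Inverse of decode_binary_array, used here only to build test input."""
    raw = struct.pack("<%d%s" % (len(values), FORMAT_CODES[bits]), *values)
    if compressed:
        raw = zlib.compress(raw)
    return base64.b64encode(raw).decode("ascii")


# Values chosen to be exactly representable as 32-bit floats, so the
# round trip can be compared with == for both precisions.
mz = [200.0, 200.5, 201.25, 1999.75]
for bits in (32, 64):
    for compressed in (False, True):
        text = encode_binary_array(mz, bits, compressed)
        assert decode_binary_array(text, bits, compressed) == mz
print("round trip ok")
```

A round trip like this only shows that a decoder is self-consistent; checking a converter's output still needs a second, independent source such as the uncompressed file.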
From: Brian P. <bri...@in...> - 2008-07-08 18:31:07
Yeah, I don't think you're going to see raw data being loaded into Excel. Presumably they meant Sequest results.

Brian

_____

From: psi...@li... [mailto:psi...@li...] On Behalf Of Darren Kessner
Sent: Tuesday, July 08, 2008 11:02 AM
To: Mass spectrometry standard development
Subject: Re: [Psidev-ms-dev] Open XML

Sorry -- forgot that you need to be registered. Text pasted here:

BioIT Alliance Continues to Grow
June 2008

Launched in 2006 with 15 members, the Microsoft-led BioIT Alliance has grown to 77 members and plans to expand its scope in a world in which the road to interoperability for users or vendors is not yet fully mapped.

At the recent Bio-IT World Conference, Rudy Potenzone, director of the BioIT Alliance and industry technology strategist for pharmaceuticals at Microsoft, said that one way the initiative plans to expand in the year ahead is by launching a SharePoint portal that will let users build and try out Web components.

As an example, Les Jordan, Microsoft's life science industry technology strategist, pointed out during a conference session that Thermo Fisher Scientific recently announced it was starting to enable results from its lab equipment to be output into the Open XML format. With Open XML, he said, scientists can pull data off an instrument via a Web service and import it directly into the "scientist's favorite tool," Excel. It could then travel onto a high-performance computing cluster, while users can access the data in a portal where it can be viewed, searched, and shared.

"It is sitting in an open standard and people can access it through a Web service," said Jordan. "This is going to allow people to innovate."

Richard LeDuc, co-director of Washington University School of Medicine's proteomics and mass-spectrometry core facility, voiced concerns about applying this idea in his facility because the conversion from Thermo's .RAW mass spectrometry files to Open XML files dramatically increases their size. He described himself as a heavy Microsoft user who also develops in the Microsoft environment. Although he has not transformed files to Open XML, in the past he and his colleagues have written code to pull mass spectrometry data off instruments and to transform the proprietary .RAW binary files into XML. "They tend to explode, in my experience, on the order of threefold," he said.

-Vivien Marx

On Jul 8, 2008, at 8:01 AM, Darren Kessner wrote:

Hi guys,

I saw this little blurb on Thermo's plans to export data to Microsoft's Open XML for use in Excel:
http://www.genome-technology.com/issues/2_15/techspotlight/147264-1.html

Jim, do you have any more details on this?

Darren

Darren Kessner
Scientific Programmer
Dar...@cs...
310-423-9538
Spielberg Family Center for Applied Proteomics
Cedars-Sinai Medical Center
Los Angeles, CA
http://www.sfcap.cshs.org/
http://proteowizard.sourceforge.net/
From: Darren K. <Dar...@cs...> - 2008-07-08 18:02:18
Sorry -- forgot that you need to be registered. Text pasted here:

BioIT Alliance Continues to Grow
June 2008

Launched in 2006 with 15 members, the Microsoft-led BioIT Alliance has grown to 77 members and plans to expand its scope in a world in which the road to interoperability for users or vendors is not yet fully mapped.

At the recent Bio-IT World Conference, Rudy Potenzone, director of the BioIT Alliance and industry technology strategist for pharmaceuticals at Microsoft, said that one way the initiative plans to expand in the year ahead is by launching a SharePoint portal that will let users build and try out Web components.

As an example, Les Jordan, Microsoft's life science industry technology strategist, pointed out during a conference session that Thermo Fisher Scientific recently announced it was starting to enable results from its lab equipment to be output into the Open XML format. With Open XML, he said, scientists can pull data off an instrument via a Web service and import it directly into the "scientist's favorite tool," Excel. It could then travel onto a high-performance computing cluster, while users can access the data in a portal where it can be viewed, searched, and shared.

"It is sitting in an open standard and people can access it through a Web service," said Jordan. "This is going to allow people to innovate."

Richard LeDuc, co-director of Washington University School of Medicine's proteomics and mass-spectrometry core facility, voiced concerns about applying this idea in his facility because the conversion from Thermo's .RAW mass spectrometry files to Open XML files dramatically increases their size. He described himself as a heavy Microsoft user who also develops in the Microsoft environment. Although he has not transformed files to Open XML, in the past he and his colleagues have written code to pull mass spectrometry data off instruments and to transform the proprietary .RAW binary files into XML. "They tend to explode, in my experience, on the order of threefold," he said.

- Vivien Marx

On Jul 8, 2008, at 8:01 AM, Darren Kessner wrote:

> Hi guys,
>
> I saw this little blurb on Thermo's plans to export data to Microsoft's
> Open XML for use in Excel:
> http://www.genome-technology.com/issues/2_15/techspotlight/147264-1.html
>
> Jim, do you have any more details on this?
>
> Darren

Darren Kessner
Scientific Programmer
Dar...@cs...
310-423-9538
Spielberg Family Center for Applied Proteomics
Cedars-Sinai Medical Center
Los Angeles, CA
http://www.sfcap.cshs.org/
http://proteowizard.sourceforge.net/
From: Darren K. <dke...@ya...> - 2008-07-08 15:01:01
Hi guys,

I saw this little blurb on Thermo's plans to export data to Microsoft's Open XML for use in Excel:

http://www.genome-technology.com/issues/2_15/techspotlight/147264-1.html

Jim, do you have any more details on this?

Darren
From: Eric D. <ede...@sy...> - 2008-07-07 17:29:14
Hi everyone, here are the minutes for the PSI-MSS WG telecon of 2008-07-07.

Present: Matt, Lennart, Jim, Darren, Pierre-Alain, Eric

- mzML support information table
  + Is now posted
  + Add TOPP
  + Add NCBI C++ toolkit
  + Phenyx, and InSilicoSpectro listing is fine as is.

- CV status
  + Darren will check what Matt sent and then check it in
  + Eric received CV from David Sparkman
  + Lennart mentions that "% percent collision energy" is a child MS term under UO parent
  + It is considered bad practice to add our own terms under someone else's ontology
  + The UO ontology maintainer appears happy to add our own strange unit terms
  + UO has some characters that OBO-Edit hates
  + Get in touch with the maintainer to fix some things in the UO
  + Darren will make a list of things to fix in UO and email to vocab list
  + Also fix all the warnings when OBO-Edit tries to save
  + Matt email: redo, removing duplicates. No lists, but do include the singular concepts and repropose
  + Darren will remove all entries: synonym: "<new synonym>" RELATED []
  + Not sure what to do with Wilfred's suggestions. Ask how he wants to use those.
  + Matt will remove extraneous \n in definitions
  + Consider that we may need to have a parent term to contain (m/z, mass, ppm) but not yet
  + Matt will update the definition of scan time: start offset time of beginning of the scan
  + Let's NOT check in the new terms that Matt is proposing yet. They should not be all in root.
  + Lennart will send out a possible placement of Matt's terms.
  + Darren will check in minor fixing
  + Darren will then email to Lennart more controversial changes

- validator
  + Lennart and company will start working on this again this week
  + Lennart and Matt will work on updated rules
  + There are many back-end changes due to the MI group
  + Does the validator check that if there is an acquisition list, then there would be a combination type (like sum of spectra) provided?
  + We will allow spectrum representation (like centroid/profile) in the fileContent section
  + Need to incorporate the information in the OBO file (datatypes and units)
  + Need to fix up the mapping file rules
  + Maybe need a rule that if a spectrum is a mass spectrum, then it needs an m/z array and similar
  + Need some sort of unit for intensity arrays
  + At the moment, we do not require any units for the array type. We should add that?
  + Darren will explore that while adding has_units to CV

- Issue of generic binary array CV term
  + Marc Sturm wants to include his own custom binary data array types. How should this be done?
  + Possible suggestion: there should be a special TOPP CV that they maintain?
  + TOPP CV. These new terms live in TOPP CV, but also linked to MS binary data array type
  + Lennart will ask Luisa et al. about this issue

- CV template updates to vendors
  + What's up with Thermo CVTerms and instrument attributes Excel sheet?
  + It helped Matt hand craft some nice code
  + Could have a separate CV for all this information. Or just a tsv/Excel file
  + Start another round of vendor updates

- schema
  + No major schema changes planned
  + Let's keep a list of minor possibly desirable changes to make if there needs to be an important change
  + One desirable addition is retention time in the index
  + Current examples show: xmlns="http://psi.hupo.org/schema_revision/mzML_1.0.0"
    What should it be instead? Perhaps: xmlns="http://psidev.info/schemas/mzML_1.0.0"
    Poll SG to get an answer

- Johannes Junker asks about supDataArrays. Should be fine as is. Ask for clarification

- MIAPE example document?
  + Done and posted except for switching criteria and parameters for creating peaklists
  + Pierre-Alain and Jim will get together and finish this

- documentation
  + Documentation is up to date

- website
  + Post the mapping file directly
  + Should we set up an mzML.org?
  + Instead of maintaining a separate page, set up a redirect?

- example files
  + Matt will take a stab at MALDI example and post it
  + Eric will remove tiny4 and point to "small" which includes LTQ-FT demo usage
  + Fredrik sent to Eric nice examples of dta -> mzML and plgs -> mzML. Eric will post.
  + Matt will generate an example MGF -> mzML using pwiz and post
  + Jim will send Eric & Matt some example RAW files and converted files (including PDA, SRM)

- Next call
  + Next call same time in two weeks on July 21
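The validator item about mass spectra needing an m/z array "and similar" could be checked roughly as follows. This is only a sketch of one possible reading of that rule, not the validator's actual logic: the function name, the simplified element layout, and the use of a cvParam named "mass spectrum" to mark the spectrum type are all assumptions, and group references (referenceableParamGroupRef) are not expanded.

```python
import xml.etree.ElementTree as ET

# Assumed reading of the rule: a mass spectrum needs both of these arrays.
REQUIRED_ARRAYS = ("m/z array", "intensity array")

# Hypothetical, heavily simplified input; attributes other than name are omitted.
SAMPLE = """
<run>
  <spectrum id="S1">
    <cvParam name="mass spectrum"/>
    <binaryDataArray><cvParam name="m/z array"/></binaryDataArray>
    <binaryDataArray><cvParam name="intensity array"/></binaryDataArray>
  </spectrum>
  <spectrum id="S2">
    <cvParam name="mass spectrum"/>
    <binaryDataArray><cvParam name="m/z array"/></binaryDataArray>
  </spectrum>
</run>
"""


def missing_arrays(spectrum):
    """Return the required array names that a mass spectrum lacks."""
    names = {p.get("name") for p in spectrum.findall("cvParam")}
    if "mass spectrum" not in names:
        return []  # the rule applies to mass spectra only
    present = {
        p.get("name")
        for array in spectrum.iter("binaryDataArray")
        for p in array.findall("cvParam")
    }
    return [name for name in REQUIRED_ARRAYS if name not in present]


root = ET.fromstring(SAMPLE)
report = {s.get("id"): missing_arrays(s) for s in root.findall("spectrum")}
print(report)
```

Matching on term names keeps the sketch short; a real rule would more likely match on accessions and take the OBO hierarchy into account, as the minutes suggest.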
From: Marc S. <st...@in...> - 2008-07-07 13:34:22
Hi Angel,

we will have our own parser for two reasons
1) to avoid the overhead of converting the pwiz data structures to our data structures
2) to avoid another dependency (we already have a bunch)

Best,
Marc

Angel Pizarro wrote:
> Is the parser pwiz based or your own code? -angel
From: Angel P. <an...@ma...> - 2008-07-07 12:53:30
awesome. Is the parser pwiz based or your own code?

-angel

On Fri, Jul 4, 2008 at 5:13 AM, Marc Sturm <st...@in...> wrote:

> Hi all,
>
> attached is a list of TOPP tools that will be able to process mzML (as soon
> as we have a stable implementation).
> I guess a "TOPP software" subsection in the "software" section would be
> best.
>
> Thanks in advance to whoever adds our tools to the CV.
>
> Best,
> Marc
>
> TOPP software -- TOPP (The OpenMS proteomics pipeline) software for mass
> spectrometry.
> |- BaselineFilter -- Removes the baseline from profile spectra using a
> top-hat filter.
> |- DBExporter -- Exports data from an OpenMS database to a file.
> |- DBImporter -- Imports data to an OpenMS database.
> |- FileConverter -- Converts between different MS file formats.
> |- FileFilter -- Extracts or manipulates portions of data from peak,
> feature or consensus feature files.
> |- FileMerger -- Merges several MS files into one file.
> |- InternalCalibration -- Applies an internal calibration.
> |- MapAligner -- Corrects retention time distortions between maps.
> |- MapNormalizer -- Normalizes peak intensities in an MS run.
> |- NoiseFilter -- Removes noise from profile spectra by using different
> smoothing techniques.
> |- PeakPicker -- Finds mass spectrometric peaks in profile mass spectra.
> |- Resampler -- Transforms an LC/MS map into a resampled map or a png
> image.
> |- SpectraFilter -- Applies a filter to peak spectra.
> |- TOFCalibration -- Applies time of flight calibration.
From: Eric D. <ede...@sy...> - 2008-07-07 05:41:19
|
Hi everyone, the PSI Mass Spectrometry Standards Working Group call is Monday July 7 at 9am PDT:
http://www.timeanddate.com/worldclock/fixedtime.html?day=7&month=7&year=2008&hour=17&min=0&sec=0&p1=136

+ Germany: 08001012079
+ Switzerland: 0800000860
+ UK: 08081095644
+ USA: 1-866-314-3683
+ Generic international: +44 2083222500 (UK number)
access code: 297427

The agenda will be to review some recent discussions, the mzML implementers table, and other items about the CV and such.

Topics:
- MIAPE example document
- Other example documents
- CV
- CV template updates to vendors
- Documentation
- Web site
- mzML support information table
- Validator
- Amsterdam
- Next call

Thanks, Eric |
From: Marc S. <st...@in...> - 2008-07-04 09:13:17
|
Hi all,

attached is a list of TOPP tools that will be able to process mzML (as soon as we have a stable implementation). I guess a "TOPP software" subsection in the "software" section would be best.

Thanks in advance to whoever adds our tools to the CV.

Best, Marc

TOPP software -- TOPP (The OpenMS proteomics pipeline) software for mass spectrometry.
|- BaselineFilter -- Removes the baseline from profile spectra using a top-hat filter.
|- DBExporter -- Exports data from an OpenMS database to a file.
|- DBImporter -- Imports data to an OpenMS database.
|- FileConverter -- Converts between different MS file formats.
|- FileFilter -- Extracts or manipulates portions of data from peak, feature or consensus feature files.
|- FileMerger -- Merges several MS files into one file.
|- InternalCalibration -- Applies an internal calibration.
|- MapAligner -- Corrects retention time distortions between maps.
|- MapNormalizer -- Normalizes peak intensities in an MS run.
|- NoiseFilter -- Removes noise from profile spectra by using different smoothing techniques.
|- PeakPicker -- Finds mass spectrometric peaks in profile mass spectra.
|- Resampler -- Transforms an LC/MS map into a resampled map or a png image.
|- SpectraFilter -- Applies a filter to peak spectra.
|- TOFCalibration -- Applies time of flight calibration. |
From: Pierre-Alain B. <pie...@is...> - 2008-06-27 08:24:41
|
Hi, in general terms, userParams are sets of params that are difficult to align in a common CV and that might be tools specific. In order to comply with MIAPE (and in more general term with the idea that the provided information should be sufficient to understand how the data are obtained, in technical terms), tools are knowing themselves what are the relevant and important parameters to provide. It is therefore the responsibility of the tool provider to write a mzML document with the appropriate set of params (both cvParams and userParams) in order to sufficiently annotate the supported data. I understand that it might be difficult to constrain userParams uses, but the tools might be generating a doc that defines their own params definitions for 3rd party tools that would need these terms. Best Pierre-Alain Marc Sturm wrote: > Hi Mat, > >> I admit I was hasty to say there was no difference. As you point out, >> the CV way makes it a "categorized" comment wheres the userParam is >> totally uncategorized. I still think a special term for such a comment >> is counter-productive for encouraging inter-compatible software. >> >> Instead, a small modification to your software would allow you to >> enumerate all the userParams and cvParams in the array and output them >> in name-value pairs. So you'd have: >> array 0: >> name="array name" type="array type" units="signal-to-noise ratio" >> min: 234 >> max: 435435 >> avg: 4545 >> >> Putting in "categorized" comment terms is a slippery slope IMO. It would >> lead to other similar terms in other places and ultimately we'd be like >> mzData with little or no control over the values of string variables, >> doom and gloom notwithstanding. ;) At least is could lead to >> implementations of mzML that are widely incompatible. >> >> > No offense , but your suggestion is only a workaround and ignores the > real problem. What if there are 10, 20 or 50. We also have a GUI for the > statistics output. 
There, we simply do not have the space to display all > userParams. > > I still think there should be a designated place for a user-definable > name or identifier. Otherwise, the meaning of all arrays that have no > explicit CV name is lost (@Eric - this is what i meant when i said that > the userParam is unusable for other tools). I really do not care, how > this is implemented, but it should exist. Putting vital information like > this into userParam is the best way to produce non-inter-compatible > files IMHO. > >> As for 528 parameters across all your tools, how many of those are just >> for your data processing algorithms (that take raw data mzML as input >> and could write processed mzML as output, as opposed to a search engine >> which would write pepXML or analysisXML)? For dataProcessing, I at least >> would prefer algorithm-centric terms instead of tool-centric terms since >> many tools could implement the same algorithm and more importantly, many >> tools will run multiple algorithms. So we would have a term for boxcar >> smoothing, Savitzky-Golay smoothing, etc. Some algorithms might be >> coupled tightly with proprietary tools (e.g. Mascot Distiller or Protein >> Pilot's peak picking) but we can still call it something like the >> "Protein Pilot Peak Picker". :) >> >> > All of out tools only perform one algorithm. The parameters are used to > fine-tune the behavior of the algorithm. They are quite > implementation-specific and therefor sometimes change. We can see about > that after we've compiled the list of tools and short descriptions. > > > Best, > Marc > > > ------------------------------------------------------------------------- > Check out the new SourceForge.net Marketplace. > It's the best place to buy or sell services for > just about anything Open Source. > http://sourceforge.net/services/buy/index.php > _______________________________________________ > Psidev-ms-dev mailing list > Psi...@li... 
> https://lists.sourceforge.net/lists/listinfo/psidev-ms-dev > > |
From: Marc S. <st...@in...> - 2008-06-26 06:45:39
|
Hi Matt,

> I admit I was hasty to say there was no difference. As you point out, the CV way makes it a "categorized" comment whereas the userParam is totally uncategorized. I still think a special term for such a comment is counter-productive for encouraging inter-compatible software.
>
> Instead, a small modification to your software would allow you to enumerate all the userParams and cvParams in the array and output them in name-value pairs. So you'd have:
> array 0:
> name="array name" type="array type" units="signal-to-noise ratio"
> min: 234
> max: 435435
> avg: 4545
>
> Putting in "categorized" comment terms is a slippery slope IMO. It would lead to other similar terms in other places and ultimately we'd be like mzData with little or no control over the values of string variables, doom and gloom notwithstanding. ;) At the least, it could lead to implementations of mzML that are widely incompatible.

No offense, but your suggestion is only a workaround and ignores the real problem. What if there are 10, 20 or 50 of them? We also have a GUI for the statistics output. There, we simply do not have the space to display all userParams.

I still think there should be a designated place for a user-definable name or identifier. Otherwise, the meaning of all arrays that have no explicit CV name is lost (@Eric - this is what I meant when I said that the userParam is unusable for other tools). I really do not care how this is implemented, but it should exist. Putting vital information like this into a userParam is the best way to produce non-inter-compatible files IMHO.

> As for 528 parameters across all your tools, how many of those are just for your data processing algorithms (that take raw data mzML as input and could write processed mzML as output, as opposed to a search engine which would write pepXML or analysisXML)? For dataProcessing, I at least would prefer algorithm-centric terms instead of tool-centric terms since many tools could implement the same algorithm and more importantly, many tools will run multiple algorithms. So we would have a term for boxcar smoothing, Savitzky-Golay smoothing, etc. Some algorithms might be coupled tightly with proprietary tools (e.g. Mascot Distiller or Protein Pilot's peak picking) but we can still call it something like the "Protein Pilot Peak Picker". :)

All of our tools only perform one algorithm. The parameters are used to fine-tune the behavior of the algorithm. They are quite implementation-specific and therefore sometimes change. We can see about that after we've compiled the list of tools and short descriptions.

Best, Marc |
From: Eric D. <ede...@sy...> - 2008-06-26 03:17:11
|
Hi Marc, you make some good points, but it goes against our general design plan. We will have to discuss this at the next telecon, but I definitely want to hear from the other designers. One question. You say below: > Putting the name in the userParam is not a good idea because it makes > these arrays unusable for other tools - too custom in my opinion. Why is a userParam unusable for other tools? Any tool can use a userParam or a cvParam as it sees fit. It's just that cvParams are officially sanctioned concepts and the userParam is whatever you want it to be. We'll discuss at the next call. Thanks, Eric > -----Original Message----- > From: psi...@li... [mailto:psidev-ms-dev- > bo...@li...] On Behalf Of Marc Sturm > Sent: Wednesday, June 25, 2008 12:17 AM > To: Mass spectrometry standard development > Subject: Re: [Psidev-ms-dev] mzML's binary arrays > > Hi Eric, > > i agree that less custom encoding is desirable to make files exchangable > between different tools. However in this case i would go both ways, > depending on the significance of the term. > > (1) > I use these arrays to store debug information of an algorithm and add > quite a few arrays, depending on the charge states i look at: > > pattern_score_1 > pattern_score_2 > pattern_score_... (one per charge) > intensity_score > local_maximum > trace_score > > Putting the name in the userParam is not a good idea because it makes > these arrays unusable for other tools - too custom in my opinion. > Adding a CV term for each such debug variable would however be too much. > So i think the intermediate way is just right: > Terms which are not generally usable should be put to a 'named custom > array'. This would correpond to an optional XML attribute 'name' for the > 'binaryDataArray' tag. > But we have to state clearly in the documentation that for more general > terms, a CV entry should be added. > > (2) > After peak picking we store much more information than the position and > intensity. 
The arrays there are: > > SignalToNoise > fwhm > leftWidth > rightWidth > maximumIntensity > peakShape > rValue > > 'SignalToNoise' is alread a CV term. 'fwhm' would be a good candidate > for a CV term as well. > The rest is more algorihtm-dependent and no general concept which is why > we could simply store them in a 'named custom array'. > > What do you think? > > Best, > Marc > > > Hi Marc, I think we would be better off creating CV terms for all the > > kinds of arrays people want to encode. So I'm much rather get a request > > that someone's software wants to write out "full width at half maximum" > > and create a term, furnish an accession number, and thereby publicly let > > all writer and reader authors know that this is a legal entity that > > could occur. No schema change is necessary. > > > > I find this preferable to having a vague slot that could be filled with > > > > full width at half maximum > > full width at half max > > FWHM > > > > in an uncontrolled and variable way. > > > > This is our general aim for mzML. We would like to steer away from > > custom ways of encoding data as much as possible. > > > > Does that seem reasonable? > > > > Would you like "full width at half maximum" to be added to the CV? > > > > ------------------------------------------------------------------------ - > Check out the new SourceForge.net Marketplace. > It's the best place to buy or sell services for > just about anything Open Source. > http://sourceforge.net/services/buy/index.php > _______________________________________________ > Psidev-ms-dev mailing list > Psi...@li... > https://lists.sourceforge.net/lists/listinfo/psidev-ms-dev |
From: Matthew C. <mat...@va...> - 2008-06-25 15:03:32
|
Hi Marc, I admit I was hasty to say there was no difference. As you point out, the CV way makes it a "categorized" comment whereas the userParam is totally uncategorized. I still think a special term for such a comment is counter-productive for encouraging inter-compatible software. Instead, a small modification to your software would allow you to enumerate all the userParams and cvParams in the array and output them in name-value pairs. So you'd have: array 0: name="array name" type="array type" units="signal-to-noise ratio" min: 234 max: 435435 avg: 4545 Putting in "categorized" comment terms is a slippery slope IMO. It would lead to other similar terms in other places and ultimately we'd be like mzData with little or no control over the values of string variables, doom and gloom notwithstanding. ;) At the least, it could lead to implementations of mzML that are widely incompatible. As for 528 parameters across all your tools, how many of those are just for your data processing algorithms (that take raw data mzML as input and could write processed mzML as output, as opposed to a search engine which would write pepXML or analysisXML)? For dataProcessing, I at least would prefer algorithm-centric terms instead of tool-centric terms since many tools could implement the same algorithm and, more importantly, many tools will run multiple algorithms. So we would have a term for boxcar smoothing, Savitzky-Golay smoothing, etc. Some algorithms might be coupled tightly with proprietary tools (e.g. Mascot Distiller or Protein Pilot's peak picking) but we can still call it something like the "Protein Pilot Peak Picker". :) -Matt Marc Sturm wrote: > Hi Matt, > >> I can't think of any notable semantic difference between userParams and >> cvParams with uncontrolled string values. In both cases, you would have >> to deal with an extra term that only your algorithm and its downstream >> users know about. 
In both cases, the extra algorithm-specific arrays are >> unusable for other tools (except ones you make of course or that are >> made specifically to work with it). In both cases, the uncontrolled >> string value cannot be relied on except in very controlled circumstances. >> >> > That's not entirely true in my opinion. We have a small command line > tool that can display statistics about these arrays. > Without a defined way to give the array a name the statistics would look > like that: > > array 0: > min: 234 > max: 435435 > avg: 4545 > array 1: > min: 234 > max: 435435 > avg: 4545 > > With a defined way of naming arrays it would look like that: > > array 'some descritpion of the array content 1': > min: 234 > max: 435435 > avg: 4545 > array 'some descritpion of the array content 2': > min: 234 > max: 435435 > avg: 4545 > > At least to me the second alternative looks much better. Of cause we can > store the name in the userParam 'name'. > But other tools would store it in the userParam 'Name' or 'custom_name' > or 'custom name' ... > I really think there should be a controlled way to give an array a > user-defined name. > There was a way in mzData (optional XML attribute 'name') and it's a > step back not to have one in mzML. > > >> However, if your peak picking algorithm is versioned, it's exactly the >> kind of thing we want in the CV. We want a term to briefly describe the >> algorithm (which would go in dataProcessing) and also terms to describe >> the parameters that a user can set. At the same time, CV terms for your >> custom extra arrays could be added as well. >> >> > We'll compile a list of the TOPP tools with short descriptions and post > it on this mailing list. > The parameters will most likely not be included as the current count of > parameters for all tools is 538 and they might change from release to > release. 
> > Best, > Marc > > ------------------------------------------------------------------------- > Check out the new SourceForge.net Marketplace. > It's the best place to buy or sell services for > just about anything Open Source. > http://sourceforge.net/services/buy/index.php > _______________________________________________ > Psidev-ms-dev mailing list > Psi...@li... > https://lists.sourceforge.net/lists/listinfo/psidev-ms-dev > > |
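Matt's suggestion above, enumerating every cvParam and userParam attached to an array and printing them as name-value pairs, might look roughly like the following Python sketch. The XML fragment is a made-up, namespace-free example (real mzML elements carry an XML namespace), the userParam name "array name" is just an illustration, and `describe_array` and `format_header` are hypothetical helpers, not part of any existing tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical, namespace-free fragment for illustration only.
SAMPLE = """
<binaryDataArray encodedLength="0">
  <cvParam cvRef="MS" accession="MS:1000523" name="64-bit float" value=""/>
  <userParam name="array name" value="pattern_score_1"/>
  <binary></binary>
</binaryDataArray>
"""


def describe_array(xml_text):
    """Return (kind, name, value) for every cvParam/userParam child, in document order."""
    elem = ET.fromstring(xml_text.strip())
    pairs = []
    for child in elem:
        if child.tag in ("cvParam", "userParam"):
            pairs.append((child.tag, child.get("name"), child.get("value", "")))
    return pairs


def format_header(index, pairs):
    """Render the params as a one-line header of name="value" pairs."""
    rendered = " ".join('%s="%s"' % (name, value) for _, name, value in pairs)
    return "array %d: %s" % (index, rendered)


if __name__ == "__main__":
    print(format_header(0, describe_array(SAMPLE)))
```

The sketch also shows the cost Marc objects to: the header grows with every param attached to the array, so a display with limited space still needs some rule for picking the one param that acts as the array's name.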
From: Marc S. <st...@in...> - 2008-06-25 14:15:42
|
Hi Matt,

> I can't think of any notable semantic difference between userParams and cvParams with uncontrolled string values. In both cases, you would have to deal with an extra term that only your algorithm and its downstream users know about. In both cases, the extra algorithm-specific arrays are unusable for other tools (except ones you make of course or that are made specifically to work with it). In both cases, the uncontrolled string value cannot be relied on except in very controlled circumstances.

That's not entirely true in my opinion. We have a small command line tool that can display statistics about these arrays. Without a defined way to give the array a name, the statistics would look like this:

array 0:
  min: 234
  max: 435435
  avg: 4545
array 1:
  min: 234
  max: 435435
  avg: 4545

With a defined way of naming arrays, it would look like this:

array 'some description of the array content 1':
  min: 234
  max: 435435
  avg: 4545
array 'some description of the array content 2':
  min: 234
  max: 435435
  avg: 4545

At least to me the second alternative looks much better. Of course we can store the name in the userParam 'name'. But other tools would store it in the userParam 'Name' or 'custom_name' or 'custom name' ... I really think there should be a controlled way to give an array a user-defined name. There was a way in mzData (optional XML attribute 'name') and it's a step back not to have one in mzML.

> However, if your peak picking algorithm is versioned, it's exactly the kind of thing we want in the CV. We want a term to briefly describe the algorithm (which would go in dataProcessing) and also terms to describe the parameters that a user can set. At the same time, CV terms for your custom extra arrays could be added as well.

We'll compile a list of the TOPP tools with short descriptions and post it on this mailing list. The parameters will most likely not be included as the current count of parameters for all tools is 538 and they might change from release to release.

Best, Marc |
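The difference between the two statistics outputs above can be sketched in a few lines of Python. This is not the actual TOPP tool; `summarize` is a hypothetical helper that takes (name, values) pairs and falls back to the array index when no name is available.

```python
def summarize(arrays):
    """Build statistics lines for (name, values) pairs; name may be None."""
    lines = []
    for index, (name, values) in enumerate(arrays):
        # Use the array's name when one is known, otherwise only its position.
        label = "array '%s'" % name if name else "array %d" % index
        lines.append("%s:" % label)
        lines.append("  min: %s" % min(values))
        lines.append("  max: %s" % max(values))
        lines.append("  avg: %s" % (sum(values) / len(values)))
    return lines


if __name__ == "__main__":
    demo = [(None, [1.0, 2.0, 3.0]), ("fwhm", [2.0, 4.0])]
    print("\n".join(summarize(demo)))
```

Where the name comes from (a CV term, a userParam, or a dedicated attribute) is exactly the open question in this thread; the sketch only shows that the reader needs a single, predictable place to look.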
From: Slotta, D. (NIH/NLM/N. [E] <sl...@nc...> - 2008-06-25 13:44:40
|
A translation for those of you lucky enough not have an office mate whose native language is German: "Hey, If I am reading this right, the FeatureFinder will soon spit out mzML as well as FeatureXML?" Greetings, Steffen" Douglas > -----Original Message----- > From: sneumann [mailto:sne...@ip...] > Sent: Wednesday, June 25, 2008 6:23 AM > To: Mass spectrometry standard development > Subject: Re: [Psidev-ms-dev] mzML's binary arrays > > Moin! > > Sehe ich das richtig, dass der FeatureFinder demnächst neben FeatureXML > auch mzML ausspuckt ? > > Gruss, > Steffen > > On Mi, 2008-06-25 at 09:16 +0200, Marc Sturm wrote: > > Hi Eric, > > > > i agree that less custom encoding is desirable to make files > > exchangable between different tools. However in this case i would go > > both ways, depending on the significance of the term. > > > > (1) > > I use these arrays to store debug information of an algorithm and add > > quite a few arrays, depending on the charge states i look at: > > > > pattern_score_1 > > pattern_score_2 > > pattern_score_... (one per charge) > > intensity_score > > local_maximum > > trace_score > > > > Putting the name in the userParam is not a good idea because it makes > > these arrays unusable for other tools - too custom in my opinion. > > Adding a CV term for each such debug variable would however be too > much. > > So i think the intermediate way is just right: > > Terms which are not generally usable should be put to a 'named custom > > array'. This would correpond to an optional XML attribute 'name' for > > the 'binaryDataArray' tag. > > But we have to state clearly in the documentation that for more > > general terms, a CV entry should be added. > > > > (2) > > After peak picking we store much more information than the position > > and intensity. The arrays there are: > > > > SignalToNoise > > fwhm > > leftWidth > > rightWidth > > maximumIntensity > > peakShape > > rValue > > > > 'SignalToNoise' is alread a CV term. 
'fwhm' would be a good candidate > > for a CV term as well. > > The rest is more algorihtm-dependent and no general concept which is > > why we could simply store them in a 'named custom array'. > > > > What do you think? > > > > Best, > > Marc > > > > > Hi Marc, I think we would be better off creating CV terms for all > > > the kinds of arrays people want to encode. So I'm much rather get a > > > request that someone's software wants to write out "full width at > half maximum" > > > and create a term, furnish an accession number, and thereby > publicly > > > let all writer and reader authors know that this is a legal entity > > > that could occur. No schema change is necessary. > > > > > > I find this preferable to having a vague slot that could be filled > > > with > > > > > > full width at half maximum > > > full width at half max > > > FWHM > > > > > > in an uncontrolled and variable way. > > > > > > This is our general aim for mzML. We would like to steer away from > > > custom ways of encoding data as much as possible. > > > > > > Does that seem reasonable? > > > > > > Would you like "full width at half maximum" to be added to the CV? > > > > > > > > --------------------------------------------------------------------- > - > > --- Check out the new SourceForge.net Marketplace. > > It's the best place to buy or sell services for just about anything > > Open Source. > > http://sourceforge.net/services/buy/index.php > > _______________________________________________ > > Psidev-ms-dev mailing list > > Psi...@li... > > https://lists.sourceforge.net/lists/listinfo/psidev-ms-dev > -- > IPB Halle AG Massenspektrometrie & Bioinformatik > Dr. Steffen Neumann http://www.IPB-Halle.DE > Weinberg 3 http://msbi.bic-gh.de > 06120 Halle Tel. +49 (0) 345 5582 - 1470 > +49 (0) 345 5582 - 0 > sneumann(at)IPB-Halle.DE Fax. +49 (0) 345 5582 - 1409 |
From: Matt C. <mat...@va...> - 2008-06-25 12:50:27
|
Hi Marc, I can't think of any notable semantic difference between userParams and cvParams with uncontrolled string values. In both cases, you would have to deal with an extra term that only your algorithm and its downstream users know about. In both cases, the extra algorithm-specific arrays are unusable for other tools (except ones you make of course or that are made specifically to work with it). In both cases, the uncontrolled string value cannot be relied on except in very controlled circumstances. However, if your peak picking algorithm is versioned, it's exactly the kind of thing we want in the CV. We want a term to briefly describe the algorithm (which would go in dataProcessing) and also terms to describe the parameters that a user can set. At the same time, CV terms for your custom extra arrays could be added as well. -Matt Marc Sturm wrote: > Hi Eric, > > i agree that less custom encoding is desirable to make files exchangable > between different tools. However in this case i would go both ways, > depending on the significance of the term. > > (1) > I use these arrays to store debug information of an algorithm and add > quite a few arrays, depending on the charge states i look at: > > pattern_score_1 > pattern_score_2 > pattern_score_... (one per charge) > intensity_score > local_maximum > trace_score > > Putting the name in the userParam is not a good idea because it makes > these arrays unusable for other tools - too custom in my opinion. > Adding a CV term for each such debug variable would however be too much. > So i think the intermediate way is just right: > Terms which are not generally usable should be put to a 'named custom > array'. This would correpond to an optional XML attribute 'name' for the > 'binaryDataArray' tag. > But we have to state clearly in the documentation that for more general > terms, a CV entry should be added. > > (2) > After peak picking we store much more information than the position and > intensity. 
The arrays there are: > > SignalToNoise > fwhm > leftWidth > rightWidth > maximumIntensity > peakShape > rValue > > 'SignalToNoise' is alread a CV term. 'fwhm' would be a good candidate > for a CV term as well. > The rest is more algorihtm-dependent and no general concept which is why > we could simply store them in a 'named custom array'. > > What do you think? > > Best, > Marc > > >> Hi Marc, I think we would be better off creating CV terms for all the >> kinds of arrays people want to encode. So I'm much rather get a request >> that someone's software wants to write out "full width at half maximum" >> and create a term, furnish an accession number, and thereby publicly let >> all writer and reader authors know that this is a legal entity that >> could occur. No schema change is necessary. >> >> I find this preferable to having a vague slot that could be filled with >> >> full width at half maximum >> full width at half max >> FWHM >> >> in an uncontrolled and variable way. >> >> This is our general aim for mzML. We would like to steer away from >> custom ways of encoding data as much as possible. >> >> Does that seem reasonable? >> >> Would you like "full width at half maximum" to be added to the CV? >> |
From: sneumann <sne...@ip...> - 2008-06-25 10:24:52
|
Hi! Am I reading this right that the FeatureFinder will soon output mzML as well as FeatureXML? Regards, Steffen On Wed, 2008-06-25 at 09:16 +0200, Marc Sturm wrote: > Hi Eric, > > i agree that less custom encoding is desirable to make files exchangable > between different tools. However in this case i would go both ways, > depending on the significance of the term. > > (1) > I use these arrays to store debug information of an algorithm and add > quite a few arrays, depending on the charge states i look at: > > pattern_score_1 > pattern_score_2 > pattern_score_... (one per charge) > intensity_score > local_maximum > trace_score > > Putting the name in the userParam is not a good idea because it makes > these arrays unusable for other tools - too custom in my opinion. > Adding a CV term for each such debug variable would however be too much. > So i think the intermediate way is just right: > Terms which are not generally usable should be put to a 'named custom > array'. This would correpond to an optional XML attribute 'name' for the > 'binaryDataArray' tag. > But we have to state clearly in the documentation that for more general > terms, a CV entry should be added. > > (2) > After peak picking we store much more information than the position and > intensity. The arrays there are: > > SignalToNoise > fwhm > leftWidth > rightWidth > maximumIntensity > peakShape > rValue > > 'SignalToNoise' is alread a CV term. 'fwhm' would be a good candidate > for a CV term as well. > The rest is more algorihtm-dependent and no general concept which is why > we could simply store them in a 'named custom array'. > > What do you think? > > Best, > Marc > > > Hi Marc, I think we would be better off creating CV terms for all the > > kinds of arrays people want to encode. 
So I'm much rather get a request > > that someone's software wants to write out "full width at half maximum" > > and create a term, furnish an accession number, and thereby publicly let > > all writer and reader authors know that this is a legal entity that > > could occur. No schema change is necessary. > > > > I find this preferable to having a vague slot that could be filled with > > > > full width at half maximum > > full width at half max > > FWHM > > > > in an uncontrolled and variable way. > > > > This is our general aim for mzML. We would like to steer away from > > custom ways of encoding data as much as possible. > > > > Does that seem reasonable? > > > > Would you like "full width at half maximum" to be added to the CV? > > > > ------------------------------------------------------------------------- > Check out the new SourceForge.net Marketplace. > It's the best place to buy or sell services for > just about anything Open Source. > http://sourceforge.net/services/buy/index.php > _______________________________________________ > Psidev-ms-dev mailing list > Psi...@li... > https://lists.sourceforge.net/lists/listinfo/psidev-ms-dev -- IPB Halle AG Massenspektrometrie & Bioinformatik Dr. Steffen Neumann http://www.IPB-Halle.DE Weinberg 3 http://msbi.bic-gh.de 06120 Halle Tel. +49 (0) 345 5582 - 1470 +49 (0) 345 5582 - 0 sneumann(at)IPB-Halle.DE Fax. +49 (0) 345 5582 - 1409 |
From: Marc S. <st...@in...> - 2008-06-25 07:16:29
|
Hi Eric,

I agree that less custom encoding is desirable to make files exchangeable between different tools. However, in this case I would go both ways, depending on the significance of the term.

(1)
I use these arrays to store debug information of an algorithm and add quite a few arrays, depending on the charge states I look at:

pattern_score_1
pattern_score_2
pattern_score_... (one per charge)
intensity_score
local_maximum
trace_score

Putting the name in the userParam is not a good idea because it makes these arrays unusable for other tools - too custom in my opinion. Adding a CV term for each such debug variable would, however, be too much. So I think the intermediate way is just right: terms which are not generally usable should be put in a 'named custom array'. This would correspond to an optional XML attribute 'name' for the 'binaryDataArray' tag. But we have to state clearly in the documentation that for more general terms, a CV entry should be added.

(2)
After peak picking we store much more information than the position and intensity. The arrays there are:

SignalToNoise
fwhm
leftWidth
rightWidth
maximumIntensity
peakShape
rValue

'SignalToNoise' is already a CV term. 'fwhm' would be a good candidate for a CV term as well. The rest is more algorithm-dependent and not a general concept, which is why we could simply store them in a 'named custom array'.

What do you think?

Best, Marc

> Hi Marc, I think we would be better off creating CV terms for all the kinds of arrays people want to encode. So I'd much rather get a request that someone's software wants to write out "full width at half maximum" and create a term, furnish an accession number, and thereby publicly let all writer and reader authors know that this is a legal entity that could occur. No schema change is necessary.
>
> I find this preferable to having a vague slot that could be filled with
>
> full width at half maximum
> full width at half max
> FWHM
>
> in an uncontrolled and variable way.
>
> This is our general aim for mzML. We would like to steer away from custom ways of encoding data as much as possible.
>
> Does that seem reasonable?
>
> Would you like "full width at half maximum" to be added to the CV? |
From: Eric D. <ede...@sy...> - 2008-06-24 15:48:15
|
Hi Marc,

I think we would be better off creating CV terms for all the kinds of arrays people want to encode. So I'd much rather get a request that someone's software wants to write out "full width at half maximum" and create a term, furnish an accession number, and thereby publicly let all writer and reader authors know that this is a legal entity that could occur. No schema change is necessary.

I find this preferable to having a vague slot that could be filled with

full width at half maximum
full width at half max
FWHM

in an uncontrolled and variable way.

This is our general aim for mzML. We would like to steer away from custom ways of encoding data as much as possible.

Does that seem reasonable?

Would you like "full width at half maximum" to be added to the CV?

Thanks, Eric

> -----Original Message-----
> From: psi...@li... [mailto:psidev-ms-dev-bo...@li...] On Behalf Of Marc Sturm
> Sent: Monday, June 23, 2008 11:14 PM
> To: Mass spectrometry standard development
> Subject: Re: [Psidev-ms-dev] mzML's binary arrays
>
> Hi all,
>
> i think there should be a term 'named custom array' (we can discuss about the name) that contains the name of the array in the 'value' attribute. Putting the name in the UserParam is too unstructured in my opinion. There won't be two tools that store the name in the same way...
>
> <binaryDataArray encodedLength="12">
>   <cvParam cvRef="MS" accession="MS:1000523" name="64-bit float" value=""/>
>   <cvParam cvRef="MS" accession="MS:1000576" name="no compression" value=""/>
>   <cvParam cvRef="MS" accession="MS:????????" name="named custom array" value="full width at half max"/>
>   <binary>AAAAAAAANEAAAAAAAA</binary>
> </binaryDataArray>
>
> What do you think?
>
> Best,
> Marc
>
> > If the supplemental array is defined in the CV (right now it's just m/z, time, and intensity), you can use a CV term. Otherwise, you'll have to use a userParam. If it's a kind of array you think should be in the CV, you can make that suggestion as well. I know we should probably put transient array in there at least (time array can be used for time domain data). For something like charge states, that's probably going to stay a userParam AFAIK. |
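For readers unfamiliar with the element quoted above, here is a rough Python sketch of reading such a `binaryDataArray`: collect the cvParam accessions, then base64-decode the payload as 64-bit floats. It assumes little-endian, uncompressed data and a namespace-free fragment (real mzML elements carry an XML namespace). The sample XML is generated by the sketch itself rather than taken from a real file, and all function names are hypothetical.

```python
import base64
import struct
import xml.etree.ElementTree as ET

FLOAT64 = "MS:1000523"         # "64-bit float", as in the quoted fragment
NO_COMPRESSION = "MS:1000576"  # "no compression", as in the quoted fragment


def encode_doubles(values):
    """Pack values as little-endian doubles (an assumption) and base64-encode them."""
    packed = struct.pack("<%dd" % len(values), *values)
    return base64.b64encode(packed).decode("ascii")


def make_sample(values):
    """Build a namespace-free binaryDataArray fragment for illustration."""
    payload = encode_doubles(values)
    return (
        '<binaryDataArray encodedLength="%d">'
        '<cvParam cvRef="MS" accession="%s" name="64-bit float" value=""/>'
        '<cvParam cvRef="MS" accession="%s" name="no compression" value=""/>'
        '<binary>%s</binary>'
        '</binaryDataArray>' % (len(payload), FLOAT64, NO_COMPRESSION, payload)
    )


def decode_binary_data_array(xml_text):
    """Return the decoded values; only uncompressed 64-bit floats are handled."""
    elem = ET.fromstring(xml_text)
    accessions = {p.get("accession") for p in elem.findall("cvParam")}
    if not {FLOAT64, NO_COMPRESSION} <= accessions:
        raise ValueError("sketch only handles uncompressed 64-bit floats")
    raw = base64.b64decode(elem.findtext("binary", default=""))
    return list(struct.unpack("<%dd" % (len(raw) // 8), raw))


if __name__ == "__main__":
    print(decode_binary_data_array(make_sample([20.0, 1.5])))
```

Note that the reader decides how to interpret the bytes purely from accession numbers, which is the property Eric argues for: a new array type needs a new CV term, not a schema change or a free-text label.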