From: Andy J. <aj...@cs...> - 2005-11-29 12:00:26
|
Hi,

I'm trying to convert data from various different instruments into mzData and I have a few questions.

1. One of the instruments produces plain text output for the peak list (peak [tab] intensity). Does anyone have a script or some code for turning this into the mzData base 64 binary? Otherwise, any advice for how best to do this would be welcome.
2. What is the current time scale for the updated version of mzData that is being discussed? If I produce parsers that convert to mzData v 1.05, will I need to re-write them fairly soon for the next version?
3. What are the current plans with respect to future versions of mzXML and mzData? I have tried out the ProteomeSystems converter. It runs very slowly over large files but it does produce valid mzData files, although I don't know how to check if the peak list conversion is correct. Is this a viable way of producing mzData if I can get mzXML files first? Does anyone have any experience of using the converter in practice?

Any advice would be most appreciated, cheers,

Andy |
From: simon a. (B. <sim...@bb...> - 2005-11-29 13:54:22
|
On 29 Nov 2005, at 12:01, Andy Jones wrote:

> Hi,
> I'm trying to convert data from various different instruments into mzData and I have a few questions.
>
> 1. One of the instruments produces plain text output for the peak list (peak [tab] intensity). Does anyone have a script or some code for turning this into the mzData base 64 binary. Otherwise, any advice for how best to do this would be welcome.

You didn't mention which language you were using to do your processing. When I was decoding Base64 in Java I used this nice little (free) library, which also does encoding.

http://iharder.sourceforge.net/base64/

Before the Base64 encoding you'll need to convert your float/double values to byte arrays (4 bytes per float or 8 bytes per double), then encode the byte array. The Float and Double classes have built-in methods to convert to int / long and then you just need to split these down to their component bytes.

e.g. for a float f, to get the 4 little-endian bytes would be

    int i = Float.floatToIntBits(f);
    byte[] bytes = new byte[4];
    bytes[0] = (byte) (i & 0xFF);
    bytes[1] = (byte) ((i >> 8) & 0xFF);
    bytes[2] = (byte) ((i >> 16) & 0xFF);
    bytes[3] = (byte) ((i >> 24) & 0xFF);

You'd then encode this to get the string you include in the mzData file:

    String s = Base64.encodeBytes(bytes);

Hope this helps

Simon.

--
Simon Andrews PhD
Bioinformatics Dept.
The Babraham Institute
sim...@bb...
+44 (0) 1223 496463 |
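The steps above can be sketched end to end for a tab-separated peak list. This is a minimal sketch, not anyone's production converter: it assumes Java 8 or later (so it uses the standard `java.util.Base64` class rather than the iharder library), 32-bit little-endian floats, and well-formed input with exactly two columns per line. The class and method names (`PeakListEncoder`, `encodeFloats`, `encodePeakList`) are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;

// Hypothetical helper: names are illustrative, not part of any mzData toolkit.
class PeakListEncoder {

    /** Packs the values as little-endian 32-bit floats and Base64-encodes the bytes. */
    static String encodeFloats(float[] values) {
        ByteBuffer buf = ByteBuffer.allocate(4 * values.length)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        for (float v : values) {
            buf.putFloat(v);
        }
        return Base64.getEncoder().encodeToString(buf.array());
    }

    /**
     * Parses "mz<TAB>intensity" lines and returns two Base64 strings:
     * index 0 holds the m/z vector, index 1 the intensity vector.
     */
    static String[] encodePeakList(String text) {
        List<float[]> peaks = new ArrayList<>();
        for (String line : text.split("\\R")) {
            line = line.trim();
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            String[] cols = line.split("\t");
            peaks.add(new float[] {
                Float.parseFloat(cols[0]), Float.parseFloat(cols[1])
            });
        }
        float[] mz = new float[peaks.size()];
        float[] intensity = new float[peaks.size()];
        for (int k = 0; k < peaks.size(); k++) {
            mz[k] = peaks.get(k)[0];
            intensity[k] = peaks.get(k)[1];
        }
        return new String[] { encodeFloats(mz), encodeFloats(intensity) };
    }

    public static void main(String[] args) {
        String[] encoded = encodePeakList("1.0\t2.0\n");
        System.out.println("m/z:       " + encoded[0]);
        System.out.println("intensity: " + encoded[1]);
    }
}
```

Using `ByteBuffer` with an explicit byte order does the same job as the manual shift-and-mask code, and switching to doubles only means replacing `putFloat` with `putDouble` and allocating 8 bytes per value.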
From: Randy J. <rkj...@mi...> - 2005-11-30 03:47:34
|
PSI-MS Developers,

Just a quick note regarding mzData 1.1 prototype. It now looks like the smart way to go is to provide a mzData 1.1 for review in January (as agreed in Geneva) with the goal of completing the release at the Spring 2006 meeting. My preference is for mzData 1.1 to be a maintenance release adding new 'optional' elements which mean anyone generating 1.05 files can continue to do so, and anyone who wants to use the new features of 1.1 will not break any current parsers. The main reason for the proposed schedule is that we are making plans to merge mzData with mzXML, and I don't see how this could be done easily before January allowing sufficient review time before a meeting in the Spring. What does seem possible is to keep mzData as stable as possible while addressing the main complaints regarding redundancy of instrument parameters by grouping of 'scans' into 'experiments' and providing a mechanism for providing all of the instrument parameters for a 'group' of scans. We could have the merged mzData/mzXML in draft mode by the Spring meeting (if we get help) and finalized by the Fall meeting which could mean adoption in the Fall, or if there is more refinement needed by the end of 2006.

Keeping mzData stable should allow developers to add tools without the worry you are expressing - that the thing will move too fast and prevent stable development. This is countered by people who say 'why should it take so long? fix it and be done with it...' We could use more feedback on this.

I have heard that the XSLT-based translator is slow on large files. This is because there is a script which splits the mzXML joint mz/inten data vector into the separated vectors used in mzData. This could be done very quickly in C/C++ or in Java. No one has written such a program yet, but given the example of the ProteomeSystems converter, it would not take anyone with a serious number of mzXML files to convert long to write - I am not aware of anyone who has done this yet.

I have also not heard of any attempt to resurrect the original Java GUI application Kai Runte wrote while at the EBI which could perform the type of ASCII->mzData conversion you mentioned. This code base is in the CVS tree and could be picked up by a Java-capable volunteer and made compatible with 1.05 (and beyond). Anyone wishing to work on this please contact me.

Finally, there are reports of a number of viewers - several groups seem to be working on them, and we have our own (which I'll share if you want), so I don't have experience with anyone else's. This is how I check the base64 strings. There are other base64/IEEE-float analysis tools which you could use if you want to strip out the string for testing, but I just use an mzData parsing viewer. It would be helpful if we organized the toolset for mzData and gave pointers to things like viewers, parsers and converters. If everyone who has a working mzData tool will drop a line to this mailing list, I will make sure the tools section gets updated.

Randy Julian |
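A minimal sketch of the split described above, written in Java rather than XSLT. It rests on assumptions that should be checked against the files being converted: that the mzXML peaks string holds 32-bit floats in network (big-endian) byte order as alternating m/z and intensity values, and that the mzData output is wanted as little-endian 32-bit floats. The class and method names (`PeakSplitter`, `split`) are hypothetical, and `java.util.Base64` requires Java 8 or later.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Base64;

// Hypothetical helper: not taken from the ProteomeSystems converter or any PSI tool.
class PeakSplitter {

    /**
     * Splits an interleaved (m/z, intensity) Base64 string into two Base64
     * strings: index 0 holds the m/z vector, index 1 the intensity vector.
     * Input is read as big-endian floats, output written as little-endian floats.
     */
    static String[] split(String interleavedBase64) {
        byte[] raw = Base64.getDecoder().decode(interleavedBase64);
        if (raw.length % 8 != 0) {
            throw new IllegalArgumentException(
                "Expected whole (m/z, intensity) float pairs, got " + raw.length + " bytes");
        }
        ByteBuffer in = ByteBuffer.wrap(raw).order(ByteOrder.BIG_ENDIAN);
        int pairs = raw.length / 8;

        ByteBuffer mz = ByteBuffer.allocate(4 * pairs).order(ByteOrder.LITTLE_ENDIAN);
        ByteBuffer intensity = ByteBuffer.allocate(4 * pairs).order(ByteOrder.LITTLE_ENDIAN);
        for (int k = 0; k < pairs; k++) {
            mz.putFloat(in.getFloat());        // first value of the pair
            intensity.putFloat(in.getFloat()); // second value of the pair
        }

        Base64.Encoder enc = Base64.getEncoder();
        return new String[] {
            enc.encodeToString(mz.array()),
            enc.encodeToString(intensity.array())
        };
    }

    public static void main(String[] args) {
        // One peak: m/z 1.0, intensity 2.0, as big-endian floats.
        String[] parts = split("P4AAAEAAAAA=");
        System.out.println("m/z:       " + parts[0]);
        System.out.println("intensity: " + parts[1]);
    }
}
```

The work per spectrum is a single pass over a byte array, which is why a compiled implementation should be far faster than doing the same split in a stylesheet script.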
From: Jimmy E. <jk...@gm...> - 2005-11-30 04:22:12
|
Andy, One nice peak list data conversion tool which is capable of taking peak lists in various ASCII formats (Mascot generic format, PKL, DTA, etc.) and outputting to mzData is the Peak List Conversion Utility available at the Proteome Commons website http://www.proteomecommons.org/tools.jsp#File%20Format%20Manipulation It's an easy to use Java Web Start app that might suit your needs. - Jimmy Andy Jones wrote: > > > Hi, > > > > I'm trying to convert data from various different instruments into > > mzData and I have a few questions. > > > > 1. One of the instruments produces plain text output for the peak > > list (peak [tab] intensity). Does anyone have a script or some > > code for turning this into the mzData base 64 binary. Otherwise, > > any advice for how best to do this would be welcome. > > > |
From: Alexandre M. <ale...@ge...> - 2005-12-05 14:50:35
|
Hi,

You may also have a look at the nearly released web tool

http://insilicospectro.vital-it.ch/cgi/cgiConvertSpectra.pl

regards
Alex

--
Alexandre Masselot, PhD
Senior bioinformatician
www.genebio.com
voice: +41 22 702 99 00 |
From: Brian P. <bri...@in...> - 2005-11-30 07:45:48
|
> If everyone who has a working mzData tool will drop a line to this mailing list, I will make sure the tools section gets updated.

InsilicosViewer (available at www.insilicos.com) supports mzData.

The Trans-Proteomic Pipeline from the ISB also supports mzData (see http://tools.proteomecenter.org/software.php).

- Brian Pratt, Insilicos |
From: simon a. (B. <sim...@bb...> - 2005-12-15 11:03:09
|
On 30 Nov 2005, at 03:47, Randy Julian wrote: > If everyone who has a working mzData tool will drop a line to this > mailing list, I will make sure the tools section gets updated. After an age of waiting for official permission to release our code I've put a couple of tools which make use of mzData up onto our website. mzViewer is a simple viewer for mzData files. It also provides an API to embed an mzData viewer into your own java applications. PIMS is a larger LIMS application for proteomics which keeps track of samples, protocols and MS data. Both can be downloaded from our projects page: http://www.bioinformatics.bbsrc.ac.uk/projects/ Both tools are pretty new so we'd appreciate bug reports if anyone has problems with either of them. Simon. -- Simon Andrews PhD Bioinformatics Dept. The Babraham Institute sim...@bb... +44 (0) 1223 496463 |
From: David C. <dc...@ma...> - 2005-12-15 12:02:26
|
Hi Simon, Looks useful, but unfortunately fails to open files exported from data systems from Thermo, Sciex and Waters. The error is similar in each case - for example: Byte data array had length 56 which isn't a multiple of 8 Hopefully somebody from one of these manufacturers will be able to send you an example file. David simon andrews (BI) wrote: > On 30 Nov 2005, at 03:47, Randy Julian wrote: > >> If everyone who has a working mzData tool will drop a line to this >> mailing list, I will make sure the tools section gets updated. > > > After an age of waiting for official permission to release our code I've > put a couple of tools which make use of mzData up onto our website. > > mzViewer is a simple viewer for mzData files. It also provides an API > to embed an mzData viewer into your own java applications. > > PIMS is a larger LIMS application for proteomics which keeps track of > samples, protocols and MS data. > > Both can be downloaded from our projects page: > > http://www.bioinformatics.bbsrc.ac.uk/projects/ > > Both tools are pretty new so we'd appreciate bug reports if anyone has > problems with either of them. > > Simon. > -- David Creasy Matrix Science 8 Wyndham Place London W1H 1PP Tel +44 (0)20 7723 2142 Fax +44 (0)20 7725 9360 dc...@ma... http://www.matrixscience.com |
From: simon a. (B. <sim...@bb...> - 2005-12-15 12:07:51
|
On 15 Dec 2005, at 12:02, David Creasy wrote: > Hi Simon, > Looks useful, but unfortunately fails to open files exported from data > systems from Thermo, Sciex and Waters. The error is similar in each > case - for example: > Byte data array had length 56 which isn't a multiple of 8 I was kind of expecting this. The parsers were written off the example mzData files at the psidev site. I'm sure that other implementations will bring out subtle bugs. > Hopefully somebody from one of these manufacturers will be able to > send you an example file. I'd appreciate that. Bugs can be sent to me, or filed at: http://www.bioinformatics.bbsrc.ac.uk/bugzilla/ Cheers Simon. -- Simon Andrews PhD Bioinformatics Dept. The Babraham Institute sim...@bb... +44 (0) 1223 496463 |
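A length error of this kind is what a decoder reports when the decoded byte count does not fit the value width it assumes. The following is a minimal sketch of a decoder that takes the width and byte order as inputs instead of hard-coding them; it is not mzViewer's code. The parameters `precision` (32 or 64) and `littleEndian` are assumptions modelled on what an mzData file is expected to declare for each data array, so check them against the schema. `PeakDecoder` is a hypothetical name, and `java.util.Base64` requires Java 8 or later.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Base64;

// Hypothetical helper: illustrative only, not the parser discussed in this thread.
class PeakDecoder {

    /**
     * Decodes a Base64 string into numeric values.
     *
     * @param base64       the encoded data array
     * @param precision    bits per value, 32 (float) or 64 (double)
     * @param littleEndian true for little-endian bytes, false for big-endian
     */
    static double[] decode(String base64, int precision, boolean littleEndian) {
        if (precision != 32 && precision != 64) {
            throw new IllegalArgumentException("Unsupported precision: " + precision);
        }
        byte[] raw = Base64.getDecoder().decode(base64);
        int width = precision / 8;
        if (raw.length % width != 0) {
            throw new IllegalArgumentException(
                "Byte data array had length " + raw.length
                + " which isn't a multiple of " + width);
        }
        ByteBuffer buf = ByteBuffer.wrap(raw)
            .order(littleEndian ? ByteOrder.LITTLE_ENDIAN : ByteOrder.BIG_ENDIAN);
        double[] values = new double[raw.length / width];
        for (int k = 0; k < values.length; k++) {
            values[k] = (precision == 32) ? buf.getFloat() : buf.getDouble();
        }
        return values;
    }

    public static void main(String[] args) {
        // A single little-endian 32-bit float.
        double[] values = decode("AACAPw==", 32, true);
        System.out.println(values.length + " value(s), first = " + values[0]);
    }
}
```

Decoding a string back to numbers this way is also one option for spot-checking a converter's output against the original peak list. If the Base64 text in a file is wrapped across lines, `Base64.getMimeDecoder()` tolerates the line breaks where the basic decoder does not.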
From: Chris T. <chr...@eb...> - 2005-12-15 13:42:14
|
Hiya. The viewer looks nice -- I got it to work with one of the example files (myo_full) and it is clean and simple. No need to RTFM :) On PIMS, I saw this first on a poster at BSPR and meant to follow up on a particular issue; are you aware of the FuGE project (fuge.sf.net)? The model will soon be at milestone 2 (hopefully all of the issues sorted, and all the required functionality in). At least one other free LIMS-like system is implementing FuGE already (CPAS from the Fred Hutch CRC in Seattle) and maybe even a commercial firm I know of might switch to it once it is properly stable. I wonder if you have the person hours at your disposal to make PIMS able to im/export in FuGE? A big job, and maybe a bit over the top if you are seeking only to support your immediate user base but it would be cool (sharing/repositing workflows, third party tool accessibility etc.). Another thing would be the (very new) FuGO project (ontology). This project needs contributors and in that respect I'm wondering how much of a controlled vocabulary you've accumulated in developing PIMS? Cheers, Chris. P.S. Where do you put stuff collected through PIMS -- do you have a bespoke repository? Is that public? Is it anything like this: http://www.ebi.ac.uk/pride/ ? simon andrews (BI) wrote: > On 30 Nov 2005, at 03:47, Randy Julian wrote: > >> If everyone who has a working mzData tool will drop a line to this >> mailing list, I will make sure the tools section gets updated. > > > After an age of waiting for official permission to release our code I've > put a couple of tools which make use of mzData up onto our website. > > mzViewer is a simple viewer for mzData files. It also provides an API > to embed an mzData viewer into your own java applications. > > PIMS is a larger LIMS application for proteomics which keeps track of > samples, protocols and MS data. 
> > Both can be downloaded from our projects page: > > http://www.bioinformatics.bbsrc.ac.uk/projects/ > > Both tools are pretty new so we'd appreciate bug reports if anyone has > problems with either of them. > > Simon. > -- ~~~~~~~~~~~~~~~~~~~~~~~~ chr...@eb... http://psidev.sf.net/ ~~~~~~~~~~~~~~~~~~~~~~~~ |
From: simon a. (B. <sim...@bb...> - 2005-12-15 14:13:01
|
On 15 Dec 2005, at 13:41, Chris Taylor wrote: > Hiya. The viewer looks nice -- I got it to work with one of the > example files (myo_full) and it is clean and simple. No need to RTFM > :) Probably a good job as there isn't a FM yet :) Actually a couple of reports have suggested that it has problems with other mzdata files (it was written using the examples off psidev as we use an ABI machine and don't have an official mzData converter yet). It would be really useful to get some more examples on the psidev examples page from the major instrument manufacturers to help with developing mzData tools. > On PIMS, I saw this first on a poster at BSPR and meant to follow up > on a particular issue; are you aware of the FuGE project > (fuge.sf.net)? The model will soon be at milestone 2 (hopefully all of > the issues sorted, and all the required functionality in). You mentioned it when we spoke at BSPR. I have looked at it but haven't gone any further in implementing it in PIMS. The scope of Fuge seems to be somewhat larger than what we had intended for PIMS. PIMS is really a glorified sample tracking system which also stores data. It does model the experimental process, but in a very simple way. I'm eventually aiming to extend it to store the analysis of MS data as well (mzIdent and the like), but trying to strictly model the whole workflow would make it more complex than we would either need or want. > Another thing would be the (very new) FuGO project (ontology). This > project needs contributors and in that respect I'm wondering how much > of a controlled vocabulary you've accumulated in developing PIMS? We've incorporated existing ontologies wherever we can (eg Brenda for tissues, EBI taxonomy and the appropriate limits from the psidev ontology), but haven't developed our own. FuGO sounds interesting - I'm happy to extend our ontology support with whatever comes out of it though I don't know how useful I'd be at contributing to the actual ontology construction. > P.S. 
Where do you put stuff collected through PIMS -- do you have a > bespoke repository? Is that public? Is it anything like this: > http://www.ebi.ac.uk/pride/ ? There isn't a central repository - you can download the PIMS server from our site and create your own. I'm aiming to eventually allow automated export from PIMS into pride (I think we already collect all of the necessary information in a suitable form), but that's still on the to-do list. Our schema is included as part of the PIMS server download. TTFN Simon. -- Simon Andrews PhD Bioinformatics Dept. The Babraham Institute sim...@bb... +44 (0) 1223 496463 |
From: simon a. (B. <sim...@bb...> - 2005-12-20 15:02:02
|
Many thanks to those who responded to my post here last week about our mzData viewer. I've now received a number of extra example files from different sources which have enabled much more rigorous testing of our tool.

I've put out an updated version of the viewer (v0.4) which addresses many of the issues people reported previously. Anyone who had problems with the last version should give this one a try as it's much more likely to work now.

You can get the viewer from http://www.bioinformatics.bbsrc.ac.uk/projects/mzviewer/

Cheers

Simon.

--
Simon Andrews PhD
Bioinformatics Dept.
The Babraham Institute
sim...@bb...
+44 (0) 1223 496463 |