From: Pierre-Alain B. <pie...@is...> - 2008-07-02 09:20:27
Thanks David.

A couple of questions, just to make sure:

1) In the case of a top-down approach, do we have to duplicate the sequenceCollection information? Since SpectrumIdentificationResult contains a PeptideEvidence referring to a Peptide element (and not to a DBSequence), is an identification necessarily a Peptide?

2) What about spectral library searches? Do we need Peptide elements, possibly with undefined explicit sequences, for the SpectrumIdentificationResult to refer to (because the match is non-peptidic, or because the spectrum is good but not identified)?

3) In the Peptide element, the Modifications are defined in much more detail than in ModificationParams (PSI-MOD is there, for instance). Does this simply mean that ModificationParams encodes the search engine settings and the Peptide includes the formal PSI definition of the mod? And is the ModName value the only link between the two?

4) None of the mass values (sequenceMass, calculatedMassToCharge, experimentalMassToCharge) is specified as monoisotopic or average. Do we assume that average masses no longer exist?

5) Is sequenceMass the mass value with or without the mods? If with, the name might be misleading (peptideMass would be more appropriate).

6) If the DBSequence is a nucleotide sequence, is there a tag for saying so? (NB: MS can be performed and analysed on nucleotide molecules themselves, not only on AA sequences that are translated from nucleotide sequences.) Or do we neglect MS experiments done on nucleotide molecules (and, by the way, on glycans...) and only represent the DBSequences as AA sequences (frame translations)? (And what about glycans?) This could probably be solved if SequenceCollection could be replaced by something else when needed (SmallMoleculeCollection, GlycanCollection, MoleculeCollection)... but the validator might not like this.

7) If the DBSequence is a nucleotide sequence, do we represent the Peptide as an AA sequence when MS is done on proteins?
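To make questions 4 and 5 concrete, here is a minimal Python sketch of how the same peptide gives different values depending on whether the mass is monoisotopic or average, and whether modifications are included. The function names (sequence_mass, peptide_mass, mass_to_charge) are hypothetical and only borrow the wording of the schema attributes; nothing here is taken from the schema itself. The residue table covers only the residues of the illustrative sequence "MCTMG".

```python
# Residue masses in Da as (monoisotopic, average) pairs. Only the residues
# needed for the illustrative sequence are listed; a real table would
# cover all amino acids.
RESIDUE_MASS = {
    "G": (57.02146, 57.0519),
    "T": (101.04768, 101.1051),
    "C": (103.00919, 103.1388),
    "M": (131.04049, 131.1926),
}
WATER = (18.01056, 18.01528)      # H2O added for the free termini
OXIDATION = (15.994915, 15.9994)  # example modification delta (one oxygen)
PROTON = 1.007276                 # mass of a proton in Da


def sequence_mass(seq, monoisotopic=True):
    """Mass of the unmodified sequence (residues plus one water)."""
    i = 0 if monoisotopic else 1
    return sum(RESIDUE_MASS[aa][i] for aa in seq) + WATER[i]


def peptide_mass(seq, mod_deltas=(), monoisotopic=True):
    """Mass of the sequence including the given modification deltas."""
    i = 0 if monoisotopic else 1
    return sequence_mass(seq, monoisotopic) + sum(d[i] for d in mod_deltas)


def mass_to_charge(mass, charge):
    """m/z of a neutral mass carrying `charge` protons."""
    return (mass + charge * PROTON) / charge


if __name__ == "__main__":
    seq = "MCTMG"
    print("mono, no mods :", sequence_mass(seq))
    print("avg,  no mods :", sequence_mass(seq, monoisotopic=False))
    print("mono, 1 ox    :", peptide_mass(seq, [OXIDATION]))
    print("m/z at 2+     :", mass_to_charge(peptide_mass(seq, [OXIDATION]), 2))
```

The four printed values all differ, which is why a reader of the file needs to be told which convention a given attribute follows.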
That's all for the sequence representation so far.

Cheers,
Pierre-Alain

David Creasy wrote:
> Thanks Andy,
>
> I've added an updated example document to SVN:
> http://code.google.com/p/psi-pi/source/browse/trunk/examples/schema_usecase_examples/working27June/F001350.xml
>
> Problem is that we have now removed the main point of these recent
> changes which was to add the decoy flag... I think that we need to add
> isDecoy to SpectrumIdentificationItem.
>
> And yes, I suspect that we should go back to using the
> ConceptualMoleculeCollection
> Um, and since we've not actually ended up adding anything to
> DBSequence... we haven't actually achieved anything?
> I think we need to discuss this again at the next telecon.
>
> David
>
> Jones, Andy wrote:
>>
>> Hi all,
>>
>> I’ve updated the schema in SVN with the following main changes:
>>
>> - PeptideEvidence is now part of SpectrumIdentificationItem
>> as discussed on the call (simple mappings to proteins are done at
>> this level)
>>
>> - Added DBSequence that should be used instead of Sequence
>> (following some of the discussion below)
>>
>> - Created a new collection class SequenceCollection (rather
>> than ConceptualMoleculeCollection) so that only references can be
>> given to DBSequence and Peptide
>>
>> o In fact, I’m not sure if this is sensible since it prevents other
>> types of ConceptualMolecule being added later... to discuss
>>
>> - In FuGE on cvParam, the value attribute is no longer mandatory
>>
>> I’ve added a simple example that validates under
>> examples\schema_usecase_examples\working27June
>>
>> Feel free to mail me any changes to make on Monday,
>>
>> Cheers
>>
>> Andy
>>
>> *From:* psi...@li...
>> [mailto:psi...@li...] *On Behalf Of
>> *Jones, Andy
>> *Sent:* 27 June 2008 16:24
>> *To:* Angel Pizarro
>> *Cc:* psi...@li...
>> *Subject:* Re: [Psidev-pi-dev] FW: Representing Sequences
>>
>> I think Angel’s response below might not have made it round the list yet.
>>
>> I tend to agree that isDecoy is redundant information and perhaps
>> this is not the best place to encode semantic information. An
>> alternative would be to have a parameter, say on
>> SpectrumIdentification for cvParam = “decoy_string” value = “Rev”.
>> This would be a more compact representation and we would not have to
>> add what is quite a specific attribute type (isDecoy) to Sequence.
>>
>> *From:* an...@it... [mailto:an...@it...] *On
>> Behalf Of *Angel Pizarro
>> *Sent:* 27 June 2008 15:59
>> *To:* Jones, Andy
>> *Cc:* psi...@li...
>> *Subject:* Re: [Psidev-pi-dev] FW: Representing Sequences
>>
>> my 2¢ :
>> You need to be able to extend this to all molecule types, or am I
>> missing the point of this thread, and you mean that this would be a
>> subclass of the conceptual molecule element?
>>
>> Second, and this is tangentially related, but are decoy sequences
>> really a problem we should be putting our effort into? Is it in our
>> domain to encode semantic information about a sequence, and possibly
>> relating reported sequences as part of our schema?
>> On a personal level I could care less if "isDecoy" is an attribute or
>> not, but the temptation then would be for folks to encode the same
>> accession for two different sequences, effectively making the primary
>> key of the sequence object (accession, isDecoy)
>>
>> Do we want to go there?
>>
>> On Fri, Jun 27, 2008 at 10:21 AM, Jones, Andy
>> <And...@li... <mailto:And...@li...>>
>> wrote:
>>
>> So how about including length as an attribute and then letting all
>> other things go in the CV (pI, mass, etc.)?
>>
>> *From:* Jones, Andy
>> *Sent:* 27 June 2008 14:54
>> *To:* 'David Creasy'
>> *Subject:* RE: [Psidev-pi-dev] Representing Sequences
>>
>> id and name are standard for all elements that inherit from FuGE
>> identifiable – this is perhaps a separate discussion as to whether
>> the optional name attribute should be there.
>>
>> I agree that length may be useful – is this just an integer value
>> with no unit?
>>
>> Yes, I think so.
>>
>> I'm less sure about pI and mass since mass at least can be calculated
>> very simply
>>
>> Only if you have the sequence... (we have residue masses in the file).
>>
>> , and pI values (in my opinion) are pretty inaccurate and fairly
>> meaningless
>>
>> Scandalous! (I happen to agree, but now some people will never speak
>> to either of us ever again).
>>
>> The main problem with mass and pI is that these are 'irrelevant' if
>> the sequence is nucleic acid rather than residues.
>> Why not just allow CV there? We can share the same CV as the PEFF
>> format, which includes taxonomy, sequence type, gene ID, and lots of
>> wonderful other things?
>>
>> – unless someone can convince me otherwise?
>>
>> Cheers
>>
>> Andy
>>
>> *From:* David Creasy [mailto:dc...@ma...]
>> *Sent:* 27 June 2008 14:51
>> *To:* Jones, Andy
>> *Cc:* psi...@li...
>> <mailto:psi...@li...>
>> *Subject:* Re: [Psidev-pi-dev] Representing Sequences
>>
>> Hi Andy,
>>
>> length may be useful, because some people won't want to output the
>> actual sequence for space reasons. The other things we wanted to add
>> before were pI and mass.
>> Why do we want name? Is this for, say, a description line?
>> (Also, identifier -> id?)
>>
>> David
>>
>> Jones, Andy wrote:
>>
>> Hi all,
>>
>> It was decided on the call that we would like to flag that Sequences
>> in the ConceptualMoleculeCollection should have a Boolean attribute
>> to capture if they are decoy sequences.
>> At the moment we are using
>> the FuGE:Sequence element. I don't really want to add another
>> attribute to this (it's less problematic cutting down FuGE than
>> adding new things), so I'm wondering if we should define our own
>> Sequence type in AnalysisXML. This would also allow us to choose
>> exactly the relevant attributes. At the moment, Sequence can have all
>> of the following:
>>
>> <pf:Sequence isCircular="true"
>> sequence="String" length="0" isApproximateLength="true"
>> SequenceAnnotationSet_ref="String" start="0" end="0"
>> identifier="String" name="String">
>>
>> Several of these attributes were created to represent concepts that
>> probably will never be required or implemented in AnalysisXML. How
>> about the following:
>>
>> <DBSequence identifier = "" name = "" isDecoy = "true">
>>
>> <seq>MCTMG...</seq>
>>
>> <pf:DatabaseReference Database_ref=""
>> accession="Rev_IPI00013808.1"/>
>>
>> </DBSequence>
>>
>> Are any of the other attributes on Sequence actually required? I'll
>> post a new version of the schema with other changes WRT
>> PeptideEvidence shortly,
>>
>> Cheers
>>
>> Andy
>>
>> ------------------------------------------------------------------------
>>
>> -------------------------------------------------------------------------
>> Check out the new SourceForge.net Marketplace.
>> It's the best place to buy or sell services for
>> just about anything Open Source.
>> http://sourceforge.net/services/buy/index.php
>>
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> Psidev-pi-dev mailing list
>> Psi...@li...
>> <mailto:Psi...@li...>
>> https://lists.sourceforge.net/lists/listinfo/psidev-pi-dev
>>
>> --
>> David Creasy
>> Matrix Science
>> 64 Baker Street
>> London W1U 7GB, UK
>> Tel: +44 (0)20 7486 1050
>> Fax: +44 (0)20 7224 1344
>>
>> dc...@ma... <mailto:dc...@ma...>
>> http://www.matrixscience.com
>>
>> Matrix Science Ltd. is registered in England and Wales
>> Company number 3533898
>>
>> --
>> Angel Pizarro
>> Director, ITMAT Bioinformatics Facility
>> 806 Biological Research Building
>> 421 Curie Blvd.
>> Philadelphia, PA 19104-6160
>> 215-573-3736
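
The two decoy representations discussed in the thread (an isDecoy attribute on DBSequence versus a "decoy_string" parameter matched against the accession) can be compared with a short Python sketch. Everything in it is an assumption made for illustration: the element layout follows Andy's proposed DBSequence, but the pf: namespace prefix is dropped so the fragment parses on its own, the identifiers and the "MCTMG"/"GMTCM" sequences are made up, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment only: one target and one decoy entry, modelled on
# the DBSequence proposal in the thread (namespace prefix omitted).
XML = """
<SequenceCollection>
  <DBSequence identifier="dbseq_1" name="">
    <seq>MCTMG</seq>
    <DatabaseReference Database_ref="db_1" accession="IPI00013808.1"/>
  </DBSequence>
  <DBSequence identifier="dbseq_2" name="" isDecoy="true">
    <seq>GMTCM</seq>
    <DatabaseReference Database_ref="db_1" accession="Rev_IPI00013808.1"/>
  </DBSequence>
</SequenceCollection>
"""


def accession_of(db_sequence):
    """Return the accession from the DatabaseReference child."""
    return db_sequence.find("DatabaseReference").get("accession")


def is_decoy_by_attribute(db_sequence):
    """Option 1: an explicit Boolean attribute on DBSequence."""
    return db_sequence.get("isDecoy", "false") == "true"


def is_decoy_by_prefix(db_sequence, decoy_string="Rev"):
    """Option 2: one search-level decoy_string matched against the accession."""
    return accession_of(db_sequence).startswith(decoy_string)


root = ET.fromstring(XML)
entries = root.findall("DBSequence")

# Both options should agree as long as decoy accessions carry the prefix.
by_attribute = [is_decoy_by_attribute(e) for e in entries]
by_prefix = [is_decoy_by_prefix(e) for e in entries]

# With prefixed accessions the accession alone stays a unique key, which
# avoids the (accession, isDecoy) compound key Angel warns about.
accessions = [accession_of(e) for e in entries]
accessions_unique = len(set(accessions)) == len(accessions)
```

The prefix option only works if no target accession happens to start with the decoy string, which is one reason an explicit flag might still be wanted.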