From: Nicolas Le n. <le...@eb...> - 2011-01-17 14:44:45
Hi Neil,

But if the purpose is to cope with a given tool that does not support the whole SBML specification (controlled RDF was introduced in L2V2 in 2005), is it not a case for a proprietary annotation rather than bloating the common annotation scheme? If COBRA does not support SBML controlled annotations and MIRIAM URIs, presumably it does not support the bq qualifiers either. So adding the qualifiers would not change much. What you need is rather an annotation in the COBRA namespace that replaces the notes of your example. Alternatively, we could extend the scheme we used for InChIs.

Now, that led me to a reflection that goes a bit deeper than that, and I think it should be moved to sbml-discuss at some point. Several times in the past, people have wanted to add metadata to SBML files, rather than pointers to metadata. This is the case with your example. FORMULA or CHARGE are not fundamentally different from SEQUENCE or STRUCTURE. I wonder if the time has not come to design another metadata framework, parallel to the RDF one, whose purpose would be to store the information together with the model. If we do that, we probably do not want to use elements such as <formula>, <charge>, <sequence> etc. Instead, we would like the elements to be identifiers from MIRIAM Resources (NOT MIRIAM URNs, but the identifiers of a data-type in MIRIAM Resources), the PATO quality branch, the future BioDbCore etc. The spec would not list the elements, but the accepted sources for the elements. This is food for another L3 package.

On 17/01/11 14:19, Neil Swainston wrote:
> Hi Nick (and Nicolas - this covers your recent mail re: InChIs, too),
>
> You're right with this example - we do actually have a ChEBI identifier.
> Where possible, we attempt to add "unknown" metabolites to ChEBI, who
> curate and publish them super-quickly. (We have no problems with the speed
> of response of the ChEBI guys).
>
> The problem is a pragmatic one.
> These large models are analysed by software
> that lags behind what we're doing with annotations. An example is the COBRA
> Toolbox, which - like it or not - has become a well used tool for analysing
> genome-scale metabolic models. This has been written in Matlab by people
> who have no interest whatsoever in annotations. Their algorithms require
> access to formulae and charges and such like, and the writers of the
> software certainly wouldn't go to the effort of querying MIRIAM and ChEBI
> webservices in order to pick these up. So, if we relied on annotations
> alone, they simply wouldn't use the model, as it wouldn't (in their eyes)
> contain sufficient meta-data to allow them to run their analyses over.
>
> Although I can see that duplicating this info by specifying both a ChEBI
> id, and adding terms like formulae to the notes is a bit rubbish from a
> clean-and-proper sense, I think that we have to be realistic. Which is why
> (in this case) I support both crappy notes and semantic annotations.
>
> I believe, however, that there may be a middle-ground that would allow
> properties to be specified more formally without expecting developers of
> software packages to rely on querying web services to get these properties.
> Even if that is the better way of doing things, and what we're doing now is
> effectively warehousing some of the data from ChEBI. It's just a question
> of being realistic with where we are.
>
> Cheers,
>
> Neil.
>
> Neil Swainston
> Experimental Officer
>
> Manchester Centre for Integrative Systems Biology
> Manchester Interdisciplinary Biocentre
> University of Manchester
> Manchester M1 7DN
> United Kingdom
>
> On 17 Jan 2011, at 13:52, Nick Juty wrote:
>
>> Hi Neil,
>>
>> Couple of questions - I'm trying to figure out what you use this
>> information for...
>>
>> The example you have below contains much of the info that is held in a
>> ChEBI entry:
>> http://www.ebi.ac.uk/chebi/searchId.do?chebiId=CHEBI:18090
>>
>> Are you, in effect, using the model file to store information in lieu of
>> that information being accessible through a MIRIAM identifier? Also, when
>> you find such entities (which have no associable database reference), do
>> you submit them to ChEBI for curation/incorporation? And if so, how long
>> do they take to appear, and do you subsequently remove them from your
>> files? Or alternatively, do you need this level of information for each
>> entity in your model?
>>
>> cheers
>>
>> Nick
>>
>> Neil Swainston wrote:
>>> Hmm... interesting questions.
>>>
>>> I'm not sure that I know the answers, but I'll try.
>>>
>>> The reason that SMILES, InChIs, charges, etc. have come into my
>>> consciousness is that I'm currently wrestling with the human
>>> genome-scale model, in which we have a number of metabolites for which
>>> we *don't* have a KEGG or ChEBI identifier, but do know some properties
>>> of the molecule.
>>>
>>> We currently do this in a Palsson-friendly, COBRA-compatible way of
>>> specifying them in the notes:
>>>
>>> <species metaid="_meta_bamppald_c" id="_bamppald_c"
>>>     name="beta-Aminopropion aldehyde" compartment="c" sboTerm="SBO:0000247">
>>>   <notes>
>>>     <body xmlns="http://www.w3.org/1999/xhtml">
>>>       <p>NEUTRAL_FORMULA: C3H7NO</p>
>>>       <p>FORMULA: C3H8NO</p>
>>>       <p>CHARGE: 1</p>
>>>       <p>INCHI: InChI=1/C3H7NO/c4-2-1-3-5/h3H,1-2,4H2</p>
>>>       <p>ORIGIN: Recon 1</p>
>>>     </body>
>>>   </notes>
>>>
>>> Not ideal, but currently necessary, as we have no other means of
>>> specifying this. (Note that "charge" has been (correctly) dropped as an
>>> attribute of species, as it should be an annotation. An annotation that
>>> we currently can't express).
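To make concrete what a notes-based consumer has to do with the species quoted above, here is a minimal Python sketch that pulls the "KEY: value" paragraphs out of the XHTML notes. It is only an illustration under stated assumptions: the function name `parse_notes` is hypothetical, the snippet is closed with `</species>` so that it parses on its own, and nothing here is taken from the COBRA Toolbox itself.

```python
import xml.etree.ElementTree as ET

# XHTML namespace used by the <body> and <p> elements inside SBML notes.
XHTML_NS = "{http://www.w3.org/1999/xhtml}"

# The species from the thread, closed with </species> so it is well-formed.
SPECIES_XML = """
<species metaid="_meta_bamppald_c" id="_bamppald_c"
         name="beta-Aminopropion aldehyde" compartment="c" sboTerm="SBO:0000247">
  <notes>
    <body xmlns="http://www.w3.org/1999/xhtml">
      <p>NEUTRAL_FORMULA: C3H7NO</p>
      <p>FORMULA: C3H8NO</p>
      <p>CHARGE: 1</p>
      <p>INCHI: InChI=1/C3H7NO/c4-2-1-3-5/h3H,1-2,4H2</p>
      <p>ORIGIN: Recon 1</p>
    </body>
  </notes>
</species>
"""


def parse_notes(species_xml: str) -> dict:
    """Return the 'KEY: value' pairs found in the <p> elements of the notes.

    Hypothetical helper: the keys are whatever free text precedes the first
    colon, which is exactly why this convention is fragile.
    """
    root = ET.fromstring(species_xml.strip())
    properties = {}
    for p in root.iter(XHTML_NS + "p"):
        text = (p.text or "").strip()
        key, sep, value = text.partition(":")
        if sep:  # skip paragraphs that are not of the form 'KEY: value'
            properties[key.strip()] = value.strip()
    return properties


props = parse_notes(SPECIES_XML)
charge = int(props["CHARGE"])  # the consumer has to know CHARGE is an integer
print(props["FORMULA"], charge)
```

The type of each value (integer charge, formula string) lives only in the consumer's code, not in the file, which is the gap the rest of the thread is trying to close.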
>>> This problem has also been encountered by Brett and Frank, and they
>>> propose a solution in their Flux package extension:
>>> http://precedings.nature.com/documents/4236/version/1/files/npre20104236-1.pdf
>>>
>>> To me, at least, it would be bad to have numerous ways of specifying a
>>> concept such as charge or formula in a number of package extensions.
>>>
>>> Also, for me, specifying the concept of SMILES in MIRIAM would not be
>>> the way to go. MIRIAM does an excellent job of collating and specifying
>>> identifiers, not properties. I would have thought that SMILES (or InChI
>>> or formula) would be a free text field, independent of MIRIAM
>>> identifiers, but with the semantic information that this free text
>>> actually represents a SMILES string (or whatever) being held in the
>>> predicate. To specify SMILES as a data type would mean that MIRIAM would
>>> have to enumerate all possible SMILES strings. It would make even less
>>> sense to enumerate all possible charges.
>>>
>>> Specifying...
>>>
>>> SPECIES has DESCRIPTION whose value is C6H6O6
>>> SPECIES has DESCRIPTION whose value is -1
>>>
>>> ...is ambiguous. What kind of description?
>>>
>>> Going back to what I wrote earlier about schema classes...
>>>
>>> http://www.w3.org/TR/rdf-primer/#schemaclasses
>>>
>>> ...this would give us the mechanism to describe the concept of a charge,
>>> and constrain it to be an integer, for example.
>>>
>>> By the way, I consider all this to be next generation stuff, so it
>>> shouldn't affect the first Annot package proposal, which I hope to send
>>> out to you all later in the week for more fireworks.
>>>
>>> Cheers,
>>>
>>> Neil.
>>>
>>> Neil Swainston
>>> Experimental Officer
>>>
>>> Manchester Centre for Integrative Systems Biology
>>> Manchester Interdisciplinary Biocentre
>>> University of Manchester
>>> Manchester M1 7DN
>>> United Kingdom
>>>
>>> On 17 Jan 2011, at 11:44, Nicolas Le novère wrote:
>>>> Indeed, we are adding qualifiers on a regular basis.
>>>> Latest examples are isDerivedFrom, hasProperty and isPropertyOf.
>>>>
>>>> I am worried about encoding particular qualities in qualifiers rather
>>>> than generic relationships though. A chemical isDescribedBy a SMILES
>>>> string. As Nick pointed out, the SMILES string itself would be a
>>>> datatype if there was a namespace associated. Alternatively, the SMILES
>>>> string can be stored in the SBML file with a mechanism like the one
>>>> proposed for InChIs. If we create a qualifier SMILES, then we basically
>>>> need to create a qualifier for each entry of PATO as well.
>>>>
>>>> Regarding modification, I think we could think about it. Let's try to
>>>> encode them with the extended package first. Since we reify, would it
>>>> be possible to create something like species X is entity Y modified by Z?
>>>>
>>>> On 17/01/11 11:32, Nick Juty wrote:
>>>>> Hi Neil,
>>>>>
>>>>> For the initial question, I would say we are more than happy to add
>>>>> qualifiers upon request. That's not a problem at all.
>>>>> I do think, however, that we need to think carefully about what is an
>>>>> appropriate qualifier. They should be providing information on
>>>>> semantics / relationships. Specifically, I would have thought that
>>>>> SMILES would be more like a datatype, and so the subject of a MIRIAM
>>>>> Resources entry. Bear in mind also that we plan to have a second branch
>>>>> in MIRIAM Resources to handle those datatypes that are currently not
>>>>> able to be added for whatever reason. For modification, that sounds
>>>>> more like an SBO concept...
>>>>> If I am getting the wrong end of the stick though, please let me know.
>>>>>
>>>>> cheers
>>>>>
>>>>> Nick
>>>>>
>>>>> Neil Swainston wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> While we're at it, what are people's thoughts on adding new qualifiers?
>>>>>>
>>>>>> For example, I and others would like to add concepts like formula,
>>>>>> charge, SMILES, etc. and apply them to metabolites. "Modification" is
>>>>>> another that occasionally arises for proteins.
>>>>>>
>>>>>> Thoughts?
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Neil.
>>>>>>
>>>>>> On 14 Jan 2011, at 23:00, Nicolas Le novère <le...@eb...> wrote:
>>>>>>
>>>>>>> On 14/01/11 14:25, Neil Swainston wrote:
>>>>>>>
>>>>>>>> I think the goal is to recommend that the core annotations and "new"
>>>>>>>> Annot annotations are used mutually exclusively. So, the core will use
>>>>>>>> the old qualifiers and the Annot the new ones. In both cases, the
>>>>>>>> qualifiers will be perennial.
>>>>>>>
>>>>>>> Neil is correct. There won't be two schemes but only one scheme.
>>>>>>> "bqbiol:hasPart" and "bqbiol:part" are not two qualifiers. They are two
>>>>>>> forms of the same qualifier. The definition on biomodels.net will just
>>>>>>> be amended as follows:
>>>>>>>
>>>>>>> hasPart, part
>>>>>>>
>>>>>>> The biological entity represented by the model component includes the
>>>>>>> subject of the referenced resource, either physically or logically. This
>>>>>>> relation might be used to link a complex to the description of its
>>>>>>> components.
>>>>>>>
>>>>>>>> I'm looking at the Annot specification now (sorry for the delay...) and
>>>>>>>> in this I think that it would make sense to include a mapping from the
>>>>>>>> core qualifiers to the new Annot ones (essentially, the finalised list
>>>>>>>> that was produced by Nick J recently).
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>>
>>>>>>>> Neil.
>>>>>>>>
>>>>>>>> Neil Swainston
>>>>>>>> Experimental Officer
>>>>>>>>
>>>>>>>> Manchester Centre for Integrative Systems Biology
>>>>>>>> Manchester Interdisciplinary Biocentre
>>>>>>>> University of Manchester
>>>>>>>> Manchester M1 7DN
>>>>>>>> United Kingdom
>>>>>>>>
>>>>>>>> On 14 Jan 2011, at 00:26, Mike Hucka wrote:
>>>>>>>>
>>>>>>>>> Maybe I missed it in the discussions so far, but is some versioning
>>>>>>>>> going to be added, so that when reading a given model, a consumer can
>>>>>>>>> figure out which qualifier scheme is being used?
>>>>>>>>> Or do you think it
>>>>>>>>> won't be required, because the new terms will map one-to-one to the
>>>>>>>>> old ones, and all a consumer would need to know is both terms for a
>>>>>>>>> given relationship?
>>>>>>>>>
>>>>>>>>> MH
>>>>>>>>>
>>>>>>>>> ------------------------------------------------------------------------------
>>>>>>>>> Protect Your Site and Customers from Malware Attacks
>>>>>>>>> Learn about various malware tactics and how to avoid them. Understand
>>>>>>>>> malware threats, the impact they can have on your business, and how
>>>>>>>>> you can protect your company and customers by using code signing.
>>>>>>>>> http://p.sf.net/sfu/oracle-sfdevnl
>>>>>>>>> _______________________________________________
>>>>>>>>> Sbml-annot mailing list
>>>>>>>>> Sbm...@li...
>>>>>>>>> https://lists.sourceforge.net/lists/listinfo/sbml-annot
>>>>>>>
>>>>>>> --
>>>>>>> Nicolas LE NOVERE, Computational Systems Neurobiology, EMBL-EBI, WTGC,
>>>>>>> Hinxton CB101SD UK, Mob:+447833147074, Tel:+441223494521 Fax:468,
>>>>>>> Skype:n.lenovere, AIM:nlenovere, twitter:@lenovere
>>>>>>> http://www.ebi.ac.uk/~lenov/, http://www.ebi.ac.uk/compneur/
>>>>
>>>> _______________________________________________
>>>> BioModels.net Discussion Mailing List
>>>> Bio...@li...
>>>> Setting: https://lists.sourceforge.net/lists/listinfo/biomodels-net-discuss
>>>> Website: http://www.biomodels.net
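Neil's point in the thread, that the predicate should say what kind of value is being stated and that the value itself (a charge, say) should be constrained to a type such as an integer, can be illustrated with a small Python sketch. This is not RDF Schema and not part of any SBML package: the predicate names `hasCharge` and `hasFormula`, the helper `statement`, and the deliberately simple formula pattern are all assumptions made up for the illustration.

```python
import re
from typing import Callable, Dict, Tuple


def parse_formula(text: str) -> str:
    """Accept a simple sum formula such as C3H8NO; reject anything else.

    The pattern (element symbol followed by an optional count) is a
    simplification chosen for this sketch, not a full formula grammar.
    """
    if not re.fullmatch(r"(?:[A-Z][a-z]?\d*)+", text):
        raise ValueError(f"not a chemical formula: {text!r}")
    return text


# Hypothetical predicates, each mapped to the parser that constrains its value.
PREDICATES: Dict[str, Callable[[str], object]] = {
    "hasCharge": int,          # a charge must parse as an integer
    "hasFormula": parse_formula,
}


def statement(subject: str, predicate: str, raw_value: str) -> Tuple[str, str, object]:
    """Build a (subject, predicate, value) triple with a type-checked value."""
    parser = PREDICATES[predicate]  # KeyError for an unknown predicate
    return (subject, predicate, parser(raw_value))


print(statement("_bamppald_c", "hasFormula", "C3H8NO"))
print(statement("_bamppald_c", "hasCharge", "1"))
```

With specific predicates, the ambiguous pair "SPECIES has DESCRIPTION whose value is C6H6O6" and "SPECIES has DESCRIPTION whose value is -1" becomes two distinguishable statements, and a formula supplied where a charge is expected is rejected with a `ValueError` rather than silently stored.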