From: Eric D. <ede...@sy...> - 2007-10-08 23:37:06
Regarding RDF, I would like to suggest that this is not an option at this time. RDF was in fact suggested at the DC meeting, and it was concluded that it is such a departure from current formats that we cannot support it at this time. We do not have the resources to pull it off.

Having said that, I would summarize RDF as the antithesis of everything you want out of mzML. RDF can be oversimplified (by me) as essentially a listing of facts of the form:

Subject predicate object

(roughly: noun verb noun), wherein each noun and verb is carefully defined in an ontology (not just a controlled vocabulary) such that true meaning can be inferred from unstructured data. So, in pseudo-RDF, our documents would go like this:

Eric has_produced this_mzML_document
Eric is_a contact
Eric has_full_name Eric Deutsch
Eric has_email_address ede...@fu...o
this_mzML_document is_a mzML_document
this_mzML_document contains_a_run run1
Spectrum1 was_generated_in_run run1
Spectrum1 has_type precursor_ion_scan

The structure is that there is no structure. You are free to list every fact that is relevant in any order. However, each noun and verb must be defined in the context of an ontology (or probably multiple ontologies).

The beauty is that no one ever needs to argue about xsd schemas or two different formats for the same thing any more. Wheee!

The, um, downside is that your software, to deal (effectively) with it, needs to be 10x more brilliant than the best piece of code you've written so far.

Cheers,
Eric

________________________________
From: psi...@li... [mailto:psi...@li...] On Behalf Of Brian Pratt
Sent: Monday, October 08, 2007 4:14 PM
To: psi...@li...
Subject: Re: [Psidev-ms-dev] more is_a vs. part_of errors?

Hi Angel,

This may be a bit esoteric, but I wanted to ask what advantage RDF might have over the older W3C XML Schema (.xsd).
I'm unfamiliar with RDF, and from my 20 minutes of googling it appears rather more complex than .xsd - certainly more complex than it would need to be to handle the kinds of things mzData and mzXML do today, but I'm sure I'm flaunting my ignorance.

I see that there are (but don't completely understand the nature of) relationships between RDF, OWL, OBO, and CV. Presumably you see some means of exploiting these relationships? I have a lot to learn if we go this route, but it sounds interesting. At least we'd get to say "semantic web" a lot, which sounds cool.

>> I believe there is an OBO-to-RDF Perl tool someplace.

Maybe this (Java, I think): http://www.cs.utexas.edu/~hamid/research/obo2owl.cgi

Thanks,

Brian

________________________________
From: psi...@li... [mailto:psi...@li...] On Behalf Of Angel Pizarro
Sent: Saturday, October 06, 2007 5:17 PM
To: Mass spectrometry standard development
Subject: Re: [Psidev-ms-dev] more is_a vs. part_of errors?

I wouldn't spend too much time trying to parse OBO files into XML schema. The format grew out of a need for a quick-and-dirty CV with some ontology structure editing, and there is really only one library/editor that works with it, namely the tools written by the authors of the OBO format itself.

As a side note, and this is completely my own opinion: if mzML were to use RDF Schema for the schema and RDF for the CV, validation and everything else would fall into place. I believe there is an OBO-to-RDF Perl tool someplace.

- angel

On 10/6/07, Matt Chambers <mat...@va...> wrote:

Good catches in the CV. Who is in charge of maintaining it and are they reading this list? :)

I agree with auto-generating an XML schema with full semantic relationships encoded in it, direct from the CV, but you haven't addressed the issue I mentioned earlier. Doing the auto-generation into CV params (if we choose method A) will be very ugly, but it will allow for synonyms on the category names and value names.
To implement the cvParam categories as XML elements, though, you lose the ability to have synonyms for category names (unless you use the accession number of the category as the element name, which makes me shudder), but the final schema would look a lot nicer.

-Matt

Brian Pratt wrote:
>
> There are a handful of other cases where it appears that the authors
> have gotten "is a" and "part_of" confused. My proposed corrections (IN
> CAPS) inline:
>
> MS:1000025 "magnetic field strength"
>   part of MS:1000480 "analyzer attribute"
>   is a (PART_OF) MS:1000451 "analyzer description"
>   part of MS:1000463 "instrument description"
>   part of MS:0000000 "MZ controlled vocabularies"
>
> MS:1000024 "final MS exponent"
>   part of MS:1000480 "analyzer attribute"
>   is a (PART_OF) MS:1000451 "analyzer description"
>   part of MS:1000463 "instrument description"
>   part of MS:0000000 "MZ controlled vocabularies"
>
> MS:1000022 "TOF Total Path Length"
>   part of MS:1000480 "analyzer attribute"
>   is a (PART_OF) MS:1000451 "analyzer description"
>   part of MS:1000463 "instrument description"
>   part of MS:0000000 "MZ controlled vocabularies"
>
> MS:1000014 "accuracy"
>   part of MS:1000480 "analyzer attribute"
>   is a (PART_OF) MS:1000451 "analyzer description"
>   part of MS:1000463 "instrument description"
>   part of MS:0000000 "MZ controlled vocabularies"
>
> MS:1000106 "on"
>   is a MS:1000021 "reflectron state"
>   part of MS:1000480 "analyzer attribute"
>   is a (PART_OF) MS:1000451 "analyzer description"
>   part of MS:1000463 "instrument description"
>   part of MS:0000000 "MZ controlled vocabularies"
>
> MS:1000105 "off"
>   is a MS:1000021 "reflectron state"
>   part of MS:1000480 "analyzer attribute"
>   is a (PART_OF) MS:1000451 "analyzer description"
>   part of MS:1000463 "instrument description"
>   part of MS:0000000 "MZ controlled vocabularies"
>
> The following changes would make the Thermo and ABI stuff look like
> all the
> other vendors:
>
> MS:1000495 "Applied Biosystems"
>   part of (IS_A) MS:1000121 "ABI / SCIEX"
>   is a MS:1000031 "model by vendor"
>   part of MS:1000463 "instrument description"
>   part of MS:0000000 "MZ controlled vocabularies"
>
> MS:1000176 "MAT95XP Trap"
>   is a (IS_A) MS:1000493 "Finnigan MAT"
>   part of MS:1000483 "Thermo Fisher Scientific"
>   is a MS:1000031 "model by vendor"
>   part of MS:1000463 "instrument description"
>   part of MS:0000000 "MZ controlled vocabularies"
>
> MS:1000175 "MAT95XP"
>   is a MS:1000493 "Finnigan MAT"
>   part of (IS_A) MS:1000483 "Thermo Fisher Scientific"
>   is a MS:1000031 "model by vendor"
>   part of MS:1000463 "instrument description"
>   part of MS:0000000 "MZ controlled vocabularies"
>
> MS:1000174 "MAT900XP Trap"
>   is a MS:1000493 "Finnigan MAT"
>   part of (IS_A) MS:1000483 "Thermo Fisher Scientific"
>   is a MS:1000031 "model by vendor"
>   part of MS:1000463 "instrument description"
>   part of MS:0000000 "MZ controlled vocabularies"
>
> MS:1000173 "MAT900XP"
>   is a MS:1000493 "Finnigan MAT"
>   part of (IS_A) MS:1000483 "Thermo Fisher Scientific"
>   is a MS:1000031 "model by vendor"
>   part of MS:1000463 "instrument description"
>   part of MS:0000000 "MZ controlled vocabularies"
>
> MS:1000172 "MAT253"
>   is a MS:1000493 "Finnigan MAT"
>   part of (IS_A) MS:1000483 "Thermo Fisher Scientific"
>   is a MS:1000031 "model by vendor"
>   part of MS:1000463 "instrument description"
>   part of MS:0000000 "MZ controlled vocabularies"
>
> I still think there's a schema in there, albeit jammed in slightly
> sideways at the moment.
>
> - Brian
>

_______________________________________________
Psidev-ms-dev mailing list
Psi...@li...
https://lists.sourceforge.net/lists/listinfo/psidev-ms-dev

--
Angel Pizarro
Director, Bioinformatics Facility
Institute for Translational Medicine and Therapeutics
University of Pennsylvania
806 BRB II/III
421 Curie Blvd.
Philadelphia, PA 19104-6160
P: 215-573-3736
F: 215-573-9004
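
For readers who want to see Eric's pseudo-RDF listing in concrete terms, here is a minimal Python sketch. It is not real RDF and uses no RDF library: the Triple type and the match and spectra_in_document helpers are hypothetical names made up for illustration, and the facts are copied from the listing in the thread (the elided email address is left out). It only tries to show the point that the facts can come in any order and the software has to reassemble the structure itself.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    """One fact: subject, predicate, object (hypothetical helper type)."""
    subject: str
    predicate: str
    obj: str


# Facts taken from the pseudo-RDF listing in the thread.
FACTS = [
    Triple("Eric", "has_produced", "this_mzML_document"),
    Triple("Eric", "is_a", "contact"),
    Triple("Eric", "has_full_name", "Eric Deutsch"),
    Triple("this_mzML_document", "is_a", "mzML_document"),
    Triple("this_mzML_document", "contains_a_run", "run1"),
    Triple("Spectrum1", "was_generated_in_run", "run1"),
    Triple("Spectrum1", "has_type", "precursor_ion_scan"),
]


def match(facts: List[Triple],
          subject: Optional[str] = None,
          predicate: Optional[str] = None,
          obj: Optional[str] = None) -> List[Triple]:
    """Return every triple that agrees with the fields that are not None."""
    return [
        t for t in facts
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


def spectra_in_document(facts: List[Triple], document: str) -> List[str]:
    """Join two kinds of fact: document -> runs, then runs -> spectra."""
    runs = {t.obj for t in match(facts, subject=document,
                                 predicate="contains_a_run")}
    spectra = {t.subject
               for t in match(facts, predicate="was_generated_in_run")
               if t.obj in runs}
    return sorted(spectra)


if __name__ == "__main__":
    print(spectra_in_document(FACTS, "this_mzML_document"))
    # The order of the facts does not matter.
    print(spectra_in_document(list(reversed(FACTS)), "this_mzML_document"))
```

Even this toy version needs a join across two predicates to answer "which spectra are in this document", something a nested XML document would give you by position. That is roughly the cost Eric describes.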
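
Brian's proposed is_a / part_of corrections can be sketched the same way. The snippet below assumes that each listing in his message reads as a path from the term up to the root, with every relation line linking the entry above it to the entry named on that line; that reading is an assumption, not something the thread states. The retype and is_a_ancestors functions are hypothetical helpers. The sketch follows only is_a edges, to show what changes for MAT253 when "Finnigan MAT part of Thermo Fisher Scientific" becomes is_a.

```python
# Edges are (child, relation, parent). Treating the listing as a path to
# the root is an assumption made for this sketch.
AS_PUBLISHED = {
    ("MS:1000172", "is_a", "MS:1000493"),     # MAT253 -> Finnigan MAT
    ("MS:1000493", "part_of", "MS:1000483"),  # Finnigan MAT -> Thermo Fisher Scientific
    ("MS:1000483", "is_a", "MS:1000031"),     # Thermo Fisher Scientific -> model by vendor
    ("MS:1000031", "part_of", "MS:1000463"),  # model by vendor -> instrument description
    ("MS:1000463", "part_of", "MS:0000000"),  # instrument description -> MZ controlled vocabularies
}


def retype(edges, child, parent, relation):
    """Return a copy of edges with the child -> parent edge given a new relation."""
    return {
        (c, relation if (c, p) == (child, parent) else r, p)
        for c, r, p in edges
    }


def is_a_ancestors(edges, term):
    """Collect every term reachable from `term` by following is_a edges only."""
    found = set()
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for child, relation, parent in edges:
            if child == current and relation == "is_a" and parent not in found:
                found.add(parent)
                frontier.append(parent)
    return found


# Brian's proposal: part of (IS_A) MS:1000483 "Thermo Fisher Scientific".
PROPOSED = retype(AS_PUBLISHED, "MS:1000493", "MS:1000483", "is_a")

if __name__ == "__main__":
    print(sorted(is_a_ancestors(AS_PUBLISHED, "MS:1000172")))
    print(sorted(is_a_ancestors(PROPOSED, "MS:1000172")))
```

Under these assumptions, the as-published edges stop the is_a chain at "Finnigan MAT", while the proposed edges let MAT253 reach "model by vendor" through is_a alone. That appears to be the practical point of making the Thermo entries look like the other vendors.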