|
From: Peter Murray-R. <pm...@ca...> - 2015-03-31 21:14:54
|
I am afraid this comes from the W3C specification for XML names, which we have adopted. If you use any character like / or ( ) it will break XML-compliant software wherever the value is used as a name. As you have it, it is a data item, but some tools attempt to validate it. It may be more human-readable, but it is machine-unreadable. See http://www.xml.com/pub/a/2001/07/25/namingparts.html or the XML spec.

These names can also expand into RDF URLs, where punctuation characters would again break them.

Great to have your interest.

P.

--
Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dep. Of Chemistry
University of Cambridge
CB2 1EW, UK
+44-1223-763069 |
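The distinction Peter draws between a data item and a name can be checked with any XML parser. The following is a minimal sketch using Python's standard library; the element names `energy(...)` and `energy_QCISD_T_G3Bas1` are made up for illustration and are not part of CML.

```python
import xml.etree.ElementTree as ET


def is_well_formed(document: str) -> bool:
    """Return True if the XML parser accepts the document."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# Parentheses and slashes inside an attribute VALUE are still well-formed XML ...
as_value = '<scalar dictRef="g:energy(QCISD(T)/G3Bas1)">-40.3559402</scalar>'
# ... but the same string cannot serve as an element NAME (hypothetical element).
as_name = '<energy(QCISD(T)/G3Bas1)>-40.3559402</energy(QCISD(T)/G3Bas1)>'
# The underscore form is a legal XML name (also hypothetical).
as_safe_name = '<energy_QCISD_T_G3Bas1>-40.3559402</energy_QCISD_T_G3Bas1>'

print(is_well_formed(as_value))      # True
print(is_well_formed(as_name))       # False
print(is_well_formed(as_safe_name))  # True
```

This only demonstrates well-formedness; the schema restriction on dictRef values is a separate, stricter check applied by the validator.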
|
From: Oliver S. <oli...@mu...> - 2015-03-31 20:20:37
|
Hi everyone,

I've cross-posted to the cml-discuss and quixote lists as I think I need input from both groups.

I have just started trying to validate my CML/CompChem files using a standalone cml-validator.jar (from Bitbucket), and I get errors like:

<?xml version="1.0" encoding="UTF-8"?>
<report xmlns="http://www.xml-cml.org/report/">
  <final-report>
    <well-formed-test>
      <valid>xml is well formed</valid>
    </well-formed-test>
    <schema-validation-test>
      <error>cvc-pattern-valid: Value 'x:QCISD(T)' is not facet-valid with respect to pattern '[A-Za-z][A-Za-z0-9_]*:[A-Za-z][A-Za-z0-9_\.\-]*' for type 'namespaceRefType'.</error>
    </schema-validation-test>
  </final-report>
</report>

complaining that my dictRef="x:QCISD(T)" contains parentheses. This is a pity because I would really like to use things like:

<scalar dataType="xsd:double" dictRef="cc:Energy(0K)" units="nonSi:hartree">-40.422101</scalar>
<scalar dataType="xsd:double" dictRef="cc:Energy(T)" units="nonSi:hartree">-40.419229</scalar>
<scalar dataType="xsd:double" units="nonSi:hartree" dictRef="g:energy(MP2/G3Bas1)">-40.3325515</scalar>
<scalar dataType="xsd:double" units="nonSi:hartree" dictRef="g:energy(QCISD(T)/G3Bas1)">-40.3559402</scalar>
<scalar dataType="xsd:double" units="nonSi:hartree" dictRef="g:energy(G3MP2)">-40.4221009</scalar>

as I think they are more instructive than:

<scalar dataType="xsd:double" dictRef="cc:Energy_0K" units="nonSi:hartree">-40.422101</scalar>
<scalar dataType="xsd:double" dictRef="cc:Energy_T" units="nonSi:hartree">-40.419229</scalar>
<scalar dataType="xsd:double" units="nonSi:hartree" dictRef="g:energy_MP2_G3Bas1">-40.3325515</scalar>
<scalar dataType="xsd:double" units="nonSi:hartree" dictRef="g:energy_QCISD_T_G3Bas1">-40.3559402</scalar>
<scalar dataType="xsd:double" units="nonSi:hartree" dictRef="g:energy_G3MP2_">-40.4221009</scalar>

In particular, I find dictRef="g:energy(QCISD(T)/G3Bas1)" much more readable than dictRef="g:energy_QCISD_T_G3Bas1".

Is there a good reason to restrict the allowed characters of namespaceRefType to '[A-Za-z0-9_\.\-]'?

I'm open to comments and suggestions.

Cheers,
Oliver

--
Oliver Stueker, Dr. rer. nat.
Department of Chemistry, Memorial University |
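The pattern in the validator's error message can be tried out directly. The sketch below copies the pattern from the report above and applies it with Python's `re` module; `fullmatch` is used because XML Schema patterns match the whole value. The `sanitize_dict_ref` helper is hypothetical (it is not part of the validator or of CML) and simply reproduces the underscore forms listed in the message.

```python
import re

# Pattern copied verbatim from the cvc-pattern-valid error message.
NAMESPACE_REF = re.compile(r"[A-Za-z][A-Za-z0-9_]*:[A-Za-z][A-Za-z0-9_\.\-]*")


def is_valid_dict_ref(value: str) -> bool:
    """Return True if the whole value matches the namespaceRefType pattern."""
    return NAMESPACE_REF.fullmatch(value) is not None


def sanitize_dict_ref(value: str) -> str:
    """Hypothetical helper: collapse disallowed characters in the local part to '_'."""
    prefix, _, local = value.partition(":")
    local = re.sub(r"[^A-Za-z0-9_.\-]+", "_", local).strip("_")
    return f"{prefix}:{local}"


for ref in ["x:QCISD(T)", "cc:Energy(0K)", "g:energy(QCISD(T)/G3Bas1)"]:
    cleaned = sanitize_dict_ref(ref)
    print(ref, is_valid_dict_ref(ref), "->", cleaned, is_valid_dict_ref(cleaned))
```

A real mapping would also need to guard against collisions, since distinct readable names can collapse to the same underscore form.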
|
From: Oliver S. <rev...@us...> - 2014-11-18 16:29:28
|
Hi Dan, Peter,

Ah, now it gets interesting...

Yes, we can use CML to describe a lot of different kinds of chemistry, and for each we might want or need to use different kinds of identifiers. But I think that could be done by using the <identifier> tag with appropriate conventions and/or dictionaries.

I'm trying to represent structures from QM calculations, therefore I need structural identifiers. For other applications we could define other conventions or dictionaries.

How about this:

<cml xmlns="http://www.xml-cml.org/schema"
     convention="convention:molecular"
     xmlns:convention="http://www.xml-cml.org/convention/"
     xmlns:structID="http://www.xml-cml.org/dictionary/structualIdentifier/"
     xmlns:compndID="http://www.xml-cml.org/dictionary/compoundIdentifier/"
     xmlns:substanceID="http://www.xml-cml.org/dictionary/substanceIdentifier/">
  <molecule id="aspirin" spinMultiplicity="1" formalCharge="0">
    <identifier dictRef="structID:InChI">InChI=1S/C9H8O4/c1-6(10)13-8-5-3-2-4-7(8)9(11)12/h2-5H,1H3,(H,11,12)</identifier>
    <identifier dictRef="structID:CanonicalSmiles">CC(=O)OC1=CC=CC=C1C(=O)O</identifier>
    <identifier dictRef="compndID:pubChemCompound">CID 2244</identifier>
    <identifier dictRef="substanceID:pubChemSubstance">SID 53788943</identifier>
    <!-- atomArray, bondArray, etc. -->
    <property dictRef="cml:molmass">
      <scalar dataType="xsd:double" units="unit:dalton" xmlns:unit="http://www.xml-cml.org/unit/si/">180.15742</scalar>
    </property>
    <formula concise="C 9 H 8 O 4"/>
  </molecule>
</cml>

The dictionaries with more general applicability could be part of the CML standard and hosted at the CML website, and if necessary other very specific dictionaries could even be located somewhere else (e.g. at NIH).

What do you think?

Best,
Oliver |
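To show how a consumer might read identifiers written in the proposed style, here is a minimal sketch using Python's standard library. It assumes the `<identifier dictRef="...">` layout proposed in this message, which was not an established convention at the time; the dictionary namespace URIs under example.org are placeholders.

```python
import xml.etree.ElementTree as ET

CML_NS = "http://www.xml-cml.org/schema"

# Trimmed version of the proposal; the example.org dictionary URIs are placeholders.
DOCUMENT = """\
<cml xmlns="http://www.xml-cml.org/schema"
     xmlns:structID="http://example.org/dictionary/structuralIdentifier/"
     xmlns:compndID="http://example.org/dictionary/compoundIdentifier/">
  <molecule id="aspirin">
    <identifier dictRef="structID:InChI">InChI=1S/C9H8O4/c1-6(10)13-8-5-3-2-4-7(8)9(11)12/h2-5H,1H3,(H,11,12)</identifier>
    <identifier dictRef="structID:CanonicalSmiles">CC(=O)OC1=CC=CC=C1C(=O)O</identifier>
    <identifier dictRef="compndID:pubChemCompound">CID 2244</identifier>
  </molecule>
</cml>
"""


def identifiers_by_molecule(cml_text: str) -> dict:
    """Map each molecule id to a {dictRef: identifier text} dictionary."""
    root = ET.fromstring(cml_text)
    result = {}
    for molecule in root.iter(f"{{{CML_NS}}}molecule"):
        ids = {}
        # Only direct <identifier> children of the molecule are considered.
        for ident in molecule.findall(f"{{{CML_NS}}}identifier"):
            ids[ident.get("dictRef")] = (ident.text or "").strip()
        result[molecule.get("id")] = ids
    return result


found = identifiers_by_molecule(DOCUMENT)
print(found["aspirin"]["structID:CanonicalSmiles"])
```

Keying on the dictRef string keeps the structure, compound, and substance identifiers apart, which is the distinction Dan asks for in the thread.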
|
From: Peter Murray-R. <pm...@ca...> - 2014-11-17 21:02:35
|
Absolutely right Dan,

We never confuse substance with compound with structure.

P.

On Mon, Nov 17, 2014 at 3:46 PM, Zaharevitz, Daniel (NIH/NCI) [E] <Dan...@ni...> wrote:
> It has been quite a while, but I did use the <identifier> tag for a lot of
> stuff. I would be interested in participating in reviving it. One reason I
> used it was the tag name implied a distinction from structure. I think it
> is very important to keep the distinction between substance or sample
> identifier distinct from structure. Chemical structure is an empirical
> property of a substance hence subject to change as more data become
> available. Making structure serve as a substance identifier is a major
> mistake and thus I would argue that all the uses in the example listed are
> inappropriate uses of the identifier tag. Of course as Peter says any
> internally consistent use should be able to be changed to any other
> internally consistent use without too much difficulty, but I think using
> structure as an identifier can lead to major problems in maintaining
> internal consistency.
>
> DanZ
>
> --
>
> /**********************/
> Daniel Zaharevitz
> Chief, ITB, DTP, DCTD
> National Cancer Institute
> Zah...@ma...
> /**********************/ |
|
From: Peter Murray-R. <pm...@ca...> - 2014-11-17 20:23:32
|
Wonderful.

Egon wants to hand on the baton of CML so I'll put you in touch with others quite shortly. I'd say do whatever seems reasonable and not too complex...

Happy to see extensions in principle - suggest what you want to do...

P. |
|
From: Oliver S. <rev...@us...> - 2014-11-17 20:12:24
|
Thanks Peter,
My goal is to add SMILES and InChI identifiers to the molecule elements in
CML/CompChem documents.
The CompChem convention states that <molecule> elements should conform to
the molecular convention.
The molecular convention lists the allowed child elements, of which
<property>, <label> and, a bit far-fetched, <name> come closest to
describing an identifier.
I could easily just invent something, but I would prefer to follow an already
defined standard or be involved in defining an accepted standard.
In the cmllite-validator-code repo I found this CML/CompChem file [1] which
defines:
<cml>
  <module role="joblist">
    <identifier convention="chemid:EmpiricalFormula" value="CCl2O2"/>
    <identifier convention="chemid:InChI" value="InChI=1/CCl2O2/c2-1(4)5-3"/>
    <identifier convention="chemid:CanonicalSmiles" value="ClOC(=O)Cl"/>
    <identifier convention="chemid:IsomericSmiles" value="C(=O)(Cl)OCl"/>
    <module role="job" title="job1">
      <!-- ... -->
    </module>
  </module>
</cml>
However, on http://www.xml-cml.org/convention/ there is neither a chemid
convention nor an <identifier> element in any convention.
What do you think of adding <identifier> to the allowed elements of
<molecule> in the molecular convention and starting a chemid dictionary?
In fact, I'm currently also working on expanding the CompChem dictionary.
I've (privately) forked the repo on bitbucket.org/cml/dictionary-compchem
and will propose a pull request at a suitable time.
I could do the same for the molecular convention.
Best,
Oliver
[1]
https://bitbucket.org/cml/cmllite-validator-code/src/2eaa18f959bb0324268bf75be8a904d4c9e07944/src/test/resources/org/xmlcml/www/convention/cmlcomp/valid/two-jobs.cml?at=default
|
|
From: Peter Murray-R. <pm...@ca...> - 2014-11-17 19:02:48
|
Copying Egon,

We did have an <identifier> element which allowed this, but you can also use <label> and add a "class" attribute.

The key approach of CML now is that communities should create conventions that work for them. As these become established, conventions can become normalised. Trying to constrain things too rigidly requires a lot of software and consistent discipline.

If different communities end up with slightly different approaches it's not normally hard to convert or merge.

P. |
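As an illustration of the point that differing approaches are usually easy to convert, the sketch below rewrites attribute-style identifiers (`convention` plus `value`, as in the two-jobs.cml test file discussed in this thread) into elements that carry a `dictRef` and text content. The target prefix `structID` and the function name are assumptions made for this example; neither style was a ratified CML convention at the time of the discussion.

```python
import xml.etree.ElementTree as ET

# Attribute-style identifiers, shortened from the test file quoted in this thread.
SOURCE = """\
<cml>
  <module role="joblist">
    <identifier convention="chemid:InChI" value="InChI=1/CCl2O2/c2-1(4)5-3"/>
    <identifier convention="chemid:CanonicalSmiles" value="ClOC(=O)Cl"/>
  </module>
</cml>
"""


def to_dict_ref_style(cml_text: str, prefix: str = "structID") -> str:
    """Rewrite <identifier convention=".." value=".."/> as <identifier dictRef="..">..</identifier>."""
    root = ET.fromstring(cml_text)
    for ident in root.iter("identifier"):
        # Leave elements alone unless both source attributes are present.
        if "convention" not in ident.attrib or "value" not in ident.attrib:
            continue
        local_name = ident.attrib.pop("convention").split(":", 1)[-1]
        ident.text = ident.attrib.pop("value")
        ident.set("dictRef", f"{prefix}:{local_name}")
    return ET.tostring(root, encoding="unicode")


converted = to_dict_ref_style(SOURCE)
print(converted)
```

The conversion is mechanical because both styles carry the same two pieces of information, the identifier type and its value.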
|
From: Oliver S. <rev...@us...> - 2014-11-17 13:43:43
|
Dear CML Community,

Is there an (official) standard or best practice on how to include one or more identifiers (SMILES, InChI, etc.) in a CML document following the molecular convention?

Maybe as <label> or <property> children of the <molecule>?

I couldn't find anything in the convention or the CML dictionaries.

Best,
Oliver

--
Oliver Stueker, Dr. rer. nat.
Postdoctoral Fellow, Poirier Lab
Department of Chemistry, Memorial University
Room C3052 - phone: +1 (709) 864-8752 |
|
From: Egon W. <ego...@gm...> - 2012-07-21 06:50:47
|
On Sat, Jul 21, 2012 at 12:19 AM, Peter Murray-Rust <pm...@ca...> wrote: > We are expanding the group of people looking after CML and as a result are > planning to move the mailing list to a GoogleGroups. Before doing so, does > anyone have any objection? No objections at all. Egon -- Dr E.L. Willighagen Postdoctoral Researcher Department of Bioinformatics - BiGCaT Maastricht University (http://www.bigcat.unimaas.nl/) Homepage: http://egonw.github.com/ LinkedIn: http://se.linkedin.com/in/egonw Blog: http://chem-bla-ics.blogspot.com/ PubList: http://www.citeulike.org/user/egonw/tag/papers |
|
From: Peter Murray-R. <pm...@ca...> - 2012-07-20 22:19:11
|
We are expanding the group of people looking after CML and as a result are planning to move the mailing list to a GoogleGroups. Before doing so, does anyone have any objection? Peter -- Peter Murray-Rust Reader in Molecular Informatics Unilever Centre, Dep. Of Chemistry University of Cambridge CB2 1EW, UK +44-1223-763069 |
|
From: Dmitry P. <dp...@gg...> - 2011-04-10 18:46:03
|
Dear Peter, > 1. AAM numbers in a separate section referenced by > "atomMap". This means the cases where USE_IDS is > not sufficient. > > Please explain in more detail - are these Marvin concepts? No, Marvin does not seem to support AAM in CML at all. I read about atomMap and USE_IDS in the CML 2.4 schema on the xml-cml.org site. xml-cml.org is currently down, so here is a link to another site where this is described: http://www.wipo.int/standards/XMLSchema/HTML/ST96TechnicalSpecification/V0-2/Patent/PatentSchemaTechnicalSpecification_atomMap.html > Reaction centers can be done in CML. There are different approaches to > reactions and I'd be interested in knowing what you wish to support. We would like to support special flags on bonds involved (or not) in the reaction. This concept came to Indigo from the Rxnfile format. I am copy-pasting the constants which describe the types of the reacting centers from Indigo code: RC_NOT_CENTER = -1, RC_UNMARKED = 0, RC_CENTER = 1, RC_UNCHANGED = 2, RC_MADE_OR_BROKEN = 4, RC_ORDER_CHANGED = 8, Combinations (like "made or broken or order changes") are also possible. You can read the original Rxnfile document here: http://www.symyx.com/downloads/public/ctfile/ctfile.pdf (search for RXCTR) > polystyrene is described in > http://pubs.acs.org/doi/abs/10.1021/ci8002123 - I will hope to post this > Openly soon. That would be great, thanks! Best regards, Dmitry |
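The reacting-center constants quoted in the message above can be combined bitwise. The following is only a minimal sketch of how such combined values might be decoded: the constant values are copied from the message, while the helper `describe_reacting_center` and its label strings are hypothetical and are not part of Indigo or of any CML tool.

```python
# Constant values as quoted in the message above (taken there from the Indigo source).
RC_NOT_CENTER = -1
RC_UNMARKED = 0
RC_CENTER = 1
RC_UNCHANGED = 2
RC_MADE_OR_BROKEN = 4
RC_ORDER_CHANGED = 8

# Hypothetical lookup table (an assumption for illustration): the combinable bits.
_FLAG_NAMES = {
    RC_CENTER: "center",
    RC_UNCHANGED: "unchanged",
    RC_MADE_OR_BROKEN: "made or broken",
    RC_ORDER_CHANGED: "order changed",
}


def describe_reacting_center(value):
    """Return the names of the flags set in a reacting-center value."""
    # -1 and 0 are standalone markers, not bit combinations.
    if value == RC_NOT_CENTER:
        return ["not a center"]
    if value == RC_UNMARKED:
        return ["unmarked"]
    return [name for bit, name in _FLAG_NAMES.items() if value & bit]


# "made or broken or order changes", as mentioned in the message.
combined = RC_MADE_OR_BROKEN | RC_ORDER_CHANGED
print(combined, describe_reacting_center(combined))
```

How such a value would be attached to a CML bond is left open in the thread, so the sketch deliberately stops at decoding the integer.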
|
From: Peter Murray-R. <pm...@ca...> - 2011-04-10 09:20:06
|
On Sat, Apr 9, 2011 at 11:05 PM, Dmitry Pavlov <dp...@gg...> wrote: > Hello, > > I was wondering where I can get any examples on > the following features of CML. Also, links to software > (chemical editors/converters) that support these > features would be very useful. I tried MarvinSketch -- > it does not seem to do any of these well. > Thanks very much. I can't answer for Marvin Sketch - it's a commercial product and I don't have the code or docs. I have recently corresponded with them about how they can emit compliant CML. > > 1. AAM numbers in a separate section referenced by > "atomMap". This means the cases where USE_IDS is > not sufficient. > > Please explain in more detail - are these Marvin concepts? > 2. Reaction centers (known from Rxnfile notation). > I was not able to find them in the CML 2.4 spec... > Reaction centers can be done in CML. There are different approaches to reactions and I'd be interested in knowing what you wish to support. It could either be through a series of generic atoms (using elementType="R") or through a cml:map element or both. > > 3. Polymers. I read about the headType and tailType, > but it would be more understandable with an example. > What is, for instance, the CML representation for > polystyrene? > Polystyrene is described in http://pubs.acs.org/doi/abs/10.1021/ci8002123 - I hope to post this Openly soon. > > > Thanks very much. I would appreciate any pointers. > > This is great. CML is a flexible language and there are different styles of doing some things. We are now encapsulating these as "conventions" - see for example http://www.xml-cml.org where we have developed conventions for dictionaries and molecules. Now is the time to do reactions. I am delighted to see that you offer Open Source software and - within reason - would be happy to help with showing how this can be made valid within one or more CML validation frameworks. P. 
> Best regards, > > Dmitry > -- Peter Murray-Rust Reader in Molecular Informatics Unilever Centre, Dep. Of Chemistry University of Cambridge CB2 1EW, UK +44-1223-763069 |
|
From: Dmitry P. <dp...@gg...> - 2011-04-09 21:57:57
|
Hello, I was wondering where I can get any examples on the following features of CML. Also, links to software (chemical editors/converters) that support these features would be very useful. I tried MarvinSketch -- it does not seem to do any of these well. 1. AAM numbers in a separate section referenced by "atomMap". This means the cases where USE_IDS is not sufficient. 2. Reaction centers (known from Rxnfile notation). I was not able to find them in the CML 2.4 spec... 3. Polymers. I read about the headType and tailType, but it would be more understandable with an example. What is, for instance, the CML representation for polystyrene? Thanks very much. I would appreciate any pointers. Best regards, Dmitry |
|
From: Peter Murray-R. <pm...@ca...> - 2010-11-15 14:32:19
|
On Mon, Nov 15, 2010 at 1:55 PM, Dmitry Pavlov <dp...@gg...> wrote: > Hello Peter, > > Thank you for your answer. > > The origin of my question was primarily the support > of the CML format in the "inchi-1" utility. > Basically, I need to process structures represented > as Molfiles and (the same) structures represented > as CML files and I need to be sure that the results > will match. > The initial choice of Molfiles as the entry tool for InChI was - I think - unfortunate. The wedge/hatch bonds have some disadvantages: * they have no meaning when there are no coordinates * it is possible to draw structures where the stereochemistry is ambiguous or contradictory (e.g. a wedge and hatch trans to each other on carbons) * there was an attempt to use the bonds as descriptions of perspective. It is almost impossible for a computer (and for many humans) to interpret perspective and it is often wrong. Therefore I regard wedge and hatch bonds as potentially misleading. SMILES generally got the right idea - with atom chirality @@ and bond stereo /\ unrelated to coordinates. > So far I have fed Molfiles to this utility. > As far as I have learned, it does recognize > "undefined" atom stereo if no stereo bonds are drawn, > and marking a bond as "either" has no effect -- the > InChI code is the same, and will be the same for > the CML file with unmarked stereo bonds. > So yes, "either" bonds are not useful in this case. > However, there are other cases: > > 1) Representing undefined stereo in 3D structures. > The InChI Manual says: "[...] (wavy) bonds in the > 3-dimensional case still provide “unknown” stereochemistry > even if the coordinates allow calculation of the sp > parity." I wonder what is your opinion about this: > maybe the "atomParity=0" CML property means the > same in the 3D case? I am not really sure. > This is a difficult area. As an example, what is Harry Truman's middle name? 
"Harry Truman" could mean: * he has no middle name * he has a middle name but has omitted to give it * he has a middle name but it is not publicly known (Actually he has a middle initial S but no middle name - http://en.wikipedia.org/wiki/Harry_S_Truman ) The point is that different people can read different things into the absence of something. Chemists use this in their language about stereochemistry. If atomParity is 0 then one interpretation is that the carbon or other atom is flat. CML will attempt to replicate what the community agrees - I am not yet sure whether there is agreement. Even atomParity = 1 may only mean that the relative stereochemistry in a molecule is known. We don't have a good language for describing racemic mixtures and mixtures of diastereomers. > > 2) Connecting "either" bonds to a stereoatom > that already has "up" and/or "down" bonds > connected discards the effect of these bonds > and makes the stereo configuration undefined. > I am not sure that this use case makes any sense, > though. > I agree that it is unclear. > > 3) An "either" bond connected to a cis-trans > double bond makes this bond "undefined cis-trans". > This makes perfect sense for Molfiles, and I > wonder if there is a way (for instance, > setting "bondStereo=0" or otherwise) to represent > "undefined cis-trans" bond for 2D structures in > the CML format. > > If you omit it then there is the same implicit statement that you know nothing about it. The problem with double bonds is that you cannot give neutral information with a plane diagram - it implies some sort of stereochemistry whereas atom centres do not. > With best regards, > > Dmitry Pavlov > > > This is an important conversation. If there are clear IUPAC guidelines that CML can support then we can try. P. -- Peter Murray-Rust Reader in Molecular Informatics Unilever Centre, Dep. Of Chemistry University of Cambridge CB2 1EW, UK +44-1223-763069 |
|
From: Dmitry P. <dp...@gg...> - 2010-11-15 13:53:34
|
Hello Peter, Thank you for your answer. The origin of my question was primarily the support of the CML format in the "inchi-1" utility. Basically, I need to process structures represented as Molfiles and (the same) structures represented as CML files and I need to be sure that the results will match. So far I have fed Molfiles to this utility. As far as I have learned, it does recognize "undefined" atom stereo if no stereo bonds are drawn, and marking a bond as "either" has no effect -- the InChI code is the same, and will be the same for the CML file with unmarked stereo bonds. So yes, "either" bonds are not useful in this case. However, there are other cases: 1) Representing undefined stereo in 3D structures. The InChI Manual says: "[...] (wavy) bonds in the 3-dimensional case still provide “unknown” stereochemistry even if the coordinates allow calculation of the sp parity." I wonder what is your opinion about this: maybe the "atomParity=0" CML property means the same in the 3D case? I am not really sure. 2) Connecting "either" bonds to a stereoatom that already has "up" and/or "down" bonds connected discards the effect of these bonds and makes the stereo configuration undefined. I am not sure that this use case makes any sense, though. 3) An "either" bond connected to a cis-trans double bond makes this bond "undefined cis-trans". This makes perfect sense for Molfiles, and I wonder if there is a way (for instance, setting "bondStereo=0" or otherwise) to represent "undefined cis-trans" bond for 2D structures in the CML format. With best regards, Dmitry Pavlov On 11/14/2010 04:08 AM, Peter Murray-Rust wrote: > > > On Sat, Nov 13, 2010 at 11:36 PM, Dmitry Pavlov <dp...@gg... > <mailto:dp...@gg...>> wrote: > > Hello all, > > I am wondering if the CML format has (or will have) > the possibility to encode the "wavy" single bonds, > which usually mean "either" stereo configuration > on an atom? > > Thanks for the question and interest. 
> CML is designed primarily to represent precise knowledge of the > chemistry. The concept of "either" is not universally agreed. It could > mean "unknown", or "a mixture of compounds" or (as often happens) not > recorded (which is not the same as unknown). It could mean "we know > that there is only one configuration at this stereocentre but we don't > know what it is". I don't believe there is agreement in the community > about the semantics although the IUPAC group has been working on it. > If you mean "both" then you are describing a mixture of compounds. The > convention that mixtures can be encoded by fuzzy bonds is not easy to > support with software. It is best described by using symbols and we have > done this for some systems and if the community agrees on > stereochemistry may use this approach. (It is not easy to describe a > mixture of diastereomers with several centres). > The concept of "unknown" can be modelled by omitting the information. > (JUMBO would set this to null). There are many other cases where > information can be omitted to the same effect. > You are allowed to create your own conventions as long as they are > labelled with the @convention attribute - this could be used for bondStereo. > The idea of a convention is that everyone agrees it and builds software > that processes it in the same way. If this is actually the case - and > there are validating systems then a case can be made. > Please mail again if this is not clear. > > > Best regards, > > Dmitry Pavlov > 
> > -- > Peter Murray-Rust > Reader in Molecular Informatics > Unilever Centre, Dep. Of Chemistry > University of Cambridge > CB2 1EW, UK > +44-1223-763069 |
|
From: Peter Murray-R. <pm...@ca...> - 2010-11-14 01:09:15
|
On Sat, Nov 13, 2010 at 11:36 PM, Dmitry Pavlov <dp...@gg...> wrote: > Hello all, > > I am wondering if the CML format has (or will have) > the possibility to encode the "wavy" single bonds, > which usually mean "either" stereo configuration > on an atom? > Thanks for the question and interest. CML is designed primarily to represent precise knowledge of the chemistry. The concept of "either" is not universally agreed. It could mean "unknown", or "a mixture of compounds" or (as often happens) not recorded (which is not the same as unknown). It could mean "we know that there is only one configuration at this stereocentre but we don't know what it is". I don't believe there is agreement in the community about the semantics although the IUPAC group has been working on it. If you mean "both" then you are describing a mixture of compounds. The convention that mixtures can be encoded by fuzzy bonds is not easy to support with software. It is best described by using symbols and we have done this for some systems and if the community agrees on stereochemistry may use this approach. (It is not easy to describe a mixture of diastereomers with several centres). The concept of "unknown" can be modelled by omitting the information. (JUMBO would set this to null). There are many other cases where information can be omitted to the same effect. You are allowed to create your own conventions as long as they are labelled with the @convention attribute - this could be used for bondStereo. The idea of a convention is that everyone agrees it and builds software that processes it in the same way. If this is actually the case - and there are validating systems then a case can be made. Please mail again if this is not clear. 
> > Best regards, > > Dmitry Pavlov > -- Peter Murray-Rust Reader in Molecular Informatics Unilever Centre, Dep. Of Chemistry University of Cambridge CB2 1EW, UK +44-1223-763069 |
|
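The reply above says that "unknown" can be modelled by omitting the information, and that JUMBO would set the value to null. A reader might mirror that with a sketch like the one below. It is only an illustration under assumptions: the fragment, the namespace URI `http://www.xml-cml.org/schema`, the child-element form of `bondStereo`, and the helper `bond_stereo` are chosen for the example and are not taken from the thread.

```python
import xml.etree.ElementTree as ET

# Assumed CML namespace URI for this illustration.
CML_NS = "http://www.xml-cml.org/schema"

# Illustrative fragment: b1 records cis stereo ("C"), b2 records nothing at all.
DOC = f"""
<molecule xmlns="{CML_NS}">
  <bondArray>
    <bond id="b1" atomRefs2="a1 a2" order="D"><bondStereo>C</bondStereo></bond>
    <bond id="b2" atomRefs2="a3 a4" order="D"/>
  </bondArray>
</molecule>
"""


def bond_stereo(bond):
    """Return the recorded bondStereo text, or None when it was omitted."""
    child = bond.find(f"{{{CML_NS}}}bondStereo")
    return None if child is None else (child.text or "").strip()


root = ET.fromstring(DOC)
stereo = {b.get("id"): bond_stereo(b) for b in root.iter(f"{{{CML_NS}}}bond")}
print(stereo)
```

Returning `None` rather than a default such as "either" keeps "nothing was stated" distinct from any positive claim about the stereochemistry, which is the distinction the reply draws.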
From: Dmitry P. <dp...@gg...> - 2010-11-13 23:46:43
|
Hello all, I am wondering if the CML format has (or will have) the possibility to encode the "wavy" single bonds, which usually mean "either" stereo configuration on an atom? Best regards, Dmitry Pavlov |
|
From: Egon W. <ego...@gm...> - 2010-07-06 05:49:49
|
Hi all, someone asked about the up to date CMLReact schema on the Blue Obelisk eXchange: http://blueobelisk.shapado.com/questions/where-to-find-cmlreact-updated-schema Egon -- Post-doc @ Uppsala University Proteochemometrics / Bioclipse Group of Prof. Jarl Wikberg Homepage: http://egonw.github.com/ Blog: http://chem-bla-ics.blogspot.com/ PubList: http://www.citeulike.org/user/egonw/tag/papers |
|
From: Hong C. <ho...@eb...> - 2010-06-14 12:46:47
|
Hello! We are planning to use the CMLReact 1.0 schema to encode the response of the EBI's Rhea Web Service. I downloaded the schema from http://cml.sourceforge.net/ where it said "This schema is about to be submitted for publication. It is assigned the namespace http://www.xml-cml.org/schema/cml2/react" while in the xsd file the target namespace is declared to be http://www.xml-cml.org/schema/cml2/core. When I looked into the CMLReact example file, the default namespace is http://www.xml-cml.org/schema/cml2/core and there is no reference to the CMLReact schema. I'd like to know if the CMLReact schema and example files on SourceForge are up to date? Best regards, Hong Cao |
|
From: Peter Murray-R. <pm...@ca...> - 2010-05-10 20:48:06
|
On Mon, May 10, 2010 at 7:26 PM, Konstantin Tokarev <an...@ya...> wrote: > Hello everyone! > Some time ago I've asked if there's some agreement on saving these > properties in CML files: > You can store any scalar quantity you like in CML files, using your own attributes and elements. For example: <atom xmlns:kt="http://yandex.ru/annulen" kt:color="blue"/> This will not clash with anyone else - you own the namespace. Then you should work with the BlueObelisk to see if you can find others interested in synthesising the approaches. You might then develop a chemicalColourML which the community feels is worth adopting. The technology is simple - it's getting general agreement that is the challenge. > * Custom label of atom (independent from elementType) > * Custom color of atom (RGB) > * Background color of scene (maybe including gradient filling description) > * Custom atomic radius (in Angstroms) > * Custom directions of cartesian axes CML has tools for coordinates and vectors. > > AFAIU, such agreement doesn't exist. I think it's very important to > develop agreement or dictionary to make open source > chemical software more interoperable. I agree. > > Any suggestions? Create some examples - and software that processes them. See what Jmol does. Mail on the BO list. Keep trying. It will be hard work but possible. P. -- Peter Murray-Rust Reader in Molecular Informatics Unilever Centre, Dep. Of Chemistry University of Cambridge CB2 1EW, UK +44-1223-763069 |
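As a rough illustration of the namespaced-attribute suggestion in the reply above, the sketch below reads the `kt:color` attribute back with Python's standard `xml.etree.ElementTree`. The namespace URI and attribute come from the message's example; the `id` and `elementType` values are added here only to make the fragment look like a realistic atom and are assumptions.

```python
import xml.etree.ElementTree as ET

# Namespace URI from the example in the message; the prefix "kt" is arbitrary.
KT_NS = "http://yandex.ru/annulen"

# The id and elementType values are illustrative additions.
DOC = (
    '<atom xmlns:kt="http://yandex.ru/annulen" '
    'id="a1" elementType="C" kt:color="blue"/>'
)

atom = ET.fromstring(DOC)

# ElementTree exposes namespaced attributes under the key "{uri}localname".
color = atom.get(f"{{{KT_NS}}}color")

# A tool that does not know the kt namespace can still list the foreign
# attributes and ignore them, leaving the plain attributes untouched.
foreign = {k: v for k, v in atom.attrib.items() if k.startswith("{")}

print(color)
print(foreign)
```

Because the lookup is keyed on the namespace URI rather than the prefix, another producer could bind the same URI to a different prefix without breaking this reader.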
|
From: Konstantin T. <an...@ya...> - 2010-05-10 18:26:51
|
Hello everyone! Some time ago I asked if there's some agreement on saving these properties in CML files: * Custom label of atom (independent from elementType) * Custom color of atom (RGB) * Background color of scene (maybe including gradient filling description) * Custom atomic radius (in Angstroms) * Custom directions of cartesian axes AFAIU, such agreement doesn't exist. I think it's very important to develop an agreement or dictionary to make open source chemical software more interoperable. Any suggestions? -- Regards, Konstantin |
|
From: Peter Murray-R. <pm...@ca...> - 2010-03-21 15:20:42
|
On Sat, Mar 20, 2010 at 8:52 AM, Egon Willighagen < ego...@gm...> wrote: > Hi Sam, > > On Fri, Mar 12, 2010 at 4:09 PM, Sam Adams <se...@ca...> wrote: > > <atom elementType="O"> > > <atomType dictRef="cml:mol2">O.co2</atomType> > > <atomType dictRef="cml:mmff94">O=CO</atomType> > > </atom> > > I always understood that the dictRef was a deep link... not pointing > to a particular dictionary, but to the matching entry in the > dictionary... I would have expected something like: > > <atom elementType="O"> > <atomType dictRef="mol2:Oco2">O.co2</atomType> > <atomType dictRef="mmff94:Oco2">O=CO</atomType> > </atom> > Essentially the dictRef value (mol2:bar) is equivalent to a URI http://www.foo.com/mol2#bar The world is split between whether a URI is purely a name or whether it is also an address. I was inducted to the W3C philosophy that names and addresses are separate. Tim wishes to conflate them. I am now relaxed about this. So you can interpret <atomType xmlns:mmff94="http://mmff94.org/dict" dictRef="mmff94:Oco2">O=CO</atomType> either as a statement that "there is a defined uniqueId in the mmff94 namespace (http://mmff94.org/dict) with value Oco2. There may or may not be an accessible dictionary entry but there should be at least the concept of one" or "there is a dictionary at http://mmff94.org/dict with an entry http://mmff94.org/dict#Oco2 and this is of the form <cml:entry id="Oco2">...</cml:entry> " I think the latter is most useful if we can manage it. I absolutely agree that the BO should try to support and systematize this. All the Chem4Word material (code, schemas, dictionaries) etc. will be Open Source/Data/Standard. They may not always be robust but it's best endeavour. Where there are existing BO dictionaries then Chem4Word will be informed by them. P. -- Peter Murray-Rust Reader in Molecular Informatics Unilever Centre, Dep. Of Chemistry University of Cambridge CB2 1EW, UK +44-1223-763069 |
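The dictRef-to-URI equivalence described in the reply above can be sketched in a few lines. This is a simplified illustration rather than how any CML toolkit does it: the helper `dictref_uris` is hypothetical, it keeps one flat prefix map instead of tracking namespace scope, and it joins namespace and entry id with `#` only because that is the form used in the message.

```python
import io
import xml.etree.ElementTree as ET

# The element quoted in the message above.
DOC = (
    '<atomType xmlns:mmff94="http://mmff94.org/dict" '
    'dictRef="mmff94:Oco2">O=CO</atomType>'
)


def dictref_uris(xml_text):
    """Return (dictRef, expanded URI) pairs for elements carrying a dictRef."""
    prefixes = {}  # flat prefix -> namespace URI map (ignores scoping)
    results = []
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "start")):
        if event == "start-ns":
            # Namespace declarations are reported before the element's start.
            prefix, uri = payload
            prefixes[prefix] = uri
        else:
            ref = payload.get("dictRef")
            if ref is not None:
                prefix, _, local = ref.partition(":")
                # Raises KeyError if the prefix was never declared.
                results.append((ref, f"{prefixes[prefix]}#{local}"))
    return results


print(dictref_uris(DOC))
```

Whether the resulting URI is then dereferenced to fetch a dictionary entry, or treated purely as a name, is exactly the choice the message leaves open.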
|
From: Peter Murray-R. <pm...@ca...> - 2010-03-21 14:03:21
|
Thanks Andrew! I stand corrected and that's an excellent exposition. -- Peter Murray-Rust Reader in Molecular Informatics Unilever Centre, Dep. Of Chemistry University of Cambridge CB2 1EW, UK +44-1223-763069 |
|
From: Peter Murray-R. <pm...@ca...> - 2010-03-21 14:00:55
|
On Sun, Mar 21, 2010 at 1:25 PM, Konstantin Tokarev <an...@ya...> wrote: > OK. More general question: what is the benefit of dictionaries? > > XML has its own "dictionaries": dtd, xml schema. But you actually create > new language on top of XML which complicates readability not only by humans, > but by programs too. Why not to keep things simple? > Because it is not simple to represent science to a computer! It's easy to write: dipole="1.2" "everyone knows" that this is a float and that the units are Debye. But machines don't know. To them it's the same as: version="1.2" So at this stage we have to indicate the dataType and the units or we have to guess. In CML we don't guess - we make it explicit. What does "dipole" mean? Does it mean the absolute magnitude of the dipole? Probably, but not certainly. What does: aromatic="true" mean? Unless you have an algorithm defining "aromatic", different people will use different definitions. And so on. The dictionaries are isomorphic with RDF and ontologies - indeed it's possible to transform CML+dictRef into RDF+ontologies algorithmically. RDF and ontologies are verbose and not very human-readable but they are the best the world has got. P. > > -- > Regards, > Konstantin > -- Peter Murray-Rust Reader in Molecular Informatics Unilever Centre, Dep. Of Chemistry University of Cambridge CB2 1EW, UK +44-1223-763069 |
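The point above about stating dataType and units explicitly, rather than guessing, can be illustrated with a small sketch. Everything in it is an assumption made for the example: the un-namespaced `<scalar>` elements, the `x:` and `u:` prefixes, the entry names, and the `CONVERTERS` table are hypothetical and are not drawn from a published CML dictionary.

```python
import xml.etree.ElementTree as ET

# Two values that look identical as text ("1.2") but mean different things.
# Prefixes and entry names are hypothetical placeholders.
DOC = """
<list>
  <scalar dictRef="x:dipole" dataType="xsd:double" units="u:debye">1.2</scalar>
  <scalar dictRef="x:version" dataType="xsd:string">1.2</scalar>
</list>
"""

# Assumed mapping from declared dataType to a Python conversion.
CONVERTERS = {"xsd:double": float, "xsd:integer": int, "xsd:string": str}


def read_scalar(el):
    """Convert a scalar using its declared dataType instead of guessing."""
    convert = CONVERTERS[el.get("dataType")]
    return el.get("dictRef"), convert(el.text.strip()), el.get("units")


values = [read_scalar(s) for s in ET.fromstring(DOC).iter("scalar")]
print(values)
```

With the declarations in place the program never has to infer that one "1.2" is a number with units and the other is a version string; what "dipole" itself means would still come from the dictionary entry behind the dictRef.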
|
From: Konstantin T. <an...@ya...> - 2010-03-21 13:25:45
|
OK. More general question: what is the benefit of dictionaries? XML has its own "dictionaries": dtd, xml schema. But you actually create new language on top of XML which complicates readability not only by humans, but by programs too. Why not to keep things simple? -- Regards, Konstantin |