From: Rajarshi G. <rg...@in...> - 2006-09-24 04:04:12
Hi,

I'm slightly worried about atomic data like polarizabilities, electronegativities and so on.

I was looking at Todd's descriptor code and he is using these atomic property values, for example polarizability, derived from DRAGON.

The BPol descriptor that is currently in the CDK uses a set of polarizability values whose source I don't know.

There is also a Polarizability class that calculates polarizabilities.

So which one should be used?

I can see the value of having the Polarizability class. But for fixed polarizability values (like in BPol or Todd's descriptor code) we need a common set of values.

So my proposal is that

1. we use BO data for polarizability and electronegativity values that are not calculated
2. the Isotope class be modified to provide methods:

   getPolarizability()
   getElectronegativity()

Now I see that there are two config files: chemicalElements.xml and isotopes.xml

Are these derived from BO data? I assume not. If we want to be consistent it makes sense to rework these using BO data. Moreover, where would specific data go?

So for example, the BODR has data for radii, electronegativity (but not polarizability). Would these go into isotopes? elements?

I would actually think this data should go into elements rather than isotopes.

Opinions?

-------------------------------------------------------------------
Rajarshi Guha <rg...@in...>
GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04 06F7 1BB9 E634 9B87 56EE
-------------------------------------------------------------------
Eureka!
-- Archimedes
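The "common set of values" idea in the message above could be sketched roughly as follows. This is only an illustration under stated assumptions: `ElementPropertyTable` and `electronegativity(...)` are hypothetical names, not existing CDK classes or methods, and the few Pauling electronegativities are hard-coded here where a real table would presumably be loaded from a curated source such as the BODR.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical single source of fixed per-element values; not a CDK class. */
final class ElementPropertyTable {

    // A few Pauling electronegativities, hard-coded for illustration only.
    // Assumption: a real implementation would load these from a data file.
    private static final Map<String, Double> PAULING_ELECTRONEGATIVITY = Map.of(
            "H", 2.20,
            "C", 2.55,
            "N", 3.04,
            "O", 3.44);

    private ElementPropertyTable() {
        // static lookups only
    }

    /** Returns the tabulated value, or empty if the symbol is not in the table. */
    static Optional<Double> electronegativity(String symbol) {
        return Optional.ofNullable(PAULING_ELECTRONEGATIVITY.get(symbol));
    }

    public static void main(String[] args) {
        // Every descriptor would ask the same table instead of keeping its own copy.
        System.out.println(electronegativity("C").orElse(Double.NaN));
        System.out.println(electronegativity("Xx").isPresent());
    }
}
```

Returning an `Optional` instead of a default number keeps "no tabulated value" distinct from a real value, which matters for properties that the data source does not cover.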
From: Egon W. <e.w...@sc...> - 2006-09-24 07:00:24
On Sunday 24 September 2006 06:04, Rajarshi Guha wrote:
> Hi, I'm slightly worried about atomic data like polarizabilities,
> electronegativities and so on.
>
> I was looking at Todd's descriptor code and he is using these atomic
> property values, for example polarizability, derived from DRAGON.
>
> The BPol descriptor that is currently in the CDK uses a set of
> polarizability values whose source I don't know.

You can file that as a bug.

> There is also a Polarizability class that calculates polarizabilities.
>
> So which one should be used?

CDK does allow for alternatives... I do not think we have an agreed-upon way of marking alternatives. Just @see would be an option, but not that semantically rich... pointing to a dictionary might be a good option here, because we could then have a reasoner summarize such things.

> I can see the value of having the Polarizability class. But for fixed
> polarizability values (like in BPol or Todd's descriptor code) we need a
> common set of values.
>
> So my proposal is that
>
> 1. we use BO data for polarizability and electronegativity values that
> are not calculated

Yes, I think that that's the way forward too.

> 2. the Isotope class be modified to provide methods:
>
> getPolarizability()
> getElectronegativity()

Are these properties of an isotope? Not of an element or atom type?

> Now I see that there are two config files: chemicalElements.xml and
> isotopes.xml
>
> Are these derived from BO data? I assume not. If we want to be
> consistent it makes sense to rework these using BO data. Moreover, where
> would specific data go?

Yes, that's been a project management task for some time now.

> So for example, the BODR has data for radii, electronegativity (but not
> polarizability). Would these go into isotopes? elements?
>
> I would actually think this data should go into elements rather than
> isotopes.

Yes, likely.

Integration of BO data is important. I did not have time for this yet, so I filed the PMT.

Egon

-- 
e.w...@sc...
Cologne University Bioinformatics Center (CUBIC)
Blog: http://chem-bla-ics.blogspot.com/
GPG: 1024D/D6336BA6
From: Rajarshi G. <rg...@in...> - 2006-09-25 00:52:36
On Sun, 2006-09-24 at 08:57 +0200, Egon Willighagen wrote:
> On Sunday 24 September 2006 06:04, Rajarshi Guha wrote:
> > Now I see that there are two config files: chemicalElements.xml and
> > isotopes.xml
> >
> > Are these derived from BO data? I assume not. If we want to be
> > consistent it makes sense to rework these using BO data. Moreover, where
> > would specific data go?
>
> Yes, that's been a project management task for some time now.
>
> > So for example, the BODR has data for radii, electronegativity (but not
> > polarizability). Would these go into isotopes? elements?
> >
> > I would actually think this data should go into elements rather than
> > isotopes.
>
> Yes, likely.
>
> Integration of BO data is important. I did not have time for this yet, so I
> filed the PMT.

I was looking at the BODR and elements.xml in particular.

From what I understand, integrating the element data from the BODR will require some reworking of the ElementPTHandler class.

As a result of using the BODR data it seems that it'd be a good idea to add some more methods to Element (or PeriodicTableElement?):

getDensity()
getElectronAffinity()
getElectronegativity()
getBP()
getMP()

Does this sound OK?

Are there any major issues with updating chemicalElements.xml that I am missing?

The result is that code that uses its own fixed values of these properties could be refactored to use the above methods. Obviously this would not apply to code that actually evaluates a property (such as polarizability).

However I do see a problem: in the above I assumed that the methods would be added to Element, seeing as that is the base of everything. However I'm not exactly sure that that is right, since chemicalElements.xml is loaded by ElementPTFactory. So getting an element via IsotopeFactory or Elements will not allow configuration using the BO data, which appears to be exclusively loaded in ElementPTFactory.

Is this a correct analysis?

I am still confused as to why there is a PeriodicTableElement class in the first place and why we don't simply do all the stuff done in that class in Element itself (with a corresponding ElementFactory).

Regarding the CAS numbers that are currently in chemicalElements.xml in the CDK repo: can those numbers be included in the BODR repo? Or must they be included as a CDK-specific thing (so merged into the BODR data on the CDK side)?

-------------------------------------------------------------------
Rajarshi Guha <rg...@in...>
GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04 06F7 1BB9 E634 9B87 56EE
-------------------------------------------------------------------
CChheecckk yyoouurr dduupplleexx sswwiittcchh..
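The concern in the message above is that element data loaded by one factory is not visible to elements obtained through another. One possible way around that, sketched here with entirely hypothetical names (`ElementData`, `ElementDataSource`, `SketchIsotopeFactory` and `SketchPeriodicTableFactory` are stand-ins, not the CDK's actual classes), is to load the data in one place and have every factory delegate to it. The values are hard-coded where a real loader would presumably parse a data file.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical immutable element record; field names are illustrative. */
final class ElementData {
    final String symbol;
    final int atomicNumber;
    final double electronegativity;

    ElementData(String symbol, int atomicNumber, double electronegativity) {
        this.symbol = symbol;
        this.atomicNumber = atomicNumber;
        this.electronegativity = electronegativity;
    }
}

/** The single place where element data is loaded. */
final class ElementDataSource {
    private static final Map<String, ElementData> BY_SYMBOL = new HashMap<>();

    static {
        // Assumption: a real loader would parse an XML data file here.
        register(new ElementData("H", 1, 2.20));
        register(new ElementData("C", 6, 2.55));
    }

    private ElementDataSource() {
    }

    private static void register(ElementData element) {
        BY_SYMBOL.put(element.symbol, element);
    }

    /** Returns the record for the symbol, or null if it is unknown. */
    static ElementData lookup(String symbol) {
        return BY_SYMBOL.get(symbol);
    }
}

/** Stand-in factory; delegates to the shared source. */
final class SketchIsotopeFactory {
    static ElementData getElement(String symbol) {
        return ElementDataSource.lookup(symbol);
    }
}

/** Second stand-in factory; sees exactly the same data. */
final class SketchPeriodicTableFactory {
    static ElementData getElement(String symbol) {
        return ElementDataSource.lookup(symbol);
    }
}

final class SharedSourceDemo {
    public static void main(String[] args) {
        ElementData a = SketchIsotopeFactory.getElement("C");
        ElementData b = SketchPeriodicTableFactory.getElement("C");
        System.out.println(a == b);              // both factories return the same record
        System.out.println(a.electronegativity); // value comes from the shared source
    }
}
```

With this arrangement it no longer matters which factory an element comes from; whether the existing factories can be restructured this way is a separate question that the sketch does not answer.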
From: Egon W. <e.w...@sc...> - 2006-09-25 06:46:49
On Monday 25 September 2006 02:52, Rajarshi Guha wrote:
> On Sun, 2006-09-24 at 08:57 +0200, Egon Willighagen wrote:
> > Integration of BO data is important. I did not have time for this yet, so
> > I filed the PMT.
>
> I was looking at the BODR and elements.xml in particular.

The isotope information is most important at this moment I think.

> From what I understand, integrating the element data from the BODR will
> require some reworking of the ElementPTHandler class.
>
> As a result of using the BODR data it seems that it'd be a good idea to
> add some more methods to Element (or PeriodicTableElement?):
>
> getDensity()
> getElectronAffinity()
> getElectronegativity()
> getBP()
> getMP()
>
> Does this sound OK?

Actually not. An element does not have a BP/MP/density, which are properties of crystal structures. Electron affinity and negativity might go in as 'property'.

> Are there any major issues with updating chemicalElements.xml that I am
> missing?

No, don't think so.

> The result is that code that uses its own fixed values of these
> properties could be refactored to use the above methods. Obviously this
> would not apply to code that actually evaluates a property (such as
> polarizability).
>
> However I do see a problem: in the above I assumed that the methods
> would be added to Element, seeing as that is the base of everything.
> However I'm not exactly sure that that is right, since
> chemicalElements.xml is loaded by ElementPTFactory. So getting an
> element via IsotopeFactory or Elements will not allow configuration
> using the BO data, which appears to be exclusively loaded in
> ElementPTFactory.
>
> Is this a correct analysis?

The element info is not so important to the CDK lib; I don't think we have much to do with compound properties. That is, I don't think any algorithm depends on it...

More importantly, we need to update the IsotopeFactory to make use of the BO data, instead of our own.

> I am still confused as to why there is a PeriodicTableElement class in
> the first place and why we don't simply do all the stuff done in that
> class in Element itself (with a corresponding ElementFactory).

See above. Moreover, the PTElement even has stuff like group etc.

> Regarding the CAS numbers that are currently in chemicalElements.xml in
> the CDK repo: can those numbers be included in the BODR repo? Or must
> they be included as a CDK-specific thing (so merged into the BODR data
> on the CDK side)?

Not sure... anyone?

Egon

-- 
e.w...@sc...
Cologne University Bioinformatics Center (CUBIC)
Blog: http://chem-bla-ics.blogspot.com/
GPG: 1024D/D6336BA6
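The suggestion above that electron affinity and electronegativity "might go in as 'property'" could look something like the following: a generic key/value map on the element instead of one dedicated getter per quantity. This is a self-contained sketch under assumptions. `SketchElement` and the key constants are hypothetical and are not the CDK's own classes; the one value used (Pauling electronegativity of oxygen, 3.44) is there only to exercise the code.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical element carrying optional values in a generic property map. */
final class SketchElement {

    // Hypothetical property keys; a shared dictionary could define these instead.
    static final String ELECTRONEGATIVITY = "electronegativity";
    static final String ELECTRON_AFFINITY = "electronAffinity";

    private final String symbol;
    private final Map<String, Object> properties = new HashMap<>();

    SketchElement(String symbol) {
        this.symbol = symbol;
    }

    String getSymbol() {
        return symbol;
    }

    /** Stores or overwrites a property value under the given key. */
    void setProperty(String key, Object value) {
        properties.put(key, value);
    }

    /** Returns the stored value, or null if the property was never set. */
    Object getProperty(String key) {
        return properties.get(key);
    }

    public static void main(String[] args) {
        SketchElement oxygen = new SketchElement("O");
        oxygen.setProperty(ELECTRONEGATIVITY, 3.44);

        System.out.println(oxygen.getProperty(ELECTRONEGATIVITY));
        System.out.println(oxygen.getProperty(ELECTRON_AFFINITY)); // never set, so null
    }
}
```

The trade-off is the usual one: a property map needs no API change when a new quantity is added, but callers lose compile-time checking of keys and value types.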
From: Rajarshi G. <rg...@in...> - 2006-09-25 12:58:12
On Mon, 2006-09-25 at 08:43 +0200, Egon Willighagen wrote:
> On Monday 25 September 2006 02:52, Rajarshi Guha wrote:
> > On Sun, 2006-09-24 at 08:57 +0200, Egon Willighagen wrote:
> > > Integration of BO data is important. I did not have time for this yet, so
> > > I filed the PMT.
> >
> > I was looking at the BODR and elements.xml in particular.
>
> The isotope information is most important at this moment I think.

That should not be too difficult to include.

> > getDensity()
> > getElectronAffinity()
> > getElectronegativity()
> > getBP()
> > getMP()
> >
> > Does this sound OK?
>
> Actually not. An element does not have a BP/MP/density, which are properties
> of crystal structures.

Aah, correct.

> > The result is that code that uses its own fixed values of these
> > properties could be refactored to use the above methods. Obviously this
> > would not apply to code that actually evaluates a property (such as
> > polarizability).
> >
> > However I do see a problem: in the above I assumed that the methods
> > would be added to Element, seeing as that is the base of everything.
> > However I'm not exactly sure that that is right, since
> > chemicalElements.xml is loaded by ElementPTFactory. So getting an
> > element via IsotopeFactory or Elements will not allow configuration
> > using the BO data, which appears to be exclusively loaded in
> > ElementPTFactory.
> >
> > Is this a correct analysis?
>
> The element info is not so important to the CDK lib; I don't think we have
> much to do with compound properties. That is, I don't think any algorithm
> depends on it...

Well, when a descriptor asks for electronegativity or polarizability, the value should come from the CDK rather than being coded in the class or supplied by another data file. So I agree that compound properties can be ignored, but basic elemental info should be available.

> More importantly, we need to update the IsotopeFactory to make use of the BO
> data, instead of our own.

True.

> > I am still confused as to why there is a PeriodicTableElement class in
> > the first place and why we don't simply do all the stuff done in that
> > class in Element itself (with a corresponding ElementFactory).
>
> See above. Moreover, the PTElement even has stuff like group etc.

OK, but group info is relevant to Element itself. What is PTElement providing over Element that could not be in Element? PTElement does not include the compound-based properties (which in hindsight is correct), but it does provide info about group etc. Since elements are defined in the periodic table, doesn't it make sense to have Element store this info?

-------------------------------------------------------------------
Rajarshi Guha <rg...@in...>
GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04 06F7 1BB9 E634 9B87 56EE
-------------------------------------------------------------------
There's no problem so bad that you can't add some guilt to it to make it worse.
-Calvin
From: Rajarshi G. <rg...@in...> - 2006-09-25 18:37:01
On Mon, 2006-09-25 at 08:43 +0200, Egon Willighagen wrote:
> On Monday 25 September 2006 02:52, Rajarshi Guha wrote:
> > On Sun, 2006-09-24 at 08:57 +0200, Egon Willighagen wrote:
> > > Integration of BO data is important. I did not have time for this yet, so
> > > I filed the PMT.
> >
> > I was looking at the BODR and elements.xml in particular.
>
> The isotope information is most important at this moment I think.

Updated

-------------------------------------------------------------------
Rajarshi Guha <rg...@in...>
GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04 06F7 1BB9 E634 9B87 56EE
-------------------------------------------------------------------
A list is only as strong as its weakest link.
-- Don Knuth