From: Rajarshi G. <rg...@in...> - 2008-05-21 21:55:36
I filed this as a bug, but I think the behavior wrt atom type
configuring (via AtomTypeManipulator.configure) could do with some
discussion:

When I load a molecule from an SD file via MDLV2000Reader, the atoms
have a valid (i.e., non-null) value for the exact mass field. When I
then perceive and configure atom types via
AtomContainerManipulator.perceiveAndConfigureAtoms, the exact mass
field of all the atoms becomes NULL.

This is because the CDKAtomTypeMatcher finds a matching atom type and
the resulting IAtomType object has the exact mass field set to NULL.
So when AtomTypeManipulator.configure configures the current atom
with the matching type, it overwrites the exact mass field of the
original atom.

Why is the matcher returning a type with NULL for the exact mass?

More generally, what is the policy for configuring atoms? Right now,
we just overwrite any previous configuration. I think a more correct
approach would be for AtomTypeManipulator.configure to check whether
the fields of an atom are UNSET; if so, then do the configuration. If
the fields are not UNSET, don't do the configuration.

This implies that AtomTypeManipulator should have a method that will
'clear' the configuration related to atom types by setting the
appropriate fields to UNSET.

Rajarshi Guha <rg...@in...>
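The difference between the current overwrite behavior and the proposed "configure only UNSET fields" behavior can be sketched in a few lines of Java. This is an illustration only, not CDK code: SketchAtom, SketchAtomType and ConfigureSketch are hypothetical stand-ins, UNSET is modeled as null, and the choice of which fields clearConfiguration resets is an assumption.

```java
/** Hypothetical stand-in for an atom type; a null field means "unset". */
final class SketchAtomType {
    final String name;
    final Double exactMass;     // null when the typing scheme does not define it
    final Integer formalCharge;

    SketchAtomType(String name, Double exactMass, Integer formalCharge) {
        this.name = name;
        this.exactMass = exactMass;
        this.formalCharge = formalCharge;
    }
}

/** Hypothetical stand-in for an atom carrying the same configurable fields. */
final class SketchAtom {
    String atomTypeName;
    Double exactMass;
    Integer formalCharge;
}

final class ConfigureSketch {
    /** Behavior described in the thread: every field is overwritten, nulls included. */
    static void configureOverwriting(SketchAtom atom, SketchAtomType type) {
        atom.atomTypeName = type.name;
        atom.exactMass = type.exactMass;
        atom.formalCharge = type.formalCharge;
    }

    /** Proposed behavior: only fields that are still unset (null) are filled in. */
    static void configureUnsetOnly(SketchAtom atom, SketchAtomType type) {
        if (atom.atomTypeName == null) atom.atomTypeName = type.name;
        if (atom.exactMass == null) atom.exactMass = type.exactMass;
        if (atom.formalCharge == null) atom.formalCharge = type.formalCharge;
    }

    /** Proposed companion: reset the type-related fields to unset (assumed field set). */
    static void clearConfiguration(SketchAtom atom) {
        atom.atomTypeName = null;
        atom.exactMass = null;
        atom.formalCharge = null;
    }
}
```

With an atom whose exact mass was set by the reader and a matched type whose exact mass is null, configureOverwriting loses the mass while configureUnsetOnly keeps it. Calling clearConfiguration first gives the overwrite semantics back on demand.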
From: Egon W. <ego...@gm...> - 2008-06-28 12:26:38
Hi Rajarshi,

On Wed, May 21, 2008 at 11:55 PM, Rajarshi Guha <rg...@in...> wrote:
> I filed this as a bug, but I think the behavior wrt atom type
> configuring (via AtomTypeManipulator.configure) could do with some
> discussion:

Sorry for the major backlog...

> When I load a molecule from an SD file via MDLV2000Reader, the atoms
> have a valid (i.e., non-null) value for the exact mass field.

Where does that info come from? When was that set?

> When I then perceive and configure atom types via
> AtomContainerManipulator.perceiveAndConfigureAtoms, the exact mass
> field of all the atoms becomes NULL.

Mmmm... I see your problem.

> This is because the CDKAtomTypeMatcher finds a matching atom type and
> the resulting IAtomType object has the exact mass field set to NULL.

On the other hand, the original info might not be valid because of
some edit option...

> So when AtomTypeManipulator.configure configures the current atom
> with the matching type, it overwrites the exact mass field of the
> original atom.

Right...

> Why is the matcher returning a type with NULL for the exact mass?

Because that information requires info from IsotopeFactory...

> More generally, what is the policy for configuring atoms?

Set/reset all IAtomType fields...

> Right now, we just overwrite any previous configuration. I think a
> more correct approach would be for AtomTypeManipulator.configure to
> check whether the fields of an atom are UNSET; if so, then do the
> configuration. If the fields are not UNSET, don't do the
> configuration.

What if they are SET, but no longer valid?

I guess we need a few common scenarios, and separate methods for
those...

> This implies that AtomTypeManipulator should have a method that will
> 'clear' the configuration related to atom types by setting the
> appropriate fields to UNSET.

That's maybe a good idea... we could even already implement this as a
new alternative method configureMissingFields()...

Egon

-- 
http://chem-bla-ics.blogspot.com/
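The point that exact mass is isotope information rather than atom type information suggests a separate fill-in step after typing. The sketch below is a hedged illustration of that idea, not CDK's IsotopeFactory: IsotopeMassSketch, MAJOR_ISOTOPE_MASS and resolveExactMass are hypothetical names, and the small table of most-abundant-isotope masses (in u, rounded) stands in for whatever the real isotope data source supplies.

```java
import java.util.Map;

/** Hypothetical isotope lookup used to fill an exact mass that typing left unset. */
final class IsotopeMassSketch {
    // Masses of the most abundant isotopes in unified atomic mass units, rounded.
    static final Map<String, Double> MAJOR_ISOTOPE_MASS = Map.of(
            "H", 1.007825,
            "C", 12.0,
            "N", 14.003074,
            "O", 15.994915);

    /**
     * Keeps a mass that is already set (for example, one set by a file reader)
     * and otherwise falls back to the table; returns null for unknown symbols.
     */
    static Double resolveExactMass(String symbol, Double currentMass) {
        if (currentMass != null) return currentMass;
        if (symbol == null) return null;
        return MAJOR_ISOTOPE_MASS.get(symbol);
    }
}
```

In this arrangement the atom type never has to carry an exact mass at all, so a null in the matched type cannot clobber a value the atom already has. A carbon-13 mass set by the reader survives, while an unset carbon falls back to 12.0.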
From: Rajarshi G. <rg...@in...> - 2008-06-29 21:18:11
On Jun 28, 2008, at 8:26 AM, Egon Willighagen wrote:

>> When I load a molecule from an SD file via MDLV2000Reader, the atoms
>> have a valid (i.e., non-null) value for the exact mass field.
>
> Where does that info come from? When was that set?

As far as I can tell, the reader code is doing it (which was
surprising to me).

>> This is because the CDKAtomTypeMatcher finds a matching atom type and
>> the resulting IAtomType object has the exact mass field set to NULL.
>
> On the other hand, the original info might not be valid because of
> some edit option...

What do you mean by edit option? This actually relates to the general
problem that certain atom typing schemes provide certain property
values (exact mass) whereas others do not. But it seems that molecule
loading should not lead to molecule configuration (unless certain
properties are absolutely required).

>> More generally, what is the policy for configuring atoms?
>
> Set/reset all IAtomType fields...
>
>> Right now, we just overwrite any previous configuration. I think a
>> more correct approach would be for AtomTypeManipulator.configure to
>> check whether the fields of an atom are UNSET; if so, then do the
>> configuration. If the fields are not UNSET, don't do the
>> configuration.
>
> What if they are SET, but no longer valid?
>
> I guess we need a few common scenarios, and separate methods for
> those...

I think SET but invalid should not be achievable. If an operation
modifies the atom typing such that reperception is required, then it
should set the fields to UNSET (or even set a flag indicating atom
type perception to UNSET).

>> This implies that AtomTypeManipulator should have a method that will
>> 'clear' the configuration related to atom types by setting the
>> appropriate fields to UNSET.
>
> That's maybe a good idea... we could even already implement this as a
> new alternative method configureMissingFields()...

Actually, I don't think this would be required, since it would lead to
the scenario you mentioned above (what if a SET field needs to be
reperceived?). Rather, simply wiping out the configuration and
reperceiving it would be a more complete operation (though a little
longer).

Rajarshi Guha <rg...@in...>
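The "SET but invalid should not be achievable" rule, combined with wipe-and-reperceive, can be sketched as follows. This is a toy under stated assumptions, not CDK's matcher: ReperceiveSketch and its nested Atom are hypothetical, null models UNSET, the carbon-only neighbour-count rule exists purely for illustration, and "X" is just a placeholder for "no match".

```java
final class ReperceiveSketch {
    /** Hypothetical atom whose perceived type is reset by any type-relevant edit. */
    static final class Atom {
        private final String symbol;
        private int neighbourCount;
        private String atomTypeName; // null means unset, i.e. perception is required

        Atom(String symbol, int neighbourCount) {
            this.symbol = symbol;
            this.neighbourCount = neighbourCount;
        }

        /** An edit that can change the atom type also resets the type to unset. */
        void setNeighbourCount(int neighbourCount) {
            this.neighbourCount = neighbourCount;
            this.atomTypeName = null;
        }

        boolean needsPerception() {
            return atomTypeName == null;
        }

        String getAtomTypeName() {
            return atomTypeName;
        }
    }

    /** Toy perception rule for carbon only; returns "X" when nothing matches. */
    static String perceive(Atom atom) {
        if (!"C".equals(atom.symbol)) return "X";
        switch (atom.neighbourCount) {
            case 4: return "C.sp3";
            case 3: return "C.sp2";
            case 2: return "C.sp";
            default: return "X";
        }
    }

    /** Wipes the existing configuration, then perceives it again from scratch. */
    static void clearAndReperceive(Atom atom) {
        atom.atomTypeName = null;
        atom.atomTypeName = perceive(atom);
    }
}
```

Because the setter resets the type, a stale type can never be observed as SET: after an edit the atom reports needsPerception() until clearAndReperceive runs again. The cost is a full reperception even when only one field changed, which is the "a little longer" trade-off mentioned in the message.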