From: Craig A. J. <cj...@em...> - 2006-11-29 01:57:54
I have a "philosophical" question about symmetry and the meaning of the valence model in Open Babel, and a practical consequence. Consider this:

CC(=O)[O-].[Na+]

In nature, the two oxygens are symmetrically equivalent. But the valence model of chemistry has no way to represent half charges, so Open Babel represents this as an asymmetrical molecule.

However, in OB's symmetry analysis, OBMol::GetGIDVector(), OB considers the two oxygen atoms of this molecule to be symmetrically equivalent. It uses the valence, rather than the bonds and charge, to determine the graph invariants, and then never considers the actual bonds or charge when determining symmetry. The result is two atoms that are declared "identical" when they plainly are not.

I believe Open Babel's current behavior is wrong. Since OB's internal representation is asymmetrical, it should be consistent, and GetGIDVector() should put the two oxygens in different symmetry classes.

There is a disastrous practical consequence of this "philosophical" debate. Many algorithms "walk the graph" of a molecule to find features, and use a symmetry analysis to cut down on redundant traversal. Imagine, for example, a fingerprinting algorithm that enumerates short paths. It walks down the C-C=O path and adds it to the fingerprint. Then it looks at the other oxygen and says, "hey, that's identical to the one I just looked at, so I'll skip it!" The next time this substructure shows up, it might just happen that the fingerprinter walks down the "C-C-[O-]" path first and skips the "C-C=O" path. You have two identical functional groups, but two different fingerprints.

It's my opinion that OB should be consistent, even though it doesn't match nature. If OB represents a functional group asymmetrically, then GetGIDVector() should recognize the asymmetry.
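A toy sketch in Python may make the fingerprinting failure concrete. It does not call Open Babel at all: the atom table, `coarse_invariant`, `fine_invariant`, and `fingerprint` are hypothetical stand-ins written for this example. The coarse invariant (element plus connection count) is only an approximation of the valence-based invariant described above, not OB's actual formula, and symmetry classes are taken from a single round of invariant comparison with no iterative refinement. Sodium and hydrogens are left out.

```python
# Toy model of acetate, heavy atoms only. Not Open Babel code.
# Atoms are (element, formal_charge); bonds are (i, j, order).

def make_acetate(oxygen_order):
    """Build acetate; oxygen_order decides which oxygen is listed first."""
    atoms = [("C", 0), ("C", 0)]          # methyl C, carboxyl C
    bonds = [(0, 1, 1)]
    for kind in oxygen_order:
        idx = len(atoms)
        if kind == "carbonyl":
            atoms.append(("O", 0))
            bonds.append((1, idx, 2))     # C=O
        else:
            atoms.append(("O", -1))
            bonds.append((1, idx, 1))     # C-[O-]
    return atoms, bonds

def neighbors(bonds, i):
    """Return (neighbor, bond_order) pairs in bond-list order."""
    out = []
    for a, b, order in bonds:
        if a == i:
            out.append((b, order))
        elif b == i:
            out.append((a, order))
    return out

def coarse_invariant(atoms, bonds, i):
    # Assumption: element + connection count, ignoring bond order and charge.
    return (atoms[i][0], len(neighbors(bonds, i)))

def fine_invariant(atoms, bonds, i):
    # Also looks at formal charge and the orders of the attached bonds.
    element, charge = atoms[i]
    orders = tuple(sorted(order for _, order in neighbors(bonds, i)))
    return (element, charge, orders)

def atom_label(atom):
    element, charge = atom
    if charge == 0:
        return element
    return "[" + element + ("+" if charge > 0 else "-") + "]"

BOND_SYMBOL = {1: "-", 2: "="}

def fingerprint(atoms, bonds, invariant, start=0, length=3):
    """Collect path strings of `length` atoms from `start`, skipping any
    neighbor whose invariant equals one already expanded at that branch."""
    found = set()

    def walk(path, text):
        if len(path) == length:
            found.add(text)
            return
        seen = set()
        for nbr, order in neighbors(bonds, path[-1]):
            if nbr in path:
                continue
            key = invariant(atoms, bonds, nbr)
            if key in seen:
                continue                  # "identical to the one I just looked at"
            seen.add(key)
            walk(path + [nbr], text + BOND_SYMBOL[order] + atom_label(atoms[nbr]))

    walk([start], atom_label(atoms[start]))
    return found

# The same functional group, with the oxygens stored in either order.
carbonyl_first = make_acetate(["carbonyl", "anion"])
anion_first = make_acetate(["anion", "carbonyl"])

for name, inv in [("coarse", coarse_invariant), ("fine", fine_invariant)]:
    fp_a = fingerprint(*carbonyl_first, inv)
    fp_b = fingerprint(*anion_first, inv)
    print(name, sorted(fp_a), sorted(fp_b), "match" if fp_a == fp_b else "MISMATCH")
```

With the coarse invariant the two atom orderings yield different path sets, because whichever oxygen is stored first hides the other. With the finer invariant both paths are recorded regardless of ordering.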
If someone has a modeling problem that needs to know that the two oxygens are equivalent, then they need to develop a more sophisticated valence representation for their own needs.

I need to fix this problem for my own needs, but a full community discussion is needed before OB is changed. For now I'm just going to cut-and-paste the symmetry analysis code into my own application and fix it so that it works the way I thought it would/should.

Craig