From: John M. <jo...@eb...> - 2013-10-28 10:23:37
Hi all,

I've just been fixing up some of the SMARTS code and noticed that the SMARTS '[D<number>]' currently matches the number of connected non-hydrogen atoms. However, it should match the number of connected explicit atoms. This behaviour was introduced by the fix for this bug report: https://sourceforge.net/p/cdk/bugs/824/ - 'fixed by making sure that both explicit and implicit H counts are deducted from the number of connections of the target atom.'

The reason it didn't match in DEPICTMATCH is that hydrogens are suppressed by default. If you try '[D3]' against 'CC=CC' with 'Enable explicit-H SMARTS' you'll see that it does count the hydrogens as well. Although it might be nice to make 'D<number>' do this, I think the correct solution is to suppress hydrogens before matching? This also serves to normalise molecules and removes the differences between '[H<number>]' and '[h<number>]'.

Also, the valence was previously being set to the number of valence electrons for each element (adjusted for charge). This doesn't seem useful, nor does it follow the SMARTS documentation, so it has been changed: '[v3]' now matches 'CN(O)C' but not 'C=N(=O)C', and '[v5]' matches 'C=N(=O)C' but not 'CN(O)C'.

Thoughts?

Cheers,
J
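
For anyone who wants to check the numbers above by hand, here is a minimal sketch in plain Python. It is not CDK code: the `Mol` class and the `degree`/`valence` helpers are made up for illustration, and it assumes 'D' counts explicit connections while 'v' sums bond orders with each implicit hydrogen counting as one.

```python
from dataclasses import dataclass, field


@dataclass
class Mol:
    """Toy molecule (hypothetical, not a CDK type)."""
    elements: list                              # element symbol per atom
    implicit_h: list                            # implicit H count per atom
    bonds: list = field(default_factory=list)   # (atom_i, atom_j, bond_order)


def degree(mol, idx):
    """SMARTS 'D' as assumed here: number of explicit connections."""
    return sum(1 for i, j, _ in mol.bonds if idx in (i, j))


def valence(mol, idx):
    """SMARTS 'v' as assumed here: bond-order sum plus implicit H count."""
    explicit = sum(order for i, j, order in mol.bonds if idx in (i, j))
    return explicit + mol.implicit_h[idx]


# CN(O)C: the nitrogen (index 1) has three single bonds.
amine = Mol(["C", "N", "O", "C"], [3, 0, 1, 3],
            [(0, 1, 1), (1, 2, 1), (1, 3, 1)])

# C=N(=O)C: the nitrogen (index 1) has two double bonds and one single bond.
pentavalent = Mol(["C", "N", "O", "C"], [2, 0, 0, 3],
                  [(0, 1, 2), (1, 2, 2), (1, 3, 1)])

# CC=CC with hydrogens suppressed (all hydrogens implicit).
butene = Mol(["C", "C", "C", "C"], [3, 1, 1, 3],
             [(0, 1, 1), (1, 2, 2), (2, 3, 1)])

# The same molecule with the hydrogen on atom 1 made explicit (atom 4).
butene_explicit_h = Mol(["C", "C", "C", "C", "H"], [3, 0, 1, 3, 0],
                        [(0, 1, 1), (1, 2, 2), (2, 3, 1), (1, 4, 1)])

print(valence(amine, 1), valence(pentavalent, 1))
print(degree(butene, 1), degree(butene_explicit_h, 1))
```

In this toy model the degree of atom 1 in 'CC=CC' changes depending on whether its hydrogen is explicit, while the valence does not, which is the normalisation argument for suppressing hydrogens before matching.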