Hi all,

I managed to dig up a message that seems to back up the reason for the current behaviour (see below). We could keep the D<n> behaviour as it is, but I needed the ‘*’ for matching MMFF forcefield types. Since this was dealing with a hydrogen unsuppressed graph (i.e. 3D model) I needed the ability to match heavy atoms next to any atom (including hydrogen).

To me, it makes more sense that ‘*’ doesn’t match explicit hydrogens when i’ve preprocessed the structure and removed them (not automatic) rather than automatically exclude hydrogens.


> 2. The Daylight SMARTS theory page says that the D<n> token
> represents <n> explicit connections. When counting these connections,
> should explicit H's be included? The Daylight SMARTS test page
> suggests no:

That is correct.  Internally, a hydrogen is a hydrogen, and it doesn't matter whether it's represented as an atom or an H-count.  The D<n> spec is about non-hydrogen atoms, not about whether the toolkit coders elect to represent hydrogens explicitely.

> Matching [OD1] against CO and CO[H] both give 1 hit (explicit H
> matching turned off)
> 3. The Daylight spec for * indicates it matches any atom. But the
> Daylight depict service suggests that it will not match against H
> (unless explicit H matching is selected).

This last part ("unless explicit H matching is selected") was only for backwards compatibility, as I recall.  The Daylight Toolkit originally treated implicit/explicit H differently, but that was a disaster -- the idea that a SMARTS had a different meaning depending on how the molecule was represented internally was a brain-dead idea, and was quickly corrected in the early versions of the Daylight Toolkit.  The explicit-H matching was put in just so that old applications wouldn't break that relied on the old behavior.


On 15 Jul 2014, at 13:49, Rajarshi Guha <guhar@mail.nih.gov> wrote:

The patch is correct - I misread the daylight spec

On Tue, Jul 15, 2014 at 7:41 AM, Egon Willighagen <egon.willighagen@gmail.com> wrote:

could you please have a look at this patch:


It reverts code specifically added by you, but I cannot find enough
context to decide whether you code should be reverted or not...



E.L. Willighagen
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Homepage: http://egonw.github.com/
LinkedIn: http://se.linkedin.com/in/egonw
Blog: http://chem-bla-ics.blogspot.com/
PubList: http://www.citeulike.org/user/egonw/tag/papers
ORCID: 0000-0001-7542-0286

Rajarshi Guha | http://blog.rguha.net
National Center for Advancing Translational Science
Want fast and easy access to all the code in your enterprise? Index and
search up to 200,000 lines of code with a free copy of Black Duck
Code Sight - the same software that powers the world's largest code
search on Ohloh, the Black Duck Open Hub! Try it now.
Cdk-devel mailing list