Re: [Rdkit-discuss] MACCS SMARTS pattern definitions
From: Greg L. <gre...@gm...> - 2011-05-27 03:19:51
Hi Andrew,
Second part of my response.
On Thu, May 26, 2011 at 4:02 PM, Andrew Dalke <da...@da...> wrote:
>
> * Bit 2 is
>
> #2:('[#103,#104,#105,#106,#107,#106,#109,#110,#111,#112]',0), # ISOTOPE Not complete
> 2:('[#103,#104]',0), # ISOTOPE Not complete
>
> I assume the comment is wrong, since this has nothing to do with isotopes.
>
> What's not complete about this definition, and/or why is the first one commented out?
You're right, the comment is wrong. The definition is also not
correct: the key should be atomic num > 103.
The reason the more complete defn is commented out is that the RDKit
periodic table data only go up to #104. I added a comment to that
effect.
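A rough sketch of how an "atomic num > 103" pattern could be generated instead of typed by hand, which would also avoid slips like the second #106 in the commented-out line (presumably meant to be #108). The helper name `atomic_num_range_smarts` and the upper bound of 112 are assumptions for illustration; this is not the checked-in definition:

```python
def atomic_num_range_smarts(first, last):
    """Return a SMARTS atom list matching atomic numbers first..last inclusive."""
    if first > last:
        raise ValueError("first must not exceed last")
    return "[" + ",".join(f"#{z}" for z in range(first, last + 1)) + "]"


# Everything heavier than #103, up to an assumed cap of #112
full_pattern = atomic_num_range_smarts(104, 112)

# What a periodic table that stops at #104 can actually support
limited_pattern = atomic_num_range_smarts(104, 104)

print(full_pattern)
print(limited_pattern)
```

Generating the list from a range keeps the pattern contiguous, so it is easy to extend if the periodic table data ever grow past #104.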
> * "*NOTE* spec wrong" occurs on many lines
>
> What does it mean?
I'm afraid that's lost in the sands of time. I will remove them.
>
> * Bit 3 is
>
> 3:('[Ge,As,Se,Sn,Sb,Te,Tl,Pb,Bi]',0), # Group IVa,Va,VIa Periods 4-6 (Ge...) *NOTE* spec wrong
>
> The "Tl" doesn't look right. Shouldn't the last three be Pb,Bi,Po ?
Yep.
> * Bit 18 is
>
> 18:('[B,Al,Ga,In,Tl]',0), # Group IIIA (B...) *NOTE* spec wrong
>
> Boron may be aromatic according to the SMILES spec, so this
> should be [B,b, ...] or [#5, ... ].
Fixed this.
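One way to sidestep the aromatic/aliphatic question for bits 3 and 18 is to write the patterns with atomic numbers, since `[#5]` in SMARTS matches boron whether or not it is flagged aromatic. The sketch below is only an illustration of that idea: the `ATOMIC_NUM` table and the `symbols_to_smarts` helper are hypothetical names, and the output is not necessarily the exact form that was checked in:

```python
# Atomic numbers for just the elements named in bits 3 and 18
ATOMIC_NUM = {
    "B": 5, "Al": 13, "Ga": 31, "In": 49, "Tl": 81,
    "Ge": 32, "As": 33, "Se": 34,
    "Sn": 50, "Sb": 51, "Te": 52,
    "Pb": 82, "Bi": 83, "Po": 84,
}


def symbols_to_smarts(symbols):
    """Build a SMARTS atom list that matches on atomic number only."""
    return "[" + ",".join(f"#{ATOMIC_NUM[s]}" for s in symbols) + "]"


# Bit 3 with the last three elements corrected to Pb, Bi, Po
bit3 = symbols_to_smarts(["Ge", "As", "Se", "Sn", "Sb", "Te", "Pb", "Bi", "Po"])

# Bit 18, Group IIIA
bit18 = symbols_to_smarts(["B", "Al", "Ga", "In", "Tl"])

print(bit3)
print(bit18)
```

The same trick would cover aromatic forms of the bit 3 elements in toolkits that accept them.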
> * Bit 44 is
>
> 44:('?',0), # OTHER
>
> Is this one of the undocumented bits or does "OTHER" mean
> something else?
It's undocumented.
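For anyone consuming the table, a minimal sketch of how the '?' placeholder could be separated from real patterns before any SMARTS is compiled. The two entries are copied from the lines quoted in this thread; the `usable_bits` helper and the idea of treating '?' as "skip this bit" are assumptions for illustration, not a description of what the fingerprinting code does:

```python
# Excerpt in the same (pattern, count) layout as the key table;
# both entries are taken from the lines quoted in this thread
smarts_patts = {
    2: ("[#103,#104]", 0),
    44: ("?", 0),  # OTHER: undocumented, no usable pattern
}


def usable_bits(patts):
    """Split bit numbers into those with a pattern and those marked '?'."""
    defined = sorted(b for b, (patt, _count) in patts.items() if patt != "?")
    undefined = sorted(b for b, (patt, _count) in patts.items() if patt == "?")
    return defined, undefined


defined, undefined = usable_bits(smarts_patts)
print(defined, undefined)
```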
>
> * Bit 68 says
>
> FIX: incomplete definition
>
> Are there thoughts to complete this?
This is one where the spec is incomplete: it includes the amazingly
helpful (&...) at the end.
> My thought is that it isn't
> important one way or the other. Without a good validation set
> it would be hard to really pin this down.
Agreed.
>
> There are a number of other bits which are also marked "FIX:
> incomplete definition". Are they going to be fixed? Again, I
> don't think there's a pressing need without validation data.
Those also have (&...). I've updated the comment to make clear that
it's due to an incomplete spec.
I just checked in a set of changes reflecting the above.
-greg