[Rdkit-discuss] Rf not supported in SMILES input
From: Andrew D. <da...@da...> - 2012-04-28 11:07:37
Minor bug report. RDKit's periodic table knows about "Rf", but the SMILES and SMARTS parsers do not.
Here's a reproducible example:
from rdkit import Chem
Chem.MolFromSmiles("[Rf]")
Or, to scan every element:
from rdkit import Chem
get_symbol = Chem.GetPeriodicTable().GetElementSymbol
# Is there a way to get the max supported element number
# in RDKit's periodic table?
for i in range(1, 105):
    symbol = get_symbol(i)
    print(i)
    # Parse the element as a bracket atom, e.g. "[Rf]"
    Chem.MolFromSmiles("[%s]" % symbol)
The fix should be just a few changes to the SMILES and SMARTS lexers. Adding 'Rf' to SMARTS should be fine since 'f' is not a legal SMARTS pattern, so while 'R' means 'ring atom', 'Rf' cannot otherwise occur in valid SMARTS.
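To illustrate the argument above, here is a minimal sketch of a longest-match tokenizer for the inside of a bracket atom. It is a toy and not RDKit's actual lexer: the `tokenize` function, the short element list, and the token names are all hypothetical. It only shows that once 'Rf' is in the element table it is read as an element before the ring primitive 'R' is tried, and that without it the stray 'f' is rejected.

```python
import re


def tokenize(body, elements):
    """Split a bracket-atom body into (kind, text) tokens.

    Toy model only: `elements` is a non-empty list of element symbols,
    and 'R' with an optional count stands in for the SMARTS ring primitive.
    """
    # Try longer symbols first so "Rf" wins over a bare "R".
    alternation = "|".join(sorted(elements, key=len, reverse=True))
    pattern = re.compile(r"(?P<element>%s)|(?P<ring>R\d*)" % alternation)

    tokens = []
    pos = 0
    while pos < len(body):
        m = pattern.match(body, pos)
        if m is None or m.end() == pos:
            raise ValueError("unexpected %r at position %d" % (body[pos], pos))
        tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


with_rf = ["Ru", "Rh", "Rf"]
without_rf = ["Ru", "Rh"]

print(tokenize("Rf", with_rf))   # [('element', 'Rf')]
print(tokenize("R2", with_rf))   # [('ring', 'R2')]

try:
    tokenize("Rf", without_rf)
except ValueError as err:
    # 'R' is consumed as the ring primitive, then 'f' has no meaning.
    print("rejected:", err)
```

Since "Rf" is already an error without the new symbol, adding it cannot change the meaning of any currently valid pattern in this toy model.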
Also, Rf is already supported in SMILES output, so nothing needs to change there:
>>> from rdkit import Chem
>>> mol = Chem.MolFromSmiles("*")
>>> atom = list(mol.GetAtoms())[0]
>>> atom.SetAtomicNum(104)
>>> Chem.MolToSmiles(mol)
'[RfH6]'
>>> atom.GetSmarts()
'[RfH]'
>>>
BTW, I am confused about what GetSmarts() does here. It says "Rf atom with at least one hydrogen", which would match the given [RfH6]. I expected either the generic SMARTS '[Rf]' or the specific SMARTS '[RfH6]', not something in between.
Why does GetSmarts() return what it does? More generally, what is the intended use of that function?
Andrew
da...@da...