[Rdkit-discuss] SMILES Parse Error - '\n'
From: Imran S. <imr...@gm...> - 2011-10-24 20:29:59
Hi -
This is probably a trivial issue, but if you're creating a database using
the RDKit PostgreSQL (Pg) cartridge, it may save you some time.
I used the Wiki sample code to test ~400000 SMILES structures and to write
the valid ones to a file.
E.g.:
from rdkit import Chem
smi = smi.replace('=N#N','=[N+]=[N-]').replace('N#N=','[N-]=[N+]=')
mol = Chem.MolFromSmiles(smi)
Although there were errors like the following:
SMILES Parse Error: syntax error while parsing
SMILES Parse Error: unclosed ring for input: [H]/N=c/1
the function returned a Mol object, which I wrote to a tab-delimited file
for loading into Pg with the 'copy table' command (from psql).
The load failed with the following error:
=# ERROR: smiles 'Cc1cccc(c1)/N=N/C(=N' could not be parsed
It turns out there were '\n' (newline) characters in the SMILES strings
(which obviously shouldn't be there). A simple smi.replace('\n','') fixed
it in this case.
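
For anyone who wants to guard against this before running the load, here
is a minimal sketch of the clean-up step. It assumes each input field holds
only a SMILES string (no trailing name), and that the parser returns None
for a string it cannot parse, as Chem.MolFromSmiles does in current RDKit
releases. The names clean_smiles and write_valid_smiles, and the
(identifier, SMILES) record shape, are hypothetical and not from the Wiki
code.

```python
def clean_smiles(raw):
    """Drop every whitespace character, including embedded newlines and tabs."""
    return "".join(raw.split())


def write_valid_smiles(records, out, parse):
    """Write 'id<TAB>smiles' rows for records whose cleaned SMILES parses.

    records: iterable of (identifier, raw_smiles) pairs (assumed input shape)
    out:     writable text file object
    parse:   callable returning None on failure, e.g. Chem.MolFromSmiles
    Returns the identifiers that were skipped.
    """
    skipped = []
    for ident, raw in records:
        smi = clean_smiles(raw)
        # Skip empty strings explicitly; do not rely on the parser for them.
        if not smi or parse(smi) is None:
            skipped.append(ident)
            continue
        # Write the cleaned string, so no stray newline can split a row.
        out.write(f"{ident}\t{smi}\n")
    return skipped
```

With RDKit installed, passing Chem.MolFromSmiles as the parse argument and
an open text file as out should give a file with exactly one record per
line. Stripping all whitespace rather than only '\n' also catches stray
tabs, which would otherwise shift columns in a tab-delimited load.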
Hope this helps.
Cheers,
Imran