
#898 Round Trip SMILES changes double bonds to single bonds

Milestone: 2.3.x
Status: pending
Owner: nobody
Labels: None
Priority: 5
Updated: 2014-03-15
Created: 2013-10-09
Private: No

I am using Open Babel 2.3.2 on a 32-bit Windows 7 machine.

I found this discrepancy on the command line, and it also occurs in the Open Babel GUI.

I started with the following SMILES, obtained from the NIH CACTUS server:
[P]2(OC1=C(C(C)(C)C)C=C(C(C)(C)C)C=C1C3=C(O2)C(=CC(=C3)C(C)(C)C)C(C)(C)C)OCCN(CCO[P]5OC4=C(C(C)(C)C)C=C(C(C)(C)C)C=C4C6=C(O5)C(=CC(=C6)C(C)(C)C)C(C)(C)C)CCO[P]8OC7=C(C(C)(C)C)C=C(C(C)(C)C)C=C7C9=C(O8)C(=CC(=C9)C(C)(C)C)C(C)(C)C

If I convert this starting SMILES to SMILES, I get the following result:
p1(oc2c(C(C)(C)C)cc(C(C)(C)C)cc2c2c(o1)c(cc(c2)C(C)(C)C)C(C)(C)C)OCCN(CCOp1oc2c(C(C)(C)C)cc(C(C)(C)C)cc2c2c(o1)c(cc(c2)C(C)(C)C)C(C)(C)C)CCOp1oc2c(C(C)(C)C)cc(C(C)(C)C)cc2c2c(o1)c(cc(c2)C(C)(C)C)C(C)(C)C

The original SMILES gives the following formula with the -oreport option:
FORMULA: C90H132NO9P3
MASS: 1464.9337
EXACT MASS: 1463.9114947

Next, I take the resulting SMILES, copied and pasted exactly as listed above, and use it as the input:
p1(oc2c(C(C)(C)C)cc(C(C)(C)C)cc2c2c(o1)c(cc(c2)C(C)(C)C)C(C)(C)C)OCCN(CCOp1oc2c(C(C)(C)C)cc(C(C)(C)C)cc2c2c(o1)c(cc(c2)C(C)(C)C)C(C)(C)C)CCOp1oc2c(C(C)(C)C)cc(C(C)(C)C)cc2c2c(o1)c(cc(c2)C(C)(C)C)C(C)(C)C

When I convert it with -oreport and view the graphical result, I see that all of the double bonds have been converted to single bonds:
FORMULA: C90H168NO9P3
MASS: 1501.2195
EXACT MASS: 1500.1931959

I am sure that electronically hydrogenating a molecule is not the intended behavior. Please have a look at this when time allows.

Tim J
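The two reported formulas can be compared directly to quantify the change. The following is a minimal Python sketch, not part of the original report; `parse_formula` is a hypothetical helper that only handles flat formulas such as the ones printed above. It shows that the heavy atoms are unchanged and that the round-tripped molecule gained 36 hydrogens, which matches saturating 18 double bonds (three in each of the six benzene rings of the original SMILES).

```python
import re

def parse_formula(formula):
    """Hypothetical helper: element -> count for a flat formula like C90H132NO9P3."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts

# Formulas copied from the two reports in this ticket.
original = parse_formula("C90H132NO9P3")
round_trip = parse_formula("C90H168NO9P3")

# Each double bond that becomes a single bond adds two implicit hydrogens.
extra_h = round_trip["H"] - original["H"]
lost_double_bonds = extra_h // 2

# Everything except hydrogen should be identical between the two formulas.
heavy_atoms_match = all(
    original[element] == round_trip[element]
    for element in original
    if element != "H"
)

print(extra_h, lost_double_bonds, heavy_atoms_match)
```

The reported mass difference (1501.2195 - 1464.9337 = 36.2858) is likewise about 36 times the atomic mass of hydrogen.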

Discussion

  • Timothy Janota - 2013-10-09

    Additional information.

    On further investigation, the root problem seems to be that the phosphorus has been converted to aromatic (lowercase p). If I manually change the first aromatic phosphorus (p) to non-aromatic phosphorus (P), the double bonds are found. For example, the SMILES:
    P1(oc2c(C(C)(C)C)cc(C(C)(C)C)cc2c2c(o1)c(cc(c2)C(C)(C)C)C(C)(C)C)OCCN(CCOP1oc2c(C(C)(C)C)cc(C(C)(C)C)cc2c2c(o1)c(cc(c2)C(C)(C)C)C(C)(C)C)CCOP1oc2c(C(C)(C)C)cc(C(C)(C)C)cc2c2c(o1)c(cc(c2)C(C)(C)C)C(C)(C)C
    DOES give the expected formula and aromatic rings:
    FORMULA: C90H132NO9P3
    MASS: 1464.9337
    EXACT MASS: 1463.9114947

     
  • Geoff Hutchison - 2014-03-15

    This should now be fixed in the development "master". If you need a source-code patch or a binary, please let me know.

     
  • Geoff Hutchison - 2014-03-15
    • status: open --> pending
    • Group: 2.0_alpha/beta --> 2.3.x
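For anyone still on 2.3.2, the manual edit described in the first comment (writing the aromatic phosphorus as uppercase P) could be automated along the following lines. This is only a sketch of that workaround, not the fix applied in master: `dearomatize_bare_phosphorus` is a hypothetical helper name, the example string is a shortened fragment made up for illustration, and the function deliberately leaves bracket atoms such as [pH] untouched.

```python
import re

def dearomatize_bare_phosphorus(smiles):
    """Hypothetical helper: rewrite aromatic 'p' as 'P' outside bracket atoms.

    Outside brackets, a lowercase 'p' in SMILES can only be aromatic
    phosphorus, so a plain replacement is safe there. Bracket atoms are
    copied through unchanged.
    """
    # The capturing group keeps the bracket atoms in the split result.
    parts = re.split(r"(\[[^\]]*\])", smiles)
    return "".join(
        part if part.startswith("[") else part.replace("p", "P")
        for part in parts
    )

# Shortened, made-up fragment modeled on the ring system in this ticket.
example = "CCOp1oc2ccccc2c2ccccc2o1"
print(dearomatize_bare_phosphorus(example))
```

The rewritten string can then be passed to Open Babel in place of the round-tripped SMILES. Whether this is chemically appropriate for other molecules is not something this ticket establishes.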