Menu

#625 chiral smiles may cause memory corruption

2.2.x
closed
5
2012-10-23
2010-08-10
No

This problem arose with OpenBabel 2.2.3 on Ubuntu 10.04. When presented with a chiral SMILES whose chiral center is bonded to more than four neighbors, src/formats/smilesformat.cpp may overrun memory. A patch is attached.

I'm not a chemist, so please pardon my lack of clarity :)

I encountered the problem while using babel to convert a proprietary SMILES in which a chiral center was bonded to more than four other atoms. Here is a non-proprietary sample SMILES which helps demonstrate the problem:
[N@@]1234([C@@]9([C@H]1[C@H]3[C@@H]4[C@@H]29))

If this is saved to a file named problem.smi, and if babel is run under valgrind to convert the smiles to another format, valgrind will detect the memory overrun:
$ valgrind babel -i smi problem.smi -o sdf problem.sdf
(See below for full valgrind output.)

The attached patch for src/formats/smilesformat.cpp addresses the problem by resizing (ChiralSearch->second)->refs as needed.

{{{
$ valgrind babel -i smi problem.smi -o sdf problem.sdf
==4287== Memcheck, a memory error detector
==4287== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
==4287== Using Valgrind-3.6.0.SVN-Debian and LibVEX; rerun with -h for copyright info
==4287== Command: babel -i smi problem.smi -o sdf problem.sdf
==4287==
==4287== Invalid write of size 4
==4287== at 0xC061A37: OpenBabel::OBSmilesParser::ParseComplex(OpenBabel::OBMol&) (smilesformat.cpp:2398)
==4287== by 0xC069DFA: OpenBabel::OBSmilesParser::ParseSmiles(OpenBabel::OBMol&) (smilesformat.cpp:988)
==4287== by 0xC06A3AB: OpenBabel::OBSmilesParser::SmiToMol(OpenBabel::OBMol&, std::string const&) (smilesformat.cpp:938)
==4287== by 0xC06AD27: OpenBabel::SMIBaseFormat::ReadMolecule(OpenBabel::OBBase, OpenBabel::OBConversion) (smilesformat.cpp:890)
==4287== by 0x4F0C864: OpenBabel::OBMoleculeFormat::ReadChemObjectImpl(OpenBabel::OBConversion, OpenBabel::OBFormat) (obmolecformat.cpp:93)
==4287== by 0x4EF5E67: OpenBabel::OBConversion::Convert() (obconversion.cpp:481)
==4287== by 0x4EF650E: OpenBabel::OBConversion::Convert(std::istream, std::ostream) (obconversion.cpp:420)
==4287== by 0x4EFBDB8: OpenBabel::OBConversion::FullConvert(std::vector<std::string, std::allocator<std::string=""> >&, std::string&, std::vector<std::string, std::allocator<std::string=""> >&) (obconversion.cpp:1321)
==4287== by 0x4040DE: main (babel.cpp:340)
==4287== Address 0x6250060 is 0 bytes after a block of size 16 alloc'd
==4287== at 0x4C28CC1: operator new(unsigned long) (vg_replace_malloc.c:261)
==4287== by 0x4E89E36: std::vector<unsigned int,="" std::allocator<unsigned="" int=""> >::operator=(std::vector<unsigned int,="" std::allocator<unsigned="" int=""> > const&) (new_allocator.h:89)
==4287== by 0xC06047E: OpenBabel::OBSmilesParser::ParseComplex(OpenBabel::OBMol&) (smilesformat.cpp:2276)
==4287== by 0xC069DFA: OpenBabel::OBSmilesParser::ParseSmiles(OpenBabel::OBMol&) (smilesformat.cpp:988)
==4287== by 0xC06A3AB: OpenBabel::OBSmilesParser::SmiToMol(OpenBabel::OBMol&, std::string const&) (smilesformat.cpp:938)
==4287== by 0xC06AD27: OpenBabel::SMIBaseFormat::ReadMolecule(OpenBabel::OBBase, OpenBabel::OBConversion) (smilesformat.cpp:890)
==4287== by 0x4F0C864: OpenBabel::OBMoleculeFormat::ReadChemObjectImpl(OpenBabel::OBConversion, OpenBabel::OBFormat) (obmolecformat.cpp:93)
==4287== by 0x4EF5E67: OpenBabel::OBConversion::Convert() (obconversion.cpp:481)
==4287== by 0x4EF650E: OpenBabel::OBConversion::Convert(std::istream, std::ostream) (obconversion.cpp:420)
==4287== by 0x4EFBDB8: OpenBabel::OBConversion::FullConvert(std::vector<std::string, std::allocator<std::string=""> >&, std::string&, std::vector<std::string, std::allocator<std::string=""> >&) (obconversion.cpp:1321)
==4287== by 0x4040DE: main (babel.cpp:340)
==4287==
}}}

Discussion

  • Mitch Chapman

    Mitch Chapman - 2010-08-10

    Patch for src/formats/smilesformat.cpp

     
  • Noel O'Boyle

    Noel O'Boyle - 2010-08-10

    Thanks for reporting the problem, but the patch causes Refs to be greater than 4. OpenBabel currently does not support chirality involving more than 4 stereo atoms (nor indeed does the SMILES spec - this is one of the limitations). We can probably only handle this by ignoring the stereochemistry at that atom. (Is this structure real?)

     
  • Mitch Chapman

    Mitch Chapman - 2010-08-11

    Thanks for the quick response.

    I failed to mention that we discovered the memory write error after OBConversion segfaulted on the proprietary SMILES. So ignoring stereochemistry would definitely be helpful. Would this also be accompanied by a warning or error message?

    I'm told the proprietary structure is real. The sample SMILES is definitely not :)

    In case it's useful -- admittedly unlikely, since I can't provide the proprietary structure: the original structure was delivered in an OpenBabel-generated SD file. The SMILES which caused the memory error was generated from the SD representation, using babel.

     
  • Noel O'Boyle

    Noel O'Boyle - 2010-08-16

    I just checked the current development code. It doesn't segfault - instead you get:

    >>> conv.ReadString(mol, "[N@@]1234([C@@]9([C@H]1[C@H]3[C@@H]4[C@@H]29))")

    *** Open Babel Error in OpenBabel::OBSmilesParser::InsertTetrahedralRef
    Error: Overwriting previously set reference id.

    Since it doesn't segfault, I'm marking this bug as closed.