
#68 Plugin for additional SMILES: Choose type of SMILES

Milestone: any future version
Status: open
Owner: nobody
Labels: None
Depends On: FR 31, BR 220
Priority: 5
Updated: 2016-09-13
Created: 2015-09-21
Private: No

The plugin that calculates additional SMILES for molecules or scaffolds could let the user choose which kind of SMILES it should generate. CDK 1.5 offers four different kinds of SMILES (canonical or non-canonical, each with or without chirality). We would then no longer have to worry about the consistency of SMILES types, because the choice is made by the user.
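
A minimal sketch of how the four combinations could be modelled as a single user-facing choice. The enum `SmilesType` and its method `of` are hypothetical and not part of Scaffold Hunter or CDK. The constant names are borrowed from what should be the CDK 1.5 `SmilesGenerator` factory methods (`generic()`, `unique()`, `isomeric()`, `absolute()`); this mapping is an assumption and should be checked against the CDK version in use.

```java
/** Hypothetical enum: one constant per canonical/chiral combination. */
enum SmilesType {
    GENERIC(false, false),   // non-canonical, no stereo information
    UNIQUE(true, false),     // canonical, no stereo information
    ISOMERIC(false, true),   // non-canonical, with stereo information
    ABSOLUTE(true, true);    // canonical, with stereo information

    private final boolean canonical;
    private final boolean chiral;

    SmilesType(boolean canonical, boolean chiral) {
        this.canonical = canonical;
        this.chiral = chiral;
    }

    boolean isCanonical() { return canonical; }

    boolean isChiral() { return chiral; }

    /** Maps two independent user choices onto one of the four kinds. */
    static SmilesType of(boolean canonical, boolean chiral) {
        for (SmilesType type : values()) {
            if (type.canonical == canonical && type.chiral == chiral) {
                return type;
            }
        }
        // Not reachable: the four constants cover every combination.
        throw new AssertionError("all four combinations are covered");
    }
}
```

Two checkboxes in the settings panel (canonical, chiral) would then be enough to select one type.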

Discussion

  • Sven Schrinner

    Sven Schrinner - 2016-02-27
    • status: open --> in-progress
    • assigned_to: Sven Schrinner
    • Related To: -->
     
  • Sven Schrinner

    Sven Schrinner - 2016-03-16

    The feature is now partly usable with commit [1b7f2b]. Two things are not yet satisfactory:

    1. There is no way to detect whether the dataset contains chiral information. In SH 2.6.0, chiral information can be either used or ignored when a new dataset is created. If it is ignored, it makes no sense to offer isomeric SMILES generation.
    2. The generated SMILES currently contain no isomeric information. It seems that this information is lost once the molecule string has been saved in the database. When the molecule is recreated from this string, it cannot contain this information.
     

    Related

    Commit: [1b7f2b]

  • Sven Schrinner

    Sven Schrinner - 2016-03-16

    Regarding 2:

    An AtomContainer that is converted into a string by MDLV2000Writer and then converted back into an AtomContainer by MDLReader loses some of its properties. The property "stereoElements" originally had 1 element, but after the conversions it is empty.

    Maybe someone can comment on this comparison between the original file and the intermediate conversion string. The only real difference is the "1" in the third line of the original.

    Original:

     MOE2013           2D
    
     24 26  0  0  1  0  0  0  0  0999 V2000
       -2.4660   -0.7620    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
       -1.6410   -0.7610    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
       -1.2280   -1.4760    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
       -0.4030   -1.4750    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
       -0.1060   -0.0780    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
       -1.6400   -2.1900    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
       -2.4650   -2.1910    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
       -2.8770   -2.9060    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
       -3.7020   -2.9060    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
        0.1490   -0.8620    0.0000 S   0  0  1  0  0  0  0  0  0  0  0  0
        1.0340   -0.8600    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
        1.5260   -1.5230    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
        1.5120   -0.1880    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
        2.2990   -0.4350    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
        2.3080   -1.2600    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
        3.0100   -0.0150    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
        3.0260   -1.6650    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
        3.7280   -0.4200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
        3.7370   -1.2450    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
        4.4750   -1.6120    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
        5.2570   -1.3490    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
       -1.2270   -2.9050    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
       -2.8780   -1.4760    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
       -3.7000   -1.4140    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
      1  2  2  0  0  0  0
      1 23  1  0  0  0  0
      2  3  1  0  0  0  0
      3  4  1  0  0  0  0
      3  6  2  0  0  0  0
      5 10  2  0  0  0  0
      6  7  1  0  0  0  0
      6 22  1  0  0  0  0
      7  8  1  0  0  0  0
      7 23  2  0  0  0  0
      8  9  1  0  0  0  0
     10  4  1  1  0  0  0
     10 11  1  0  0  0  0
     11 12  2  0  0  0  0
     11 13  1  0  0  0  0
     12 15  1  0  0  0  0
     13 14  1  0  0  0  0
     14 15  2  0  0  0  0
     14 16  1  0  0  0  0
     15 17  1  0  0  0  0
     16 18  2  0  0  0  0
     17 19  2  0  0  0  0
     18 19  1  0  0  0  0
     19 20  1  0  0  0  0
     20 21  1  0  0  0  0
     23 24  1  0  0  0  0
    M  END
    >  <GENERIC_NAME>
    Esomeprazole
    
    >  <MOLECULAR_WEIGHT>
    345.41599
    
    $$$$
    

    Converted:

      CDK     0316161414
    
     24 26  0  0  0  0  0  0  0  0999 V2000
       -2.4660   -0.7620    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
       -1.6410   -0.7610    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
       -1.2280   -1.4760    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
       -0.4030   -1.4750    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
       -0.1060   -0.0780    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
       -1.6400   -2.1900    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
       -2.4650   -2.1910    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
       -2.8770   -2.9060    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
       -3.7020   -2.9060    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
        0.1490   -0.8620    0.0000 S   0  0  1  0  0  0  0  0  0  0  0  0
        1.0340   -0.8600    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
        1.5260   -1.5230    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
        1.5120   -0.1880    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
        2.2990   -0.4350    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
        2.3080   -1.2600    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
        3.0100   -0.0150    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
        3.0260   -1.6650    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
        3.7280   -0.4200    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
        3.7370   -1.2450    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
        4.4750   -1.6120    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
        5.2570   -1.3490    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
       -1.2270   -2.9050    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
       -2.8780   -1.4760    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
       -3.7000   -1.4140    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
      1  2  2  0  0  0  0 
      1 23  1  0  0  0  0 
      2  3  1  0  0  0  0 
      3  4  1  0  0  0  0 
      3  6  2  0  0  0  0 
      5 10  2  0  0  0  0 
      6  7  1  0  0  0  0 
      6 22  1  0  0  0  0 
      7  8  1  0  0  0  0 
      7 23  2  0  0  0  0 
      8  9  1  0  0  0  0 
     10  4  1  1  0  0  0 
     10 11  1  0  0  0  0 
     11 12  2  0  0  0  0 
     11 13  1  0  0  0  0 
     12 15  1  0  0  0  0 
     13 14  1  0  0  0  0 
     14 15  2  0  0  0  0 
     14 16  1  0  0  0  0 
     15 17  1  0  0  0  0 
     16 18  2  0  0  0  0 
     17 19  2  0  0  0  0 
     18 19  1  0  0  0  0 
     19 20  1  0  0  0  0 
     20 21  1  0  0  0  0 
     23 24  1  0  0  0  0 
    M  END
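
The "1" that differs between the two listings sits in the counts line of the V2000 block. In the fixed-width CTfile layout, the fifth three-character field of that line is the chiral flag. The following is a small sketch for reading that field, assuming a well-formed V2000 counts line; `MolfileCounts` is a hypothetical helper, not a CDK class. It only reports the flag and does not by itself explain why "stereoElements" ends up empty.

```java
/** Hypothetical helper for inspecting a V2000 counts line. */
final class MolfileCounts {
    private MolfileCounts() {}

    /**
     * Returns the chiral flag of a V2000 counts line.
     * The line consists of three-character fields; the flag is the
     * fifth field, i.e. characters 12 to 14 (zero-based).
     */
    static int chiralFlag(String countsLine) {
        if (countsLine.length() < 15) {
            throw new IllegalArgumentException("counts line too short: " + countsLine);
        }
        String field = countsLine.substring(12, 15).trim();
        // A blank field is treated as "flag not set".
        return field.isEmpty() ? 0 : Integer.parseInt(field);
    }
}
```

Applied to the two counts lines above, `MolfileCounts.chiralFlag(" 24 26  0  0  1  0  0  0  0  0999 V2000")` yields 1 for the original, and the same call on the converted line yields 0.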
    
     
  • Till Schäfer

    Till Schäfer - 2016-03-22

    Regarding 2:
    This seems to be a bug outside the scope of additional SMILES generation, or am I wrong? The molecule should exactly match the imported one, but with the properties stripped (because they are converted to MoleculeProperties and are no longer needed).

    Regarding 1:
    So why not simply use the dataset's information about chirality? We do not need to detect whether real chirality information is actually present in the dataset. Or is this information not accessible to the plugin?

     

    Last edit: Till Schäfer 2016-03-22
  • Sven Schrinner

    Sven Schrinner - 2016-03-22

    Regarding 1:
    The information is not yet accessible in the plugin. To retrieve it, we would have to change the signature of the "getSettingsPanel" method and add a dataset argument to it. If that is OK, the check should be easy.
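
A rough sketch of what such a check could look like once the dataset's chirality setting reaches the plugin. Everything here is hypothetical: `SmilesChoices`, the parameter `datasetUsesChirality` (standing in for whatever flag the dataset would expose), and the option labels are placeholders, not existing Scaffold Hunter code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that decides which SMILES options to offer. */
final class SmilesChoices {
    private SmilesChoices() {}

    /**
     * Returns the option labels for the settings panel.
     * Stereo-aware variants are only offered if the dataset was
     * created with chirality information enabled.
     */
    static List<String> available(boolean datasetUsesChirality) {
        List<String> choices = new ArrayList<>();
        choices.add("generic");
        choices.add("unique");
        if (datasetUsesChirality) {
            choices.add("isomeric");
            choices.add("absolute");
        }
        return choices;
    }
}
```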

    Regarding 2:
    Yes, it is a bug outside the plugin. But I also think that we cannot fix it easily. Maybe we should create a bug report for this?

     
  • Till Schäfer

    Till Schäfer - 2016-03-22

    Regarding 1:
    Am I right that this is a plugin interface change? We should be careful not to bloat this interface with arbitrary information. Nevertheless, the chirality information, and more generally the dataset metadata, seems valuable in some cases. Is there currently any way to get the property definitions inside a plugin? This seems to be another quite useful piece of information provided by the dataset object.

    We must be careful to give a plugin read-only access to this information. A plugin should not be able to alter dataset attributes.

    Any proposed solutions? Additional thoughts? Let's brainstorm a bit more. We should only change this once and not every few releases.

    If we change the signature, the manual must be adjusted.
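
As one input to the brainstorming, here is a minimal sketch of read-only access via an immutable snapshot. The names `DatasetInfo` and `DatasetInfoSnapshot` and both accessor methods are hypothetical; they are not taken from the existing plugin interface, and property definitions are reduced to plain title strings for brevity.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical read-only view of dataset metadata handed to plugins. */
interface DatasetInfo {
    boolean usesChirality();

    List<String> propertyDefinitionTitles();
}

/** Immutable snapshot; the plugin never sees the dataset object itself. */
final class DatasetInfoSnapshot implements DatasetInfo {
    private final boolean usesChirality;
    private final List<String> titles;

    DatasetInfoSnapshot(boolean usesChirality, List<String> titles) {
        this.usesChirality = usesChirality;
        // Defensive copy, then wrap so that callers cannot modify the list.
        this.titles = Collections.unmodifiableList(new ArrayList<>(titles));
    }

    @Override
    public boolean usesChirality() {
        return usesChirality;
    }

    @Override
    public List<String> propertyDefinitionTitles() {
        return titles;
    }
}
```

Passing such a snapshot instead of the dataset object would keep the interface narrow, and further metadata could later be added to `DatasetInfo` without changing the method signature again.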

    Regarding 2:
    Yes, please create another bug report then and mark this bug as depending on the other bug.

     
  • Sven Schrinner

    Sven Schrinner - 2016-03-25
    • Depends On: 31 --> FR 31, BR 220
     
  • Till Schäfer

    Till Schäfer - 2016-08-11
    • status: in-progress --> open
     
  • Till Schäfer

    Till Schäfer - 2016-09-13
    • assigned_to: Sven Schrinner --> nobody
     
