#160 need for ring connections with more than 2 digits

Status: open
Owner: nobody
Labels: Data Types (11)
Priority: 5
Updated: 2012-10-23
Created: 2011-04-04
Creator: sgarturo
Private: No

Good day. I noticed that one cannot make ring connections with numbers of more than 2 digits in a SMILES string.

For example, c1ccccc1Cc2ccccc2 and c%84ccccc%84Cc%85ccccc%85 are both valid SMILES for diphenylmethane, but c%1000ccccc%1000Cc%1001ccccc%1001 is not: the ring closures are not made as intended.
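
A minimal sketch of why the four-digit form fails, assuming a strict reading of the two-digit rule (a ring-closure label is either a single digit or % followed by exactly two digits). The function name ring_labels and the regular expression are illustrative only and are not taken from Open Babel; a real parser may report an error instead of reading on.

```python
import re

# A ring-closure label is a single digit, or '%' followed by exactly two digits.
# Bracket atoms are matched first and skipped, so digits inside them
# (isotopes, charges, hydrogen counts) are not mistaken for ring closures.
_TOKEN = re.compile(r"\[[^\]]*\]|%(\d\d)|(\d)")


def ring_labels(smiles):
    """Return the ring-closure labels of a SMILES string in reading order."""
    labels = []
    for match in _TOKEN.finditer(smiles):
        label = match.group(1) or match.group(2)
        if label is not None:  # None means the match was a bracket atom
            labels.append(int(label))
    return labels


print(ring_labels("c1ccccc1Cc2ccccc2"))          # [1, 1, 2, 2]
print(ring_labels("c%84ccccc%84Cc%85ccccc%85"))  # [84, 84, 85, 85]
# '%1000' is read as label 10 followed by the single-digit labels 0 and 0.
print(ring_labels("c%1000ccccc%1000"))           # [10, 0, 0, 10, 0, 0]
```

Under this reading, %1000 opens bond 10 and then opens and closes bond 0 on the same atom, which is not the single bond labelled 1000 that was intended.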

Is there a reason why we are limited to 2-digit ring connections? Can this be changed? Or is there a clever way to achieve the same thing within the existing framework?

Thanks.

Steve

Discussion

  • Geoff Hutchison - 2011-04-04

    Well, the Daylight specification says that ring-closure numbers must stay at 2 digits (a single digit, or % followed by two digits). There are some extensions to SMILES, but we try to minimize them, and certainly no other toolkit accepts 3-digit % connections.

    I think we'd need to see an example that makes sense. What are you trying to do that requires more than 100 open ring connections? Why can you not close some and re-use that number, e.g.

    c(s1)ccc1-c(s1)ccc1

     
  • sgarturo - 2011-04-04

    I am trying to automate a reaction scheme with molecular fragments. I have two fragments in one SMILES string, e.g. CCCCO.CCCC(=O), and I would like to connect them with a ring-closure number so that I get an ester, e.g. CCCCO%11.CCCC(%11)(=O). Reusing digits is not possible here.

    I would like to automate this for more than 100 connections, incrementing the connection number as I make each one. In certain applications, with the systems I consider, being limited to 100 is problematic. Maybe there is another way to do this, but the simplest right now (with my existing code and thought process) would be to have more than 2 digits available for these kinds of connections.

    Any suggestions? And thanks for the super-quick reply.

     
  • Noel O'Boyle - 2011-04-05

    It's still not clear why you can't reuse digits. In your example CCCCO%11.CCCC(%11)(=O), the %11 bond is closed by the second %11, so you could reuse %11 after that point.

     
  • sgarturo - 2011-04-07

    I did not know I could reuse digits here. I tried it in a small case and it worked fine. This can work in my code until, well, it doesn't work.

    This does not answer my original question of why 2 digits are the maximum and whether this can be changed. I am sure that, in my work over the coming months, there will be a case where I need a 3-digit connection (either because of the size of my system or because I lack a strong algorithm that hands out 99 connections efficiently). What I will run into is an arbitrary limit of 2 digits (because Daylight says so, and says that it is rare to need more than 2 digits).

    I will let you know on this board if/when that error occurs. I do appreciate your work and it has gotten me really far up to this point, so do take my comments as constructive criticism from an excited user.

     
  • Noel O'Boyle - 2011-04-07

    I think this discussion would be better on the mailing list. There you will find one of the original developers of SMILES at Daylight, who can give you all the background.
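
The reuse suggested in the thread can be sketched as a small label pool: a label is only busy while its bond is open, so the two-digit limit applies to bonds open at the same time, not to the total number of bonds. Everything named here (RingLabelPool, format_label, join_fragments, and the "{}" placeholder marking where a label goes in a fragment) is hypothetical and written for illustration; none of it is part of Open Babel.

```python
class RingLabelPool:
    """Hand out ring-closure labels 1..99 and recycle them once closed."""

    def __init__(self, max_label=99):
        # Stored in descending order so that pop() yields the smallest label first.
        self._free = list(range(max_label, 0, -1))

    def acquire(self):
        if not self._free:
            raise RuntimeError("no ring-closure label is free")
        return self._free.pop()

    def release(self, label):
        self._free.append(label)


def format_label(label):
    """Write a label as SMILES text: '7' for one digit, '%42' for two."""
    return str(label) if label < 10 else f"%{label:02d}"


def join_fragments(pairs, pool):
    """Join each (left, right) template pair with one ring-closure bond.

    Each template holds a single '{}' where the label is written. Both ends
    of a bond are written before the next pair starts, so the label is
    released immediately and can be reused.
    """
    parts = []
    for left, right in pairs:
        label = pool.acquire()
        tag = format_label(label)
        parts.append(left.format(tag) + "." + right.format(tag))
        pool.release(label)  # bond closed, label is free again
    return ".".join(parts)


pool = RingLabelPool()
print(join_fragments([("CCCCO{}", "CCCC{}(=O)")], pool))  # CCCCO1.CCCC1(=O)

# 150 ester bonds, well past 99, still need only one label at a time.
many = join_fragments([("CCCCO{}", "CCCC{}(=O)")] * 150, pool)
print(many.count("O1."))  # 150
```

The pool only runs dry when more than 99 bonds are open simultaneously, which is the case the two-digit rule actually restricts.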

     
