Hi,
In the SmartsParser.java file there is this code under the registerIndex method:
    if (rc == null)
    {
        //Currently this index is not associated with any atom
        RingClosure rc1 = new RingClosure();
        rc1.firstAtom = prevAtom;
        if (curBond == null)
            rc1.firstBond = curBondType;
        else
            newError("Use of a bond expression for the first appearence of atom index",curChar+1,"");
        indexes.put(i,rc1);

        //After first index appearance current bond data must be reset
        //If not a bug is caused when ring closure is with a double bond
        //e.g. CC=1CC=1 is parsed like it is CC1=CC=1
        curBond = null;
        curBondType = SmartsConst.BT_UNDEFINED;
    }
Note the "error" indicated here. I'm trying to parse this:
[#6]-c1cccc2-[#7]3C-,:4(=[#7]-[#6]-5-[#7][C]3(=O)[#6]-3-[#6]-[#6]C(=O)[#7]-3-#6-c~3cc-,:44ccccc4~[#7]~3[#6]-5~[#8]-c12)c1ccccc1
The bond query (-,:) before the "4" ring causes this to error out. Why is this the case? I tested it on DEPICT (Daylight's renderer) and it's perfectly valid SMARTS.
Thanks in advance,
Ed.
This issue is resolved. "Use of a bond expression for the first appearence of atom index" is no longer a SMARTS parser error. Please use the latest snapshot from here:
http://ambit.uni-plovdiv.bg:8083/nexus/index.html#nexus-search;gav~~ambit2-smarts~2.4.13-SNAPSHOT~~
By the way the given test SMARTS string is not correct:
[#6]-c1cccc2-[#7]3C-,:4(=[#7]-[#6]-5-[#7][C]3(=O)[#6]-3-[#6]-[#6]C(=O)[#7]-3-#6-c~3cc-,:44ccccc4~[#7]~3[#6]-5~[#8]-c12)c1ccccc1
I think the fragment "...-3-#6-c~3cc..." should be "...-3-[#6]-c~3cc...", i.e. the brackets [ ] are missing around "#6". Without them it is interpreted as a ring closure -#6 with index 6 (which is, of course, never closed before the end of the string) instead of the element #6.
So the correct string should be:
[#6]-c1cccc2-[#7]3C-,:4(=[#7]-[#6]-5-[#7][C]3(=O)[#6]-3-[#6]-[#6]C(=O)[#7]-3-[#6]-c~3cc-,:44ccccc4~[#7]~3[#6]-5~[#8]-c12)c1ccccc1
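A missing pair of brackets like this is easy to overlook in a long string. As a rough pre-check, here is a minimal sketch that flags every '#' outside [ ] that is directly followed by a digit. The class and method names are hypothetical and not part of ambit-smarts, and the check is only a heuristic: a triple-bond ring closure such as C#1 is legitimate and would be flagged as well.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; it is not part of ambit-smarts.
class UnbracketedHashCheck {

    /** Returns the position of every '#' that lies outside [...] and is directly followed by a digit. */
    static List<Integer> suspiciousHashes(String smarts) {
        List<Integer> hits = new ArrayList<>();
        int depth = 0; // greater than 0 while inside a bracket atom expression
        for (int i = 0; i < smarts.length(); i++) {
            char c = smarts.charAt(i);
            if (c == '[') {
                depth++;
            } else if (c == ']') {
                if (depth > 0) {
                    depth--;
                }
            } else if (c == '#' && depth == 0
                    && i + 1 < smarts.length()
                    && Character.isDigit(smarts.charAt(i + 1))) {
                // Outside brackets '#' is a triple bond, so the digit is read as a ring index.
                hits.add(i);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String reported = "[#6]-c1cccc2-[#7]3C-,:4(=[#7]-[#6]-5-[#7][C]3(=O)[#6]-3-[#6]-[#6]C(=O)"
                + "[#7]-3-#6-c~3cc-,:44ccccc4~[#7]~3[#6]-5~[#8]-c12)c1ccccc1";
        for (int pos : suspiciousHashes(reported)) {
            System.out.println("Unbracketed '#' at position " + pos + ": was [#n] intended?");
        }
    }
}
```

Run on the string as originally reported, this flags the single "#6" in "-3-#6-c"; on the corrected string it flags nothing.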
This SMARTS string is strange in several ways, so I would like to make two additional comments:
"C-,:4" is syntactically correct but not needed at all. "-,:" means "single or aromatic bond", which is exactly how SMARTS treats any bond that is not specified, i.e. C4 is the same as C-,:4.
The fragment "..cc-,:44cc.." is quite strange as well: on the same atom, ring index 4 is closed and a new ring closure is opened with the same index 4. It is generally good practice not to repeat ring indices anywhere in a SMARTS string, and reusing the same index on the same atom is worse still.
Although strange this syntax is correct. It is recognized by daylight.com and by ambit-smarts as well.
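To see how often each ring index is reused in a given string, a small scan along the following lines could help. It is only a sketch under simplifying assumptions: the class name is hypothetical (it is not the ambit-smarts parser), everything inside [ ] is skipped (so ring closures inside recursive $(...) expressions are not counted), and a digit outside brackets is always treated as a ring index that toggles between open and closed.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper for illustration only; it is not the ambit-smarts parser.
class RingIndexUsage {
    /** How many times each ring index was opened. */
    final Map<Integer, Integer> opens = new TreeMap<>();
    /** Ring indices still open at the end of the string. */
    final Set<Integer> unclosed = new TreeSet<>();

    static RingIndexUsage analyze(String smarts) {
        RingIndexUsage usage = new RingIndexUsage();
        int depth = 0; // greater than 0 while inside a bracket atom expression
        for (int i = 0; i < smarts.length(); i++) {
            char c = smarts.charAt(i);
            if (c == '[') { depth++; continue; }
            if (c == ']') { if (depth > 0) depth--; continue; }
            if (depth > 0) continue; // digits inside [...] are not ring indices

            int index;
            if (c == '%' && i + 2 < smarts.length()
                    && isAsciiDigit(smarts.charAt(i + 1))
                    && isAsciiDigit(smarts.charAt(i + 2))) {
                // Two-digit ring index written as %nn
                index = Integer.parseInt(smarts.substring(i + 1, i + 3));
                i += 2;
            } else if (isAsciiDigit(c)) {
                index = c - '0';
            } else {
                continue;
            }

            // First appearance opens the index, the next one closes it.
            if (!usage.unclosed.remove(index)) {
                usage.unclosed.add(index);
                usage.opens.merge(index, 1, Integer::sum);
            }
        }
        return usage;
    }

    private static boolean isAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }

    public static void main(String[] args) {
        RingIndexUsage usage = analyze("C4CC44CCC4");
        System.out.println("times opened per index: " + usage.opens);
        System.out.println("never closed: " + usage.unclosed);
    }
}
```

Any index whose count in opens is greater than 1 has been reused, and anything left in unclosed points to a ring closure that never finds its partner, such as index 6 when "#6" is written without brackets.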
I assume this SMARTS was auto-generated by a program; otherwise, if it was written by hand in this manner, it is error-prone and does not follow good SMARTS practice.
With best regards
Dr. Nikolay Kochev
Hi,
I used ChemAxon to generate this SMARTS string. Weirdly, it doesn't load into MarvinSketch, so I can only assume that I copied it wrong (though the parser error is still present, which is what counts). Apologies for that. ChemAxon does seem to repeat ring indices, which is something I have yet to observe in any other SMARTS generation tool.
Thank you for the fix. Out of interest, is the source code for this version available anywhere, now that this is effectively the latest version?
Ed.
Thanks for the feedback. We are aware that every SMARTS implementation comes with its own peculiarities.
Regarding the source code, 2.4.13-SNAPSHOT is the current development version at trunk https://svn.code.sf.net/p/ambit/code/trunk/ambit2-all/ambit2-smarts/
The Maven repository also supports source code artifacts.
Thanks again, I'll let you know if I have any more problems with this particular class.