From: Robert H. <ha...@st...> - 2010-05-12 21:09:12
got it! Check this site:

http://chemapps.stolaf.edu/jmol/docs/examples-11/JmolSmilesTest.htm

Good luck breaking it!

On Wed, May 12, 2010 at 12:16 PM, Robert Hanson <ha...@st...> wrote:

> OK, I found the bug. This can be fixed....
>
> On Wed, May 12, 2010 at 10:37 AM, Robert Hanson <ha...@st...> wrote:
>
>> Not surprising that there are some issues -- let's identify them and see
>> what is up.
>>
>> On Wed, May 12, 2010 at 9:49 AM, william reusch <whr...@ms...> wrote:
>>
>>> Bob,
>>>
>>> This is very nice indeed, but after testing various Smiles strings on
>>> your demo page, the results are mixed.
>>>
>>> When extra hydrogens are added to a structure (e.g.
>>> trans-4-methylcyclohexanol), the new Smiles matches the one from the
>>> same structure lacking the hydrogens.
>>
>> good
>>
>>> However, if the same Smiles
>>> string lacking the hydrogens is compared with itself your demo reports
>>> they are different.
>>
>> Really. What's the string exactly? I need to know that - you mean
>>
>> find "xxxxx" in "xxxxx" (same exact strings)
>>
>> fails?
>>
>>> Curiously, comparing identical strings from the
>>> added hydrogen structures reports a match.
>>
>> It's the first that is curious, not this!
>>
>>> My chief interest concerns the comparison of strings in which added
>>> hydrogens are not present.
>>
>> If the string is correct, Jmol will calculate the missing hydrogens and go
>> with that.
>> But I think, yes, there could be an issue with
>>
>> [CH2]([H])CCC...
>>
>> not for stereochemistry, but for direct match. Hmm... Ah, is this an
>> example?
>>
>> find "CC[CH]([H])C" in "CCCC" --> 0 matches
>>
>> That's actually an invalid SMILES string, I think. Here we are saying one
>> of the carbons has only one attached H, but then we are connecting it to
>> another H, and it really has two.
>> Now, the CCCC will be expanded to the proper model, in which that C
>> really has two H atoms attached, and "CC[CH]([H])C" will be expanded
>> (correctly?) as well. But as a search pattern it will fail.
>>
>> Here's an interesting point, and maybe it is what you are making, I'm not
>> sure:
>>
>> cis-4-methylcyclohexanol:
>>
>> find "[H]C1(C)CCC(O)CC1" in "[H][C@@]1(C)CC[C@H](O)CC1"
>>
>> not a problem.
>>
>> find "[H][C@@]1(C)CC[C@H](O)CC1" in "[H]C1(C)CCC(O)CC1"
>>
>> fails -- because the target does not have stereochemistry indicated -- the
>> indicated stereochemistry was not found.
>>
>>> This is usually the way in which I ask
>>> students to write stereo-formulas with JME, because in most cases the
>>> formulas are unambiguous and require less drawing. Your demo does not
>>> recognize different Smiles strings from identical structures unless
>>> extra hydrogens are added to at least one of the strings.
>>
>> If that's the case, it's just a bug, either in Jmol or JME. Should not be
>> a problem.
>>
>>> If you really
>>> want to test your ability to recognize different strings created by
>>> drawing the same structure in different ways use inositol as a model.
>>> Myoinositol can have a dozen different strings.
>>
>> definitely - let's do that.
>>
>>> Bill
>>>
>>> Robert Hanson wrote:
>>> > It doesn't matter how you draw the structure.
>>> >
>>> > It doesn't matter if you include extra hydrogen atoms or not, because
>>> > what Jmol is doing is creating a topologically (though not
>>> > dimensionally) correct model from the SMILES string, then checking it.
>>> > So it doesn't matter how you indicate H atoms or what order the
>>> > stereochemical notation of the SMILES string ends up being. It should
>>> > work.
>>> >
>>> > Jmol should give a definitive answer as to whether the structure drawn
>>> > matches your specified SMILES string, as long as your string is valid.
>>> > Or, if you wish, it can tell you if a subset of the SMILES string is
>>> > some particular grouping of atoms, with or without stereochemistry.
>>> >
>>> > JmolSmilesApplet.jar
>>> > -----------------------------
>>> >
>>> > You can see a demo of this at
>>> > http://chemapps.stolaf.edu/jmol/docs/examples-11/JmolSmiles.htm
>>> >
>>> > This page uses a mini-version of Jmol that I just made that JUST does
>>> > ONE THING -- checks SMILES strings for patterns. It's just 41K in
>>> > size. But you can use Jmol itself if you want. This mini version just
>>> > has one function:
>>> >
>>> > var retValue = document.getElementById("JmolSmiles1").find("pattern",
>>> > "smilesString", asSMARTS, isAll)
>>> >
>>> > where
>>> >
>>> > asSMARTS (true or false) indicates whether you want a substructure
>>> > search (SMARTS, true) or an exact search (SMILES, false)
>>> >
>>> > isAll indicates whether you want Jmol to return the total number of
>>> > matches or just the number 1 indicating a match and 0 indicating no
>>> > match. (-1 means there was an error handling the string)
>>> >
>>> > 3D:
>>> >
>>> > Jmol will also check a 3D model against a SMILES string or a SMARTS
>>> > pattern. Within Jmol, this is done within the SELECT command using the
>>> > smiles() function (for exact match) or using the search() function
>>> > (for a substructure search). You can also use this construct within
>>> > Jmol:
>>> >
>>> > Var x = {*/1.1}.find("smarts","C=O", true)
>>> >
>>> > This function is actually VERY powerful and can return either a set of
>>> > all matching atoms (false) or a list of sets of atoms (true). I've
>>> > added one bit to SMARTS -- so I'm calling it now "3D-SEARCH" -- that
>>> > allows you to select out WHICH atoms you want returned. To do this,
>>> > just add { } around the atoms you want returned.
>>> >
>>> > So, for example:
>>> >
>>> > print {*}.find("a") # all aromatic atoms
>>> >
>>> > print {*}.find("{C}=O") # all carbonyl carbons
>>> >
>>> > print {*}.find("{C}=CC(=O)[O,N]") # all beta carbons on alpha-beta conjugated esters or amides
>>> >
>>> > Cool, huh?
>>> >
>>> > The only problem with matching a 3D structure may be with what other
>>> > programs use to define "aromatic". Jmol should do just fine with
>>> > structures that are typically aromatic. In addition, though, it will
>>> > assign all the ring carbons of quinone to be aromatic as well (which
>>> > is what JME does). Basically it defines aromatic as "flat ring that is
>>> > all sp2-hybridized" regardless of what the bonding indicates. (This is
>>> > kind of cool, because you can then use it in PDB files to find all the
>>> > aromatic rings in HIS, TYR, TRP, A, T, C, G, etc.)
>>> >
>>> > Bob
>>>
>>> ------------------------------------------------------------------------------
>>> _______________________________________________
>>> Jmol-users mailing list
>>> Jmo...@li...
>>> https://lists.sourceforge.net/lists/listinfo/jmol-users
>>
>> --
>> Robert M. Hanson
>> Professor of Chemistry
>> St. Olaf College
>> 1520 St. Olaf Ave.
>> Northfield, MN 55057
>> http://www.stolaf.edu/people/hansonr
>> phone: 507-786-3107
>>
>> If nature does not answer first what we want,
>> it is better to take what answer we get.
>>
>> -- Josiah Willard Gibbs, Lecture XXX, Monday, February 5, 1900
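
For anyone scripting against the mini applet, the return-value convention described in the quoted message (a match count or 1 for a match, 0 for no match, -1 for an error) could be wrapped roughly as below. This is only a sketch: describeFindResult and checkSmiles are hypothetical helper names, not part of Jmol, and it assumes the applet's find() result reaches JavaScript as a plain number.

```javascript
// Hypothetical helper (not part of Jmol): turns the number returned by the
// applet's find(pattern, smilesString, asSMARTS, isAll) into a readable result,
// following the convention described in the quoted message.
function describeFindResult(retValue, isAll) {
  if (retValue === -1) {
    return "error handling the string";
  }
  if (retValue === 0) {
    return "no match";
  }
  // With isAll true the value is the total number of matches; otherwise it is 1.
  return isAll ? retValue + " match(es)" : "match";
}

// Hypothetical wrapper: `applet` is assumed to be the object returned by
// document.getElementById("JmolSmiles1") on a page that hosts the applet.
function checkSmiles(applet, pattern, smilesString, asSMARTS, isAll) {
  var retValue = applet.find(pattern, smilesString, asSMARTS, isAll);
  return describeFindResult(retValue, isAll);
}
```

On a page hosting the applet, a call might look like checkSmiles(document.getElementById("JmolSmiles1"), "C=O", "CC(=O)C", true, true). Keeping the interpretation separate from the applet call means the -1 error case can be handled in one place.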