Re: [Rdkit-discuss] Deuterium/Tritium labels in Molfile
From: Paolo T. <pao...@gm...> - 2023-04-11 10:18:01
<html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"></head><body dir="auto"><div dir="ltr"><meta http-equiv="content-type" content="text/html; charset=utf-8"><div dir="ltr"><meta http-equiv="content-type" content="text/html; charset=utf-8"><div dir="ltr"></div><div dir="ltr">Dear Santiago,</div><div dir="ltr"><br></div><div dir="ltr">Using D and T symbols for deuterium and tritium in MDL molfiles is outside the file format specification.</div><div dir="ltr">Nonetheless, RDKit correctly parses those non-standard D and T symbols when reading an MDL molfile that contains them, as you can verify yourself with a simple test or by looking at the source code:</div><div dir="ltr"><br></div><div dir="ltr"><a href="https://github.com/rdkit/rdkit/blob/36c4ec9e2ba4f5edba39f452cb7458230d9d99bc/Code/GraphMol/FileParsers/MolFileParser.cpp#L1506">https://github.com/rdkit/rdkit/blob/36c4ec9e2ba4f5edba39f452cb7458230d9d99bc/Code/GraphMol/FileParsers/MolFileParser.cpp#L1506</a><br><a href="https://github.com/rdkit/rdkit/blob/36c4ec9e2ba4f5edba39f452cb7458230d9d99bc/Code/GraphMol/FileParsers/MolFileParser.cpp#L2179">https://github.com/rdkit/rdkit/blob/36c4ec9e2ba4f5edba39f452cb7458230d9d99bc/Code/GraphMol/FileParsers/MolFileParser.cpp#L2179</a></div><div dir="ltr"><br></div><div dir="ltr">However, when writing the molfile, RDKit will write it according to the specification, i.e. using the H symbol and adding an “M ISO” entry. Any MDL molfile parser should be able to correctly parse such a file.
ChemDraw will even automatically label the atoms as D and T, while MarvinJS will add the “2” and “3” superscript prefixes.</div><div dir="ltr"><br></div><div dir="ltr">To me, it seems a bit overkill to add a flag to preserve non-standard features in MDL molfile writing. Why would you be interested in doing that?</div><div dir="ltr"><br></div><div dir="ltr">Cheers,</div><div dir="ltr">p.</div><div dir="ltr"><br><blockquote type="cite">On 11 Apr 2023, at 11:50, Santiago Fraga <san...@me...> wrote:<br><br></blockquote></div><blockquote type="cite"><div dir="ltr"> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);" class="elementToProof"> Many thanks for your examples, Wim.</div> <div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);" class="elementToProof"> However, I was looking for a way to save the labels D and T in the molfile for the hydrogen isotopes, as other tools can do.</div> <div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);" class="elementToProof"> <br> </div> <div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);" class="elementToProof"> Regards</div> <div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);" class="elementToProof"> Santiago</div> <div class="elementToProof"> <div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);"> <br> </div> <div id="Signature"> <div> <div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);"> Santiago Fraga, Software Developer, Mestrelab Research S.L.</div> </div> </div> </div> <div id="appendonsend"></div> <hr style="display:inline-block;width:98%" tabindex="-1"> <div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> Wim Dehaen <wim...@gm...><br> <b>Sent:</b> Monday, 10 April 2023 18:07<br> <b>To:</b> Santiago Fraga <san...@me...><br> <b>Cc:</b> rdk...@li... <rdk...@li...><br> <b>Subject:</b> Re: [Rdkit-discuss] Deuterium/Tritium labels in Molfile</font> <div> </div> </div> <div> <div dir="ltr"> <div>RDKit outputs a molfile with correct isotope labels for me using just:</div> <div><br> </div> <div>from rdkit import Chem<br> mol = Chem.MolFromSmiles("[3H]c1ccccc1[2H]")<br> Chem.MolToMolFile(mol, "test.mol")</div> <div><br> </div> <div>or labelling the atoms post hoc:</div> <div><br> </div> <div>mol = Chem.MolFromSmiles("c1ccccc1")<br> mol = Chem.AddHs(mol)<br> # atoms 0-5 are the ring carbons; 6 and 7 are the first two added hydrogens<br> mol.GetAtomWithIdx(6).SetIsotope(3)<br> mol.GetAtomWithIdx(7).SetIsotope(2)<br> # RemoveHs keeps isotope-labelled hydrogens by default<br> mol = Chem.RemoveHs(mol)<br> Chem.MolToMolFile(mol, "test2.mol")</div> <div><br> </div> <div>I hope this helps<br> </div> <div><br> </div> <div> <div>best wishes</div> <div>wim<br> </div> <div><br> </div> </div> </div> <br> <div class="x_gmail_quote"> <div dir="ltr" class="x_gmail_attr">On Mon, Apr 10, 2023 at 4:43 PM Santiago Fraga <<a href="mailto:san...@me...">san...@me...</a>> wrote:<br> </div> <blockquote class="x_gmail_quote" style="margin:0px 0px 0px 0.8ex; border-left:1px solid rgb(204,204,204); padding-left:1ex"> <div class="x_msg5693932737842893224"> <div dir="ltr"> <div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)"> Good afternoon!</div> <div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)"> <br> </div> <div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)"> I am a relatively new user of RDKit, mainly of the C++ API.</div> <div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)"> <br> </div> <div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)"> I am trying to save the labels D and T for the hydrogen isotopes in a molfile.</div> <div
style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)"> Like in the following molfile:</div> <div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)"> <br> </div> <blockquote style="margin-top:0px; margin-bottom:0px"> <div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)"> <div> MJ230401 </div> <div><br> </div> <div> 8 8 0 0 0 0 0 0 0 0999 V2000</div> <div> -0.3572 0.4125 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0</div> <div> -1.0716 0.0000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0</div> <div> -1.0716 -0.8250 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0</div> <div> -0.3572 -1.2375 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0</div> <div> 0.3572 -0.8250 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0</div> <div> 0.3572 0.0000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0</div> <div> -0.3572 1.2375 0.0000 T 0 0 0 0 0 0 0 0 0 0 0 0</div> <div> 1.0717 0.4125 0.0000 D 0 0 0 0 0 0 0 0 0 0 0 0</div> <div> 3 4 2 0 0 0 0</div> <div> 4 5 1 0 0 0 0</div> <div> 5 6 2 0 0 0 0</div> <div> 6 1 1 0 0 0 0</div> <div> 1 2 2 0 0 0 0</div> <div> 2 3 1 0 0 0 0</div> <div> 6 8 1 0 0 0 0</div> <div> 1 7 1 0 0 0 0</div> M END</div> <div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)"> <br> </div> </blockquote> <div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)"> I am trying to set directly the labels in the hydrogen atoms:</div> <div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)"> <br> </div> <div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)"> atom->setProp<string>("atomLabel", "D");</div> <div 
style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)"> or<br> <div> atom->setProp<string>("_displayLabel", "D");</div> <div><br> </div> <div> But when the molfile is generated, the labels are not transferred.</div> <div> It also seems that when reading a molfile that includes the labels, they are discarded.</div> <div><br> </div> <div><br> </div> </div> <div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)"> Many thanks in advance</div> <div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0); background-color:rgb(255,255,255)"> Santiago Fraga</div> <div> <div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)"> <br> </div> <div id="x_m_5693932737842893224Signature"> <div> <div style="font-family:Calibri,Arial,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)"> <br> </div> </div> </div> </div> </div> _______________________________________________<br> Rdkit-discuss mailing list<br> <a href="mailto:Rdk...@li..." target="_blank">Rdk...@li...</a><br> <a href="https://lists.sourceforge.net/lists/listinfo/rdkit-discuss" rel="noreferrer" target="_blank">https://lists.sourceforge.net/lists/listinfo/rdkit-discuss</a><br> </div> </blockquote> </div> </div> </div></blockquote></div></div></body></html>
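For anyone who still needs D and T symbols in the output for a downstream tool, one option is to post-process the text RDKit writes rather than change the writer. The sketch below is not an RDKit feature: `label_heavy_hydrogens`, `DT_SYMBOLS` and `SAMPLE` are hypothetical names, and it assumes a V2000 molblock with the standard fixed-width columns (atom symbol in columns 32-34, `M  ISO` entries of eight characters each). It rewrites H atoms whose `M  ISO` mass is 2 or 3 as D or T and drops those entries, leaving other isotopes untouched.

```python
# Sketch only: plain-text post-processing of a V2000 molblock, no RDKit calls.
DT_SYMBOLS = {2: "D", 3: "T"}  # isotope mass -> non-standard symbol


def label_heavy_hydrogens(molblock: str) -> str:
    """Rewrite 2H/3H atoms as D/T and drop their "M  ISO" entries."""
    lines = molblock.split("\n")
    n_atoms = int(lines[3][0:3])  # counts line: first 3 columns = atom count
    result = []
    for line in lines:
        if not line.startswith("M  ISO"):
            result.append(line)
            continue
        n_entries = int(line[6:9])
        kept = []
        for k in range(n_entries):
            entry = line[9 + 8 * k : 17 + 8 * k]  # one " aaa vvv" pair
            atom_number, mass = int(entry[0:4]), int(entry[4:8])
            row = 3 + atom_number  # atom 1 is on line index 4
            is_heavy_h = (
                1 <= atom_number <= n_atoms
                and mass in DT_SYMBOLS
                and result[row][31:34].strip() == "H"
            )
            if is_heavy_h:
                atom_line = result[row]
                result[row] = atom_line[:31] + DT_SYMBOLS[mass].ljust(3) + atom_line[34:]
            else:
                kept.append(entry)
        if kept:  # re-emit the remaining isotope entries, if any
            result.append("M  ISO%3d%s" % (len(kept), "".join(kept)))
    return "\n".join(result)


# Hand-written sample in the spec-conformant style: H symbols plus an M  ISO line.
SAMPLE = "\n".join([
    "",
    "  sketch",
    "",
    "  3  2  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "    1.0000    0.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0",
    "   -1.0000    0.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0",
    "  1  2  1  0",
    "  1  3  1  0",
    "M  ISO  2   2   3   3   2",
    "M  END",
])

print(label_heavy_hydrogens(SAMPLE))
```

The result is outside the molfile specification, as Paolo points out above, so it is probably best kept for export to tools that expect those labels, while the standard H plus `M  ISO` form remains the file of record.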