
[Jmol-developers] Re: [Jmol-users] hbonds in Jmol timothy driscoll <molvisions@ma...>
 [Jmol-developers] Re: [Jmol-users] hbonds in Jmol From: Miguel Howard - 2003-12-07 14:24
```
>> > - Given that there are no hydrogens in the file ... How do
>> > you calculate the 'angles' from the bonds that are present?
>>
>> Don't know that...
>>
> calculate the position of the hydrogen on the donor.

OK ... how?

> 1. can the atom form an hbond? (based solely on electronegativity this
> would be F, O, Cl, and N - possibly C and S.)

OK

> 2. if true, count up the hbond "potentials" - lone pair(s) and
> hydrogen(s). if more than zero, add atom to table of donors and
> acceptors.

But, there are rarely any explicit hydrogens in .pdb files.

> 3. are there any hbond partners within correct distance (if acceptor
> look for donors; if donor, look for acceptors)? in the majority of
> protein and nucleic acids, the answer here will be yes - there are very
> few hbond potentials left unsatisfied. will have to loop this to
> account for atoms with multiple hbond potentials.
>
> 4. if true, is the X-H-Y angle acceptable? I haven't thought about this
> in depth yet but it would likely include calculating the position of the
> involved H.

Correct, but this may be required for the other things as well.

> 5. if true, show hbond.
>
>
> just a thought: if processing time is an issue (when isn't it?),

Frankly, my initial answer is "No, processing time is not an issue".

This code is executed once, at the user's request. Even if it takes one or
two seconds that is OK. And so far I haven't seen anything which would
lead me to believe that it should take that long.

Miguel
```
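Steps 3 and 4 of the quoted outline (a distance check followed by an X-H-Y angle check) can be sketched roughly as below. This is only an illustration and assumes the hydrogen position is already known or has been calculated. The class name `HBondSketch` and both cutoff values are assumptions made for the example; they are not taken from Jmol, RasMol, or CDK.

```java
/** Geometric hydrogen-bond test. Cutoff values are illustrative assumptions. */
final class HBondSketch {
    // Hypothetical thresholds; neither value comes from Jmol, RasMol, or CDK.
    static final double MAX_DONOR_ACCEPTOR_DISTANCE = 3.5;  // angstroms
    static final double MIN_DONOR_H_ACCEPTOR_ANGLE = 120.0; // degrees

    /** Euclidean distance between two points given as {x, y, z}. */
    static double distance(double[] a, double[] b) {
        double dx = a[0] - b[0], dy = a[1] - b[1], dz = a[2] - b[2];
        return Math.sqrt(dx * dx + dy * dy + dz * dz);
    }

    /** Angle X-H-Y in degrees, measured at the hydrogen h. */
    static double angleDegrees(double[] x, double[] h, double[] y) {
        double[] u = { x[0] - h[0], x[1] - h[1], x[2] - h[2] };
        double[] v = { y[0] - h[0], y[1] - h[1], y[2] - h[2] };
        double dot = u[0] * v[0] + u[1] * v[1] + u[2] * v[2];
        double cos = dot / (distance(x, h) * distance(y, h));
        cos = Math.max(-1.0, Math.min(1.0, cos)); // guard against rounding
        return Math.toDegrees(Math.acos(cos));
    }

    /** Step 3 (distance) and step 4 (angle) combined. */
    static boolean isHBond(double[] donor, double[] hydrogen, double[] acceptor) {
        return distance(donor, acceptor) <= MAX_DONOR_ACCEPTOR_DISTANCE
            && angleDegrees(donor, hydrogen, acceptor) >= MIN_DONOR_H_ACCEPTOR_ANGLE;
    }
}
```

A caller would loop over the donor and acceptor tables from step 2 and call `isHBond` for each pair; since the calculation runs once at the user's request, a plain nested loop is probably acceptable for a first version.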

 [Jmol-developers] Re: [Jmol-users] hbonds in Jmol From: timothy driscoll - 2003-12-07 14:49
```
hi Miguel,

at 3.24p EDT on 2003 December 07 Sunday Miguel Howard said:
> >> > - Given that there are no hydrogens in the file ... How do
> >> > you calculate the 'angles' from the bonds that are present?
> >>
> >> Don't know that...
> >>
> > calculate the position of the hydrogen on the donor.
> OK ... how?
>
working...

[...]

> > 2. if true, count up the hbond "potentials" - lone pair(s) and
> > hydrogen(s). if more than zero, add atom to table of donors and
> > acceptors.
> But, there are rarely any explicit hydrogens in .pdb files.
>
true - about 85% of the structures in the PDB are crystals, and I assume
most of them lack hydrogens. but we can infer the existence of a
hydrogen - rasmol does this very thing when calculating the default
sphere radii to use. (C, N, and O atoms that *should* have a hydrogen
are given modified vdw radii to account for the missing proton.)

in addition, based on the covalent bonding and group properties, we can
calculate the existence of likely lone pairs (for example, an O in a
carbonyl group will have two lone pairs; an O in a hydroxyl will have one
LP and an assumed hydrogen).

> > 3. are there any hbond partners within correct distance (if acceptor
> > look for donors; if donor, look for acceptors)? in the majority of
> > protein and nucleic acids, the answer here will be yes - there are very
> > few hbond potentials left unsatisfied. will have to loop this to
> > account for atoms with multiple hbond potentials.
> >
> > 4. if true, is the X-H-Y angle acceptable? I haven't thought about this
> > in depth yet but it would likely include calculating the position of the
> > involved H.
> Correct, but this may be required for the other things as well.
>
see my first response above ;-)

> > 5. if true, show hbond.
> >
> >
> > just a thought: if processing time is an issue (when isn't it?),
> Frankly, my initial answer is "No, processing time is not an issue".
>
> This code is executed once, at the user's request. Even if it takes one or
> two seconds that is OK. And so far I haven't seen anything which would
> lead me to believe that it should take that long.
>
good. unless some kindly list subscriber chimes in first, I will
continue investigating the process for calculating hydrogen position and
get back to you as soon as possible...

regards,

:tim

--
timothy driscoll
molvisions - molecular graphics & visualization
usa:north carolina:wake forest
```

 [Jmol-developers] Re: [Jmol-users] hbonds in Jmol From: Miguel Howard - 2003-12-07 15:21
```
Tim,

I understand that you are working on this. So don't answer the questions
below. Just keep them in mind as you research a solution.

Miguel

> true - about 85% of the structures in the PDB are crystals, and I assume
> most of them lack hydrogens. but we can infer the existence of a
> hydrogen - rasmol does this very thing when calculating the default
> sphere radii to use. (C, N, and O atoms that *should* have a hydrogen
> are given modified vdw radii to account for the missing proton.)

But RasMol is only looking at specific protein/DNA N & O atoms that they
know are good candidates for Hbonding ...

> in addition, based on the covalent bonding and group properties, we can
> calculate the existence of likely lone pairs (for example, an O in a
> carbonyl group will have two lone pairs; an O in a hydroxyl will have one
> LP and an assumed hydrogen).

But, how do we tell whether a C bonded to an O is a C=O or C-O-H (given
that there are no H atoms)?
```
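One way to sidestep the C=O versus C-O-H question for standard residues is the approach Miguel attributes to RasMol: look only at specific atoms whose role is known in advance. The sketch below illustrates that idea with a lookup keyed on PDB residue and atom names. It is not RasMol's actual table; the class `HBondRoles`, the `Role` enum, and the selection of entries are assumptions for the example, and the lookup only makes sense for standard amino-acid residues (it says nothing about ligands, water, or histidine protonation).

```java
import java.util.HashMap;
import java.util.Map;

/** Donor/acceptor roles looked up by PDB residue and atom name (illustrative subset). */
final class HBondRoles {
    enum Role { DONOR, ACCEPTOR, BOTH, NONE }

    // Hypothetical, deliberately incomplete side-chain table.
    private static final Map<String, Role> SIDE_CHAIN = new HashMap<>();
    static {
        SIDE_CHAIN.put("SER:OG", Role.BOTH);      // hydroxyl
        SIDE_CHAIN.put("THR:OG1", Role.BOTH);     // hydroxyl
        SIDE_CHAIN.put("TYR:OH", Role.BOTH);      // hydroxyl
        SIDE_CHAIN.put("ASN:OD1", Role.ACCEPTOR); // amide carbonyl
        SIDE_CHAIN.put("ASN:ND2", Role.DONOR);    // amide NH2
        SIDE_CHAIN.put("GLN:OE1", Role.ACCEPTOR);
        SIDE_CHAIN.put("GLN:NE2", Role.DONOR);
        SIDE_CHAIN.put("LYS:NZ", Role.DONOR);
        SIDE_CHAIN.put("TRP:NE1", Role.DONOR);
    }

    /** Assumes 'residue' is a standard amino acid; other residues are not handled. */
    static Role roleOf(String residue, String atom) {
        if (atom.equals("O")) {
            return Role.ACCEPTOR;                 // backbone carbonyl oxygen
        }
        if (atom.equals("N")) {
            // Proline's backbone N carries no hydrogen inside a chain.
            return residue.equals("PRO") ? Role.NONE : Role.DONOR;
        }
        return SIDE_CHAIN.getOrDefault(residue + ":" + atom, Role.NONE);
    }
}
```

For example, `HBondRoles.roleOf("SER", "OG")` yields `BOTH`, while an unknown atom such as `roleOf("ALA", "CB")` falls through to `NONE`. Anything outside the table still needs real chemistry, which is exactly the open question in the message above.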

 [Jmol-developers] Re: [Jmol-users] hbonds in Jmol From: timothy driscoll - 2003-12-08 10:39
```
at 9.49a EDT on 2003 December 07 Sunday timothy driscoll said:
> hi Miguel,
>
>
> at 3.24p EDT on 2003 December 07 Sunday Miguel Howard said:
> > >> > - Given that there are no hydrogens in the file ... How do
> > >> > you calculate the 'angles' from the bonds that are present?
> > >>
> > >> Don't know that...
> > >>
> > > calculate the position of the hydrogen on the donor.
> > OK ... how?
> >
> working...
>
is this useful?

it is already a part of the cdk package from openscience, if that makes
a difference.

regards,

:tim

--
timothy driscoll
molvisions - molecular graphics & visualization
usa:north carolina:wake forest
```

 [Jmol-developers] Re: [Jmol-users] hbonds in Jmol From: Miguel Howard - 2003-12-08 11:24
```
> is this useful?
>
>
>
> it is already a part of the cdk package from openscience, if that
> makes a difference.

Actually, that makes a big difference.

Egon is one of the principals of CDK, and we will use CDK to get support
for more file types.

Egon or Christoph will have to comment on how effective this would be with
.pdb files that have neither hydrogens nor bond orders.

Miguel
```

 [Jmol-developers] Re: [Jmol-users] hbonds in Jmol From: timothy driscoll - 2003-12-08 11:25
```
this one might be helpful as well...

at 12.20p EDT on 2003 December 08 Monday Miguel Howard said:
> > is this useful?
> >
> >
> >
> > it is already a part of the cdk package from openscience, if that
> > makes a difference.
>
> Actually, that makes a big difference.
>
> Egon is one of the principals of CDK, and we will use CDK to get support
> for more file types.
>
> Egon or Christoph will have to comment on how effective this would be with
> .pdb files that have neither hydrogens nor bond orders.
>
>
> Miguel
```

 Re: [Jmol-developers] Re: [Jmol-users] hbonds in Jmol From: Peter Murray-Rust - 2003-12-08 18:40
```
At 06:25 08/12/2003 -0500, timothy driscoll wrote:

There are roughly three steps:

1. Determine precisely which atoms have hydrogens and which don't. If they
   are explicitly given, OK. If not I suggest this shouldn't be a Jmol
   function but a CDK plugin (e.g. addMissingH with CDK)

2. Determine their coordinates. If they are already given, fine, else I
   have written a routine in CDK that does that. The problem comes with
   X-O-H bonds which can rotate around the X-O bond. This is a non-trivial
   problem. Again I think that should be delegated to another tool

3. Calculate which atoms are H-bonded. In principle this is very difficult
   as people will argue about what is and isn't an HBond. There are
   conferences on this subject. I worked actively in this area ca. 20-25
   years ago. There *are* C-H...O bonds but they depend on the
   substituents of the C. There *are* bifurcated bonds. There are ...

There are two main strategies:
- write a tool to calculate H-bonds from geometry. This will solve 80%
  of the problems quickly and you'll spend the next 5 years chasing down
  the rest.
- borrow someone else's algorithms (or better provide more than one).

I suggest the latter if it's easy to do. I'd ask on CCL if anyone has an
opensource tool to calculate H-bonds and what are the most widely used
approaches.

P.

>this one might be helpful as well...
>
>
>at 12.20p EDT on 2003 December 08 Monday Miguel Howard said:
> > > is this useful?
> > >
> > > it is already a part of the cdk package from openscience, if that
> > > makes a difference.
> >
> > Actually, that makes a big difference.
> >
> > Egon is one of the principals of CDK, and we will use CDK to get support
> > for more file types.
> >
> > Egon or Christoph will have to comment on how effective this would be with
> > .pdb files that have neither hydrogens nor bond orders.
> >
> > Miguel
>
>-------------------------------------------------------
>This SF.net email is sponsored by: IBM Linux Tutorials.
>Become an expert in LINUX or just sharpen your skills. Sign up for IBM's
>Free Linux Tutorials. Learn everything from the bash shell to sys admin.
>Click now! http://ads.osdn.com/?ad_id=1278&alloc_id=3371&op=click
>_______________________________________________
>Jmol-developers mailing list
>Jmol-developers@...
>https://lists.sourceforge.net/lists/listinfo/jmol-developers

Peter Murray-Rust
Unilever Centre for Molecular Informatics
Chemistry Department, Cambridge University
Lensfield Road, CAMBRIDGE, CB2 1EW, UK
Tel: +44-1223-763069
```
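For the second step (determining hydrogen coordinates), the backbone amide hydrogen is one of the easier cases because it has no rotational freedom, unlike the X-O-H case Peter flags as non-trivial. The sketch below is not the CDK routine he mentions; it is a common geometric construction, assumed here for illustration, that places H in the plane of the previous residue's C, the N, and the CA, pointing away from both neighbours. The class name `AmideHydrogen` and the 1.01 angstrom N-H length are assumptions.

```java
/** Places a backbone amide hydrogen from heavy-atom positions (illustrative only). */
final class AmideHydrogen {
    // Typical N-H bond length in angstroms; an assumption, not a CDK constant.
    static final double N_H_BOND_LENGTH = 1.01;

    /** Unit vector pointing from 'from' to 'to'. */
    private static double[] unit(double[] from, double[] to) {
        double dx = to[0] - from[0], dy = to[1] - from[1], dz = to[2] - from[2];
        double len = Math.sqrt(dx * dx + dy * dy + dz * dz);
        if (len == 0.0) {
            throw new IllegalArgumentException("coincident atoms");
        }
        return new double[] { dx / len, dy / len, dz / len };
    }

    /**
     * Returns the estimated H position for nitrogen n, given the carbonyl
     * carbon of the previous residue (prevC) and the alpha carbon (ca).
     */
    static double[] place(double[] prevC, double[] n, double[] ca) {
        double[] a = unit(prevC, n); // direction away from the previous C
        double[] b = unit(ca, n);    // direction away from CA
        double[] s = { a[0] + b[0], a[1] + b[1], a[2] + b[2] };
        double len = Math.sqrt(s[0] * s[0] + s[1] * s[1] + s[2] * s[2]);
        if (len < 1e-9) {
            throw new IllegalArgumentException("C, N and CA are collinear");
        }
        return new double[] {
            n[0] + N_H_BOND_LENGTH * s[0] / len,
            n[1] + N_H_BOND_LENGTH * s[1] / len,
            n[2] + N_H_BOND_LENGTH * s[2] / len
        };
    }
}
```

This only covers the fixed-geometry backbone case. Hydroxyl hydrogens would need a search over the rotation around the X-O bond, which is the part suggested above for delegation to another tool.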

 Re: [Jmol-developers] Re: [Jmol-users] hbonds in Jmol From: Miguel Howard - 2003-12-08 19:42
```
Peter,

As you know, .pdb files normally don't contain either hydrogens or bond
order.

Therefore, when you see a C-O you don't know if it is a C=O or a C-O-H

Can the code currently in CDK deal with this?

> If not I suggest this shouldn't be a Jmol
> function but a CDK plugin (e.g. addMissingH with CDK)

In general, I agree.

RasMol/Chime includes some simple H-bond detection for a few special cases
(protein backbones & 'classic' DNA)

So we have been looking for something 'simple' to put in Jmol to support
existing scripts that use the 'hbonds on' command.

Any complicated hbond calculations should certainly happen in the context
of CDK.

> There are two main strategies:
> - write a tool to calculate H-bonds from geometry. This will solve 80%
> of the problems quickly and you'll spend the next 5 years chasing down
> the rest.
>
> - borrow someone else's algorithms (or better provide more than one).
>
> I suggest the latter if it's easy to do.

That sounds like the right approach.

> I'd ask on CCL if anyone has an
> opensource tool to calculate H-bonds and what are the most widely used
> approaches.

What is CCL?

Miguel
```

 Re: [Jmol-developers] Re: [Jmol-users] hbonds in Jmol From: Peter Murray-Rust - 2003-12-09 09:11
```
At 20:41 08/12/2003 +0100, Miguel Howard wrote:
>Peter,
>
>As you know, .pdb files normally don't contain either hydrogens or bond
>order.

Yes. NMR structures sometimes contain *some* hydrogens. XRay ones normally
don't.

>Therefore, when you see a C-O you don't know if it is a C=O or a C-O-H

Yes. It's an adventure

>Can the code currently in CDK deal with this?

In general only the author and god know what the actual structures are
(often only god). The protein backbone is usually OK. The side chain amino
acids/bases may/mayNot be protonated. For many ligands it is impossible to
tell what the stuff is at all, let alone where the H atoms are.

This is over drastic for many structures but the real work should be done
by data depositors and not viewing programs. And better formats.

> > If not I suggest this shouldn't be a Jmol
> > function but a CDK plugin (e.g. addMissingH with CDK)
>In general, I agree.
>
>RasMol/Chime includes some simple H-bond detection for a few special cases
>(protein backbones & 'classic' DNA)
>
>So we have been looking for something 'simple' to put in Jmol to support
>existing scripts that use the 'hbonds on' command.
>
>Any complicated hbond calculations should certainly happen in the context
>of CDK.

Or elsewhere.

> > There are two main strategies:
> > - write a tool to calculate H-bonds from geometry. This will solve 80%
> > of the problems quickly and you'll spend the next 5 years chasing down
> > the rest.
> >
> > - borrow someone else's algorithms (or better provide more than one).
> >
> > I suggest the latter if it's easy to do.
>That sounds like the right approach.
>
> > I'd ask on CCL if anyone has an
> > opensource tool to calculate H-bonds and what are the most widely used
> > approaches.
>What is CCL?

Computational Chemistry List ccl.net. Ask something like:

What programs are there which can either/or
(a) add missing hydrogens in PDB files
(b) calculate HBonds.

and offer to post the replies to the list.

P.

>Miguel

Peter Murray-Rust
Unilever Centre for Molecular Informatics
Chemistry Department, Cambridge University
Lensfield Road, CAMBRIDGE, CB2 1EW, UK
Tel: +44-1223-763069
```

 Re: [Jmol-developers] Re: [Jmol-users] hbonds in Jmol From: Miguel Howard - 2003-12-09 09:20
```
> And better formats.

Formats like ... CML for example ... :-)

Miguel
```

 Re: [Jmol-developers] Re: [Jmol-users] hbonds in Jmol From: Peter Murray-Rust - 2003-12-09 12:13
```
At 10:20 09/12/2003 +0100, Miguel Howard wrote:
> > And better formats.
>
>Formats like ... CML for example ... :-)

The crystallographic community has designed mmCIF for this purpose. CML can
be used for proteins but some of its support is limited although I am
getting requests to add PDB-like fields.

P.

>Miguel

Peter Murray-Rust
Unilever Centre for Molecular Informatics
Chemistry Department, Cambridge University
Lensfield Road, CAMBRIDGE, CB2 1EW, UK
Tel: +44-1223-763069
```

 [Jmol-developers] Re: [Jmol-users] hbonds in Jmol From: timothy driscoll - 2003-12-09 13:06
```
at 10.40a EDT on 2003 December 09 Tuesday Peter Murray-Rust said:
> At 10:20 09/12/2003 +0100, Miguel Howard wrote:
> > > And better formats.
> >
> >Formats like ... CML for example ... :-)
>
> The crystallographic community has designed mmCIF for this purpose. CML can
> be used for proteins but some of its support is limited although I am
> getting requests to add PDB-like fields.
>
> P.

CML or mmCIF would help address the current question of hydrogen bond
implementation as well. the ideal location for atomic data like
occupancy, valence, etc. is in the structure file; the viewer software
should not have to do this. (I believe someone else, maybe Peter, said
this earlier so I just want to agree.) hydrogen bond calculation depends
on data like this.

when you talk about offloading hydrogen bond calculation to a plugin, I
am forced to agree. given the complexity, and the other calculations
required, maybe we can revert to a simple backbone hydrogen bonding
algorithm for Jmol, with the plugin to add hydrogens and calculate
potential hydrogen bonds.

in any case, what file formats does Jmol plan to support?

regards,

:tim

--
timothy driscoll
molvisions - molecular graphics & visualization
usa:north carolina:wake forest
```

 [Jmol-developers] Jmol file formats From: Miguel Howard - 2003-12-09 14:07
```
> in any case, what file formats does Jmol plan to support?

Tim,

The following is a collection of random statements about the current
status of Jmol/CDK file support and integration:

The development version of Jmol has an interface called the
'JmolModelAdapter'. This interface separates the file IO from the Jmol
data structures that are used for display. The idea is to allow one to
run Jmol on top of any Java molecular modeling package. Currently there
are two implementations, the CdkJmolModelAdapter and the
SimpleModelAdapter.

The Jmol beta test code that you have been running is using the
'SimpleModelAdapter', which only supports .xyz, .mol, and .pdb files. It
is not using CDK.

The currently released version of Jmol supports 15 or 20 file formats.
mmCIF is one of them. However, in some cases it reads only a subset of
the data in the files. For example, the CDK/Jmol code for .pdb files only
reads the ATOM and HETATM records.

There are some performance problems with CDK io. Again taking .pdb as an
example, reading hemoglobin takes over 8 seconds on my Linux box. (The
simple model adapter takes .75 seconds.) I have not taken a look at the
CDK code for mmCIF, but I suspect it will have the same issues.

In addition, there is an overall problem of size. Including CDK in the
JmolApplet makes the size go from +/- 320K to +/- 700K. I have tried to
come up with a clean mechanism to split out the CDK file support into a
separate .jar file that would be downloaded only if needed, but to date
have not been terribly successful.

We need to get these issues worked out, but recently I have been focusing
on other functionality within Jmol.

Miguel
```
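The adapter idea described above, where file IO is kept behind an interface so the display code never sees the parser, might look roughly like the following. This is not the real JmolModelAdapter API: the interface `ModelSource`, its three methods, and the `XyzModelSource` class are hypothetical names invented for this sketch, and the parser handles only a single well-formed .xyz frame (atom count, comment line, then one `element x y z` line per atom).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an adapter interface; NOT the actual JmolModelAdapter. */
interface ModelSource {
    int atomCount();
    String element(int index);
    double[] position(int index); // {x, y, z} in angstroms
}

/** Minimal .xyz-backed implementation, assuming a single well-formed frame. */
final class XyzModelSource implements ModelSource {
    private final List<String> elements = new ArrayList<>();
    private final List<double[]> positions = new ArrayList<>();

    XyzModelSource(String xyzText) {
        String[] lines = xyzText.split("\\R");
        int count = Integer.parseInt(lines[0].trim()); // line 1: atom count
        // Line 2 is a free-form comment; atom records start at line 3.
        for (int i = 0; i < count; i++) {
            String[] fields = lines[i + 2].trim().split("\\s+");
            elements.add(fields[0]);
            positions.add(new double[] {
                Double.parseDouble(fields[1]),
                Double.parseDouble(fields[2]),
                Double.parseDouble(fields[3])
            });
        }
    }

    @Override public int atomCount() { return elements.size(); }
    @Override public String element(int index) { return elements.get(index); }
    @Override public double[] position(int index) { return positions.get(index); }
}
```

Display code written against `ModelSource` would not care whether the atoms came from this small parser or from a CDK-backed implementation, which is the separation the message describes; a heavier implementation could then live in its own .jar.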

 Re: [Jmol-developers] Jmol file formats From: timothy driscoll - 2003-12-09 14:18
```
at 3.06p EDT on 2003 December 09 Tuesday Miguel Howard said:
> > in any case, what file formats does Jmol plan to support?
> Tim,
>
> The following is a collection of random statements about the current
> status of Jmol/CDK file support and integration:
>
[...]
>
> We need to get these issues worked out, but recently I have been focusing
> on other functionality within Jmol.
>

hi Miguel,

thanks for the summary. I agree there are other, more pressing issues
for Jmol at the moment.

regarding hydrogen bonding, I think initial hbonding could be
implemented based on a Rasmolian style algorithm - limited to protein
backbone and nucleic acid bases. a more useful hbonding protocol would
involve either:

1. a more useful file format, like mmCIF or CML, that includes valence
   and occupancy data, plus a plug-in to calculate hydrogen positions.

2. calculating all of the data from (1) within Jmol itself.

I think (1) is better. ;-)

regards,

:tim

--
timothy driscoll
molvisions - molecular graphics & visualization
usa:north carolina:wake forest
```