From: Peter Murray-R. <pm...@ca...> - 2005-03-07 23:07:04
At 17:51 07/03/2005 -0500, Geoff Hutchison wrote:
>Dear Peter,
>
>I've been working on testing the "new framework" which has much cleaner
>support for reading multiple molecules in a single file. In particular,
>last weekend I thought the roundtrip testing had uncovered some bug which
>was making it run an infinite loop or similar crash.

Fantastic - we were just discussing today what we wanted. Currently we cannot use CML with 2D coords (but see below) and wanted to know where the framework was at.

>No, instead it was just a result of very slow CML performance. With a
>little debugging the problem was in endMolecule:
>
>bool endMolecule() {
>    generateInternals();
>    InternalToCartesian(internalVector, *molPtr);
>
>    molPtr->EndModify();
>
>    molPtr->ConnectTheDots();
>
>    if (outputDebug) {
>        // debug(cout);
>    }
>    return true; // [ejk] assumed
>}

I have no idea who wrote this (although I take responsibility). I suspect it was copied from some exemplar elsewhere, but it could have come from outer space. OTOH I don't recognise "[ejk]".

>OK... so if I understand this bit, for *every* CML molecule read in, you
>generate internal z-matrix coordinates, and then convert back to
>cartesians? So even if I read in normal 2D or 3D cartesians, I do this
>anyway. This seems like a waste of effort and a potential loss of accuracy
>-- why not just use the cartesians you read and only worry about
>InternalToCartesian if there are only internal coordinates available?

Why indeed :-)

>Secondly, you run ConnectTheDots (which generates single-bond
>connectivity) on all molecules, even if there's bonding information in the
>CML file (e.g., bond or bondArray).
>
>Again, this seems like a waste of effort (and thus decreases performance
>reading CML files) -- is there any particular reason for this?

I suspect the whole lot could be yanked. The CML reflects what the author put in, and if there are bonds, then keep them as is, else don't calculate them. This may also fix our bug.
There *is* a general problem which we have discussed on the list - if someone reads a format without some info and wants to add it, where should it be done? My feeling is in the Core of OB, but others suggested it should be in the input routine.

Please hack the code in whatever way seems reasonable!

P.

>Cheers,
>-Geoff

Peter Murray-Rust
Unilever Centre for Molecular Informatics
Chemistry Department, Cambridge University
Lensfield Road, CAMBRIDGE, CB2 1EW, UK
Tel: +44-1223-763069
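
The change discussed in this message can be sketched roughly as follows: convert from internal (z-matrix) coordinates only when no cartesians were read, and leave connectivity exactly as the file gave it. This is a minimal illustration, not the actual Open Babel CML reader. `ParsedMolecule`, `endMoleculeSketch`, `convertInternalsToCartesians`, `perceiveBonds` and the opt-in `perceiveMissingBonds` flag are all hypothetical stand-ins, and the two counters exist only so the behaviour can be observed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT Open Babel types.
struct Bond {
    std::size_t begin;
    std::size_t end;
    int order;
};

struct ParsedMolecule {
    std::vector<Bond> bonds;      // bonds exactly as read from the CML file
    bool hasCartesians = false;   // 2D or 3D cartesian coordinates were read
    bool hasInternals = false;    // z-matrix (internal) coordinates were read
    int internalConversions = 0;  // demo counter: internal -> cartesian runs
    int bondPerceptionRuns = 0;   // demo counter: connectivity perception runs
};

// Placeholder for the expensive z-matrix -> cartesian conversion (assumed).
void convertInternalsToCartesians(ParsedMolecule& mol) {
    ++mol.internalConversions;
    mol.hasCartesians = true;
}

// Placeholder for distance-based single-bond perception (assumed).
void perceiveBonds(ParsedMolecule& mol) {
    ++mol.bondPerceptionRuns;
}

// Sketch of a leaner end-of-molecule hook.
// perceiveMissingBonds is a hypothetical opt-in; the default keeps the
// molecule exactly as the author of the file wrote it.
bool endMoleculeSketch(ParsedMolecule& mol, bool perceiveMissingBonds = false) {
    // Use cartesians as read; convert only if internals are all we have.
    if (!mol.hasCartesians && mol.hasInternals) {
        convertInternalsToCartesians(mol);
    }
    // Never overwrite bonds from the file; perceive only when explicitly asked
    // and only when the file supplied none.
    if (perceiveMissingBonds && mol.bonds.empty()) {
        perceiveBonds(mol);
    }
    return true;
}
```

With this shape, a file that already carries cartesians and a bondArray passes through with no extra work. Whether missing bonds should ever be added by the reader, rather than by the core library, is the open question raised in the message, which is why it appears here only as an explicit opt-in.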