From: Daniel Z. <zah...@dt...> - 2008-05-02 12:35:29
On May 2, 2008, at 7:06 AM, Daniel Zaharevitz wrote:

>
> On May 2, 2008, at 1:34 AM, Rajarshi Guha wrote:
>>
>> - - add missing elements to atom type perception code
>
> I'm not sure if I sent you the final list, but I used the version
> from a few months ago to run through all our open compounds (~270K)
> and check to see if the computer representation of the structure
> matched with the listed molecular formula. This resulted in about
> 16K failures due to unparameterized atom. Most of these were
> various heavy atoms, but there were some N, O, C type failures as
> well. I can certainly send you that error set, but it might be more
> useful to rerun with the current version. Let me know what you
> would find useful.
>
>

I just realized that it is easy to get to the unparameterized atoms.
I put the comparison result as a comment in the update we sent to
PubChem a few months ago. If you search in PubChem Substances for
"Unparameterized Atom" with limits of Source = DTP/NCI you will get
the compounds that failed in the first pass.

DanZ

/********************************************
Daniel Zaharevitz
Chief, Information Technology Branch
National Cancer Institute
zah...@dt...
********************************************/