From: Daniel Z. <zah...@dt...> - 2008-05-02 13:17:33
On May 2, 2008, at 9:12 AM, Rajarshi Guha wrote:

> On May 2, 2008, at 8:35 AM, Daniel Zaharevitz wrote:
>>
>> On May 2, 2008, at 7:06 AM, Daniel Zaharevitz wrote:
>>
>>>
>>> On May 2, 2008, at 1:34 AM, Rajarshi Guha wrote:
>>>>
>>>> - add missing elements to atom type perception code
>>>
>>> I'm not sure if I sent you the final list, but I used the version
>>> from a few months ago to run through all our open compounds
>>> (~270K) and check to see if the computer representation of the
>>> structure matched with the listed molecular formula. This
>>> resulted in about 16K failures due to unparameterized atom. Most
>>> of these were various heavy atoms, but there were some N, O, C
>>> type failures as well. I can certainly send you that error set,
>>> but it might be more useful to rerun with the current version.
>>> Let me know what you would find useful.
>>>
>>
>> I just realized that it is easy to get to the unparameterized
>> atoms. I put the comparison result as a comment in the update we
>> sent to PubChem a few months ago. If you search in PubChem
>> Substances for "Unparameterized Atom" with limits of
>> Source = DTP/NCI you will get the compounds that failed in the
>> first pass.
>
> Can you send the query string to pull up those entries on PubChem?
> just can't get it to recognize the word 'unparametrized'
>

This is cut and pasted from the text field. The quote marks are
included in the query:

"Unparameterized Atom"

It gets 15970 hits.

BTW, do you know of a way to share PubChem query results?

DanZ

/********************************************
Daniel Zaharevitz
Chief, Information Technology Branch
National Cancer Institute
zah...@dt...
********************************************/
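
The search described above (the quoted phrase, limited to Source = DTP/NCI) could also be written as a URL, which is one possible way to pass a query to someone else. The following is a minimal sketch, not a tested recipe: it assumes the NCBI E-utilities `esearch` endpoint with the `pcsubstance` database, and it assumes that `[SourceName]` is the right field tag for the depositor limit. The helper names are made up for illustration, and the sketch only builds the URL; it does not contact PubChem.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# NCBI E-utilities search endpoint (assumed to be the right entry point here).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_substance_query(phrase, source):
    """Build an Entrez-style term: a quoted phrase limited to one depositor.

    The [SourceName] field tag is an assumption; check it against the
    PubChem Substance search fields before relying on it.
    """
    return f'"{phrase}" AND "{source}"[SourceName]'


def build_esearch_url(phrase, source, retmax=20):
    """Return a URL that encodes the query against PubChem Substance."""
    params = {
        "db": "pcsubstance",  # PubChem Substance database name in Entrez
        "term": build_substance_query(phrase, source),
        "retmax": retmax,     # number of substance IDs to return
    }
    return f"{ESEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_esearch_url("Unparameterized Atom", "DTP/NCI"))
```

Because the quote marks are part of the term, they are kept inside the query string and percent-encoded by `urlencode`, so the phrase search survives being pasted into a mail message.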