From: <ern...@ba...> - 2009-03-13 14:48:32
I already use fingerprints. :-)

The problem arises afterwards. Let's say a substructure search for 'c1ccccc1F' yields 1810 hits from the screening stage: 1805 are valid, 5 are false positives. To find those false positives, all candidates have to be checked again against the SMARTSPattern. The main performance killer there is the instantiation of each target molecule from a textual representation.

By changing from SMILES to V2000 molfile alone, I gained about 30% more throughput (I suppose because V2000 has explicit bonds). I further suppose that saving an OBMol representation that already has the ring set, aromaticity and whatnot perceived at registration time, and just reading this back into memory, should be even faster.

Best Regards

Ernst-Georg Schmid
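To make the two stages concrete, here is a minimal toy sketch of "screen by fingerprint, then verify every candidate". It is not Open Babel code: the fingerprint is a made-up character-pair bitmask over strings, plain substring search stands in for the SMARTS match, and the names `fingerprint`, `screen`, `verify`, `load` and `matches` are all hypothetical. The only point is that `load` runs once per screened candidate, which is where the cost of parsing a textual representation would land.

```python
# Toy two-stage substructure search: fingerprint screen, then exact check.
# ASSUMPTION: everything here is an illustrative stand-in, not a real
# cheminformatics API. "Molecules" are plain strings, and the exact check
# is substring search rather than SMARTS matching.

def fingerprint(text, nbits=64):
    """Set one bit per adjacent character pair, hashed deterministically."""
    fp = 0
    for a, b in zip(text, text[1:]):
        fp |= 1 << ((ord(a) * 31 + ord(b)) % nbits)
    return fp

def screen(query_fp, index):
    """Keep keys whose stored fingerprint contains every bit of the query's.

    This can let false positives through (bit collisions), but never drops
    a true hit, because a contained query contributes all of its bits.
    """
    return [key for key, fp in index.items() if (fp & query_fp) == query_fp]

def verify(query, candidates, load, matches):
    """Run the expensive exact check on the screened candidates only.

    `load` turns a key into a target object; in a real system this is the
    step that parses (or reads back a preperceived form of) each molecule.
    """
    return [key for key in candidates if matches(query, load(key))]
```

A usage example with a hypothetical four-entry database:

```python
db = {"a": "c1ccccc1F", "b": "CCO", "c": "Fc1ccccc1", "d": "OCc1ccccc1F"}
index = {key: fingerprint(text) for key, text in db.items()}  # built at registration time

query = "c1ccccc1F"
candidates = screen(fingerprint(query), index)
hits = verify(query, candidates, db.__getitem__, lambda q, t: q in t)
```

Swapping a cheaper `load` into `verify` (for example, one that reads back an already perceived in-memory form instead of parsing text) changes nothing else in the pipeline, which is why that step is the natural place to optimize.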