From: <ern...@ba...> - 2009-03-16 14:05:57
> On the other hand, if you're scanning through an entire database of molecules, then this strategy won't help. In that case, the serialization methods you're talking about would be a benefit ... but there is no such method.

If I can identify the main bottleneck in the instantiation of the OBMol, and some precomputed data that speeds this up can be serialized as ancillary data along with the Molfile, then full serialization may not be needed.

> In my experience, though, parsing a MOL file is not the bottleneck, it's the SMARTS matching that is slow. Serializing would help somewhat, but the OB SMARTS pattern matcher is overdue for some serious optimizing.

Unfortunately, I'm out of my league with the internals of the matcher, so I'll stick to fast instantiation for the moment. :-)

Timings for my sample search:

Full search: 1844 ms
Full search without real match: 625 ms
Full search without real read and real match, only returning true: 94 ms

So the match eats about 2/3 of the time, BUT reading the molecule takes the majority of the rest. If I can seriously speed this up, another ~25% will have been shaved off.

I have taken a brief look at mdlformat.cpp. Aside from plain text parsing, I found two calls that might be of interest: mol.AssignSpinMultiplicity(); and the _mapcd stuff. I think I now have to start commenting out code and testing to find interesting candidates. :-)

Best regards,
Ernst-Georg Schmid
_________________________________________
BBS-S&T-APS-AD
Bayer Business Services GmbH
Building 4810
51368 Leverkusen, Germany
Tel: +49 214 30 50250
Fax: +49 214 30 22999
E-Mail: ern...@ba...
Web: http://www.BayerBBS.com

Management: Chairman Daniel Hartert | Labor Director Norbert Fieseler
Chairman of the Supervisory Board: Klaus Kühn
Registered office: Leverkusen | Cologne District Court (Amtsgericht Köln), HRB 49895
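
For reference, the breakdown behind the three timings in this message can be reproduced by subtracting each reduced run from the one before it. This is only a sketch of that arithmetic: the function name `breakdown` and its parameter names are made up for illustration, and it assumes the three runs differ only in the step that was stubbed out.

```python
def breakdown(full_ms, no_match_ms, stub_ms):
    """Split a total search time into match, read, and remaining overhead.

    full_ms      -- full search (read + match)
    no_match_ms  -- same search with the real match skipped
    stub_ms      -- same search with read and match skipped (returns true)
    """
    match_ms = full_ms - no_match_ms   # time that disappears without the match
    read_ms = no_match_ms - stub_ms    # time that disappears without the read
    parts = {"match": match_ms, "read": read_ms, "overhead": stub_ms}
    # Map each part to (milliseconds, percentage of the full search).
    return {name: (ms, 100.0 * ms / full_ms) for name, ms in parts.items()}


if __name__ == "__main__":
    # Timings quoted in the message above.
    for name, (ms, pct) in breakdown(1844, 625, 94).items():
        print(f"{name:9s}{ms:6d} ms {pct:5.1f} %")
```

With the quoted numbers this gives 1219 ms (about 66 %) for the match, 531 ms (about 29 %) for the read, and 94 ms (about 5 %) of remaining overhead, which agrees with "about 2/3" for the match and with the read taking most of the rest.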