From: Egon W. <e.w...@sc...> - 2006-06-17 13:47:48
|
Hi all,

I have been profiling the PDBReader and found that the major problem is the contains() check in addAtom() in AtomContainer, Strand and BioPolymer. Taking these checks out brings the running time of the PDBReaderTest down from 27 sec to 11 sec.

Do we really need this check? By policy, we leave a lot of checking to the user in other places too...

Egon

--
e.w...@sc...
Cologne University Bioinformatics Center (CUBIC)
Blog: http://chem-bla-ics.blogspot.com/
GPG: 1024D/D6336BA6
|
From: Rajarshi G. <rx...@ps...> - 2006-06-17 14:02:22
|
On Sat, 2006-06-17 at 15:46 +0200, Egon Willighagen wrote:
> Hi all,
>
> I have been profiling the PDBReader, I found that the major problem is the
> contains() check in addAtom() in AtomContainer, Strand and BioPolymer. Taking
> these out, brings down running the PDBReaderTest from 27 sec to 11 sec.
>
> Do we really need this check? We leave a lot of checking to the user in other
> places by policy too...

It is a useful utility method. Why not keep it in there and note it as an expensive method?

-------------------------------------------------------------------
Rajarshi Guha <rx...@ps...> <http://jijo.cjb.net>
GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04 06F7 1BB9 E634 9B87 56EE
-------------------------------------------------------------------
"whois awk?", sed Grep.
|
From: Egon W. <e.w...@sc...> - 2006-06-17 14:07:26
|
On Saturday 17 June 2006 16:02, Rajarshi Guha wrote:
> On Sat, 2006-06-17 at 15:46 +0200, Egon Willighagen wrote:
> > Do we really need this check? We leave a lot of checking to the user in
> > other places by policy too...
>
> It is a useful utility method. Why not keep it in there and note it as
> an expensive method?

Mark the addAtom() as expensive??

Egon

--
e.w...@sc...
Cologne University Bioinformatics Center (CUBIC)
Blog: http://chem-bla-ics.blogspot.com/
GPG: 1024D/D6336BA6
|
From: Rajarshi G. <rx...@ps...> - 2006-06-17 14:12:31
|
On Sat, 2006-06-17 at 16:05 +0200, Egon Willighagen wrote:
> On Saturday 17 June 2006 16:02, Rajarshi Guha wrote:
> > On Sat, 2006-06-17 at 15:46 +0200, Egon Willighagen wrote:
> > > Do we really need this check? We leave a lot of checking to the user in
> > > other places by policy too...
> >
> > It is a useful utility method. Why not keep it in there and note it as
> > an expensive method?
>
> Mark the addAtom() as expensive??

Sorry I misread the original post.

What I wanted to say is that performing the check could be made optional and if specified (via a boolean for example) it would result in addAtom being an expensive (in terms of time) method.

-------------------------------------------------------------------
Rajarshi Guha <rx...@ps...> <http://jijo.cjb.net>
GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04 06F7 1BB9 E634 9B87 56EE
-------------------------------------------------------------------
All life evolves by the differential survival of replicating entities.
-- Dawkins
|
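The optional check suggested above could look roughly like the following. This is only a sketch under stated assumptions: SketchAtomContainer is a hypothetical stand-in, not the CDK AtomContainer, atoms are plain Objects, and the two-argument addAtom(atom, checkDuplicate) overload is invented here to illustrate the "via a boolean" idea.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an atom container; NOT the CDK class. */
class SketchAtomContainer {
    private final List<Object> atoms = new ArrayList<>();

    /** Linear scan by identity, so the cost grows with the number of atoms. */
    boolean contains(Object atom) {
        for (Object existing : atoms) {
            if (existing == atom) {
                return true;
            }
        }
        return false;
    }

    /** Fast path: no duplicate check, the caller is responsible. */
    void addAtom(Object atom) {
        atoms.add(atom);
    }

    /**
     * Assumed overload: the duplicate check runs only when requested,
     * which makes this call the expensive variant.
     */
    void addAtom(Object atom, boolean checkDuplicate) {
        if (checkDuplicate && contains(atom)) {
            return; // already present, skip the add
        }
        atoms.add(atom);
    }

    int getAtomCount() {
        return atoms.size();
    }
}
```

With this shape, a reader such as the PDBReader could call the unchecked addAtom(atom), while code that cannot rule out duplicates would pay for the check explicitly.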
From: Christoph S. <c.s...@un...> - 2006-06-17 17:28:27
|
Rajarshi Guha wrote:
> On Sat, 2006-06-17 at 16:05 +0200, Egon Willighagen wrote:
>> On Saturday 17 June 2006 16:02, Rajarshi Guha wrote:
>>> On Sat, 2006-06-17 at 15:46 +0200, Egon Willighagen wrote:
>>>> Do we really need this check? We leave a lot of checking to the user in
>>>> other places by policy too...
>>> It is a useful utility method. Why not keep it in there and note it as
>>> an expensive method?
>> Mark the addAtom() as expensive??
>
> Sorry I misread the original post.
>
> What I wanted to say is that performing the check could be made optional
> and if specified (via a boolean for example) it would result in addAtom
> being an expensive (in terms of time) method.

I don't think that we need to check this in the PDBReader.
After all, if I interpret the situation correctly, this would only happen in case of a corrupt PDB file, and we do reading here and not validation.
I would take it out.

Cheers,

Chris

--
Priv. Doz. Dr. Christoph Steinbeck (c.s...@un...)
Head of the Research Group for Molecular Informatics
Cologne University BioInformatics Center (http://almost.cubic.uni-koeln.de)
Zülpicher Str. 47, 50674 Cologne
Tel: +49(0)221-470-7426 Fax: +49 (0) 221-470-7786

What is man but that lofty spirit - that sense of enterprise.
... Kirk, "I, Mudd," stardate 4513.3..
|
From: Egon W. <e.w...@sc...> - 2006-06-17 17:33:46
|
On Saturday 17 June 2006 18:14, Christoph Steinbeck wrote:
> I don't think that we need to check this in the PDBReader.
> After all, if I interpret the situation correctly, this would only happen
> in case of a corrupt PDB file, and we do reading here and not validation. I
> would take it out.

This testing is not part of the PDBReader... it's part of the default data classes; it has nothing to do with (in)valid data files... Currently, contains() is always called whenever addAtom() is called...

Egon

--
e.w...@sc...
Cologne University Bioinformatics Center (CUBIC)
Blog: http://chem-bla-ics.blogspot.com/
GPG: 1024D/D6336BA6
|
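To illustrate why a contains() call on every addAtom() can dominate when a reader builds a large container, here is a small counting sketch. It assumes contains() is a linear scan over the atoms added so far; the thread only says the check is expensive, so that is an assumption, and ContainsCostDemo and comparisonsWithCheck are hypothetical names, not CDK code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical demo class; NOT part of CDK. */
class ContainsCostDemo {

    /**
     * Adds n distinct objects, scanning the list before each add the way an
     * always-on contains() check would. Returns the number of identity
     * comparisons performed in total.
     */
    static long comparisonsWithCheck(int n) {
        List<Object> atoms = new ArrayList<>();
        long comparisons = 0;
        for (int i = 0; i < n; i++) {
            Object atom = new Object();
            boolean found = false;
            for (Object existing : atoms) {
                comparisons++;
                if (existing == atom) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                atoms.add(atom);
            }
        }
        return comparisons;
    }

    public static void main(String[] args) {
        // Each add scans everything added before it: 0 + 1 + ... + (n - 1).
        System.out.println(comparisonsWithCheck(1000));  // prints 499500
        System.out.println(comparisonsWithCheck(10000)); // prints 49995000
    }
}
```

Under that assumption the total work grows as n(n-1)/2, so ten times as many atoms means roughly a hundred times as many comparisons, whereas unchecked adds grow linearly.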