From: Karol L. <kar...@kn...> - 2007-02-02 15:43:43
On Friday 02 of February 2007 12:57, you wrote:
> On 02/02/07, Karol Langner <kar...@kn...> wrote:
> > Thanks... it might be worthwhile to move some more things to
> > LogFile.parse, but I need your input on these.
> >
> > 1) The stuff before the loop (inputfile.seek, initialize self.progress,
> > etc.). For this, inputfile, nstep, and oldstep would need to be
> > attributes of the class. However, since they are temporary (parsing is
> > never aborted before it's finished, I think), this may not be a good
> > idea.
>
> Do they have to be attributes of the class? Can they be local
> variables in parse() which are passed into extract() (or maybe this is
> too messy)? Alternatively, they can be 'weak' attributes, with names
> like self._nstep. AFAIK, this is a convention to indicate 'private'
> variables of a class, although it doesn't do much else.

Yes, probably better to pass them as arguments, for now at least.

> > 2) The code that uses cupdate, fupdate, nstep, and oldstep (condition "if
> > self.progress and random.random() < fupdate") is repeated multiple times,
> > but the same question arises as the above.
>
> Well, I think that this code can be moved into a function.

Is there a difference in the meaning of the numbers cupdate and fupdate,
beyond what I can see in the code?

> > Could you perhaps explain the full purpose of LogFile.progress?
>
> Its main purpose is for GUI applications to be able to display a
> progress bar showing how near to completion the parsing is. This is
> because parsing can take more than 20 seconds for large files
> containing population data (which is what both myself and Adam were
> interested in). You should try out PyMOlyze to see this in action.
> There is some cost in seconds in including the progress code, but
> blindingly fast parsing is not the main goal of cclib. For instance,
> it is possible to rewrite our multiple 'if' statements in other forms
> that would parse large files quicker. If you care about this, it might
> be worth thinking about, bearing in mind that "premature optimization
> is the root of all evil" or something.

I don't care about optimization a bit, at least at this point. I do hope,
though, that some extent of refactoring will make writing new parsers and
extending the present ones a little easier. I'm going to try out PyMOlyze
and GaussSum soon, when I find a bit more free time.

> > Maybe there
> > would be some advantages in making the logfile file object an attribute
> > of LogFile (as in self.inputfile)?
>
> IMO the fewer attributes the better. Since it's a 'derived attribute'
> (i.e. it can be recreated from self.filename), and is only used within
> extract(), I don't see how it can be useful. If you can think of a
> useful reason for doing this though, go ahead.

No, I don't :)

Karol

--
written by Karol Langner
Fri Feb 2 16:29:18 CET 2007
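
For reference, the two ideas agreed on in this message (keeping inputfile and nstep as local variables in parse() that are passed into extract(), and moving the repeated progress check into a single function) might look roughly like the sketch below. This is only an illustration and not the actual cclib code: the class name SketchLogFile, the helper update_progress, the (step, nstep) callback signature, and parsing from an in-memory string are all assumptions made for the example. fupdate is treated as a plain probability threshold, as in the quoted condition; cupdate is left out because its meaning is not stated in this thread.

```python
import io
import random


class SketchLogFile:
    """Hypothetical stand-in for a parser class; not cclib's real LogFile."""

    def __init__(self, text, progress=None, fupdate=0.05):
        self.text = text          # assumption: parse an in-memory string for the demo
        self.progress = progress  # assumption: any callable taking (step, nstep)
        self.fupdate = fupdate    # assumption: chance of reporting on a given line

    def parse(self):
        # Temporary state stays in local variables instead of attributes.
        inputfile = io.StringIO(self.text)
        inputfile.seek(0, 2)      # jump to the end to measure the total size
        nstep = inputfile.tell()
        inputfile.seek(0)
        return self.extract(inputfile, nstep)

    def update_progress(self, inputfile, nstep, oldstep):
        """Single home for the repeated check; returns the new oldstep."""
        if self.progress and random.random() < self.fupdate:
            step = inputfile.tell()
            if step != oldstep:
                self.progress(step, nstep)
                return step
        return oldstep

    def extract(self, inputfile, nstep):
        oldstep = 0
        lines = []
        # readline() rather than file iteration, so tell() stays usable
        for line in iter(inputfile.readline, ""):
            oldstep = self.update_progress(inputfile, nstep, oldstep)
            lines.append(line.rstrip("\n"))
        return lines
```

With this layout a GUI would only need to pass its own callback as progress, and nothing temporary is left on the object once parse() returns.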