From: Karol L. <kar...@kn...> - 2007-05-02 21:50:03
Some thoughts about more refactoring of the parser...

If you take a look at the parsers after the recent refactoring, it is now more evident that they are quite inefficient. That isn't a problem, since cclib isn't about efficiency, but a faster parser would still be nice. For example, even something as simple as putting a 'return' statement at the end of each parsing block would speed things up (the conditions that follow would then not be evaluated).

Anyway, this already suggests that it would be useful to break up the extract() method into pieces, one for each block of parsed output. I've been hovering around this subject for some time, turning it around in my mind. A dictionary of functions seems appropriate (with regexps or something similar as keys), and easier to manage than the current "long" function. I don't think we can do away with the functions, since sometimes pretty complicated operations are done with the parsed output.

The problem I see is where to define all these functions (30-40 separate parsed blocks). How about this: the functions would be defined in a different class, not LogFile. What I'm suggesting is to separate the class that represents the parser from the class that represents a parsed log file. Currently, they are one. An instance of the parser class would be an attribute of the log file class, say "_parser". This object would hold all the parsing functions, a dict used by the parse() method of LogFile, and any other stuff needed for parsing. An additional advantage is that the parser becomes less visible to the casual user, leaving only parsed attributes in the log file object.

Summarizing, I propose two layers of classes:

LogFile - subclasses Gaussian, GAMESS, ...
LogFileParser - subclasses GaussianParser, GAMESSParser, ...

The first remains as is (at least for the user), except that everything related to parsing is put in the second. Of course, instances of the latter class should be attributes of the instances of the former.
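To make the idea a bit more concrete, here is a rough sketch of how the two layers might fit together. It is only an illustration, not existing cclib code: the `handlers` dict, the `parse_line()` method, the `parser_class` attribute, the two regexps, and the sample log lines are all assumptions made up for this example, and unit conversion of the parsed values is left out.

```python
import re


class LogFileParser:
    """Hypothetical base class holding the parsing functions, one per block."""

    def __init__(self):
        # Maps a compiled regexp to the function handling lines that match it.
        self.handlers = {}

    def parse_line(self, line, data):
        for pattern, handler in self.handlers.items():
            match = pattern.search(line)
            if match:
                handler(match, data)
                # Early return: the remaining patterns are not evaluated.
                return True
        return False


class GaussianParser(LogFileParser):
    """Hypothetical parser with two illustrative parsing blocks."""

    def __init__(self):
        super().__init__()
        self.handlers = {
            re.compile(r"NAtoms=\s*(\d+)"): self.parse_natom,
            re.compile(r"SCF Done:\s+E\(\w+\)\s*=\s*(-?\d+\.\d+)"): self.parse_scfenergy,
        }

    def parse_natom(self, match, data):
        data["natom"] = int(match.group(1))

    def parse_scfenergy(self, match, data):
        # Stored as read from the line; no unit conversion in this sketch.
        data.setdefault("scfenergies", []).append(float(match.group(1)))


class LogFile:
    """Represents a parsed log file; parsing is delegated to self._parser."""

    parser_class = LogFileParser

    def __init__(self, lines):
        self._lines = lines
        self._parser = self.parser_class()

    def parse(self):
        data = {}
        for line in self._lines:
            self._parser.parse_line(line, data)
        # Only the parsed attributes end up on the log file object.
        for name, value in data.items():
            setattr(self, name, value)


class Gaussian(LogFile):
    parser_class = GaussianParser


# Made-up sample lines, loosely modelled on a Gaussian log.
log = Gaussian([
    " NAtoms=    3 NActive=    3",
    " some line that no handler cares about",
    " SCF Done:  E(RHF) =  -75.5859  A.U. after    9 cycles",
])
log.parse()
print(log.natom, log.scfenergies)
```

With this layout the user only sees `log.natom` and `log.scfenergies`, while the regexps and parsing functions stay tucked away in `log._parser`. Whether a plain dict is the right container (for instance, if the order in which patterns are tried matters) is left open here.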
Waiting to hear what you think about this idea,
Karol

--
written by Karol Langner
Thu May 3 01:20:44 CEST 2007