From: Karol L. <kar...@kn...> - 2007-08-05 22:04:35
There has been some discussion about comp. chem. programs that print output across several files (GAMESS, Molpro, and lately Turbomole). Until now, the only way to parse multiple files into one data object has been to concatenate them and parse the combined file, which is not a preferable solution.

I just committed changes that allow cclib to parse a sequence of files. The FileInput class from the standard Python library module fileinput allows many files to be read seamlessly, the same way as a regular file object (thus no changes in the parser code!). You can now pass lists and tuples to openlogfile and ccopen in the filename argument, and get a FileInput object back instead of a regular file object. Compression with gzip and bzip2 is supported as of Python 2.5.

You can see this feature at work in the Molpro GeoOpt unittest, where two files are parsed (.out and .log). I modified the test system a bit to accommodate this: additional data files are given in the last columns of test/testdata, as you can see in this case. This Molpro test still doesn't pass, but the errors are not due to the new feature :)

A few things need to be considered/fixed in this area:

1) Possibly support zip files containing more than one file (now an error).

2) Parsing many files into one data object can potentially mess things up, even for legitimate logfiles (we haven't parsed GAMESS .log files yet, for example). The most obvious problem, repeated print-outs of the same data, is easy to fix (I have done this for Molpro already).

3) The script ccget, when given many file names, presently parses each one separately. Should we provide an option to parse them into a single data object?

Hope this makes life easier in the future,
Karol

--
written by Karol Langner
Sun Aug 5 23:38:50 EDT 2007
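
For illustration, here is a minimal sketch of how the standard-library fileinput module chains several files into one line stream, which is the mechanism described above. It is not the cclib code: read_sequence and the file names job.out and job.log.gz are hypothetical, and the sketch uses present-day Python 3 syntax rather than the Python 2.5 of the time. The fileinput.hook_compressed hook picks gzip or bzip2 decompression from the file extension.

```python
import fileinput
import gzip
import os
import tempfile


def read_sequence(filenames):
    """Return the lines of all given files, in order, as one list.

    Hypothetical helper for illustration only; not part of cclib.
    """
    lines = []
    # hook_compressed opens .gz and .bz2 files transparently and
    # falls back to a plain open() for any other extension.
    with fileinput.FileInput(files=filenames,
                             openhook=fileinput.hook_compressed) as stream:
        for line in stream:
            # Older Python 3 versions yield bytes for compressed files.
            if isinstance(line, bytes):
                line = line.decode("utf-8")
            lines.append(line)
    return lines


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    first = os.path.join(workdir, "job.out")       # plain text part
    second = os.path.join(workdir, "job.log.gz")   # gzip-compressed part

    with open(first, "w") as handle:
        handle.write("geometry step 1\n")
    with gzip.open(second, "wt") as handle:
        handle.write("geometry step 2\n")

    # Both files are consumed as if they were a single file.
    for line in read_sequence([first, second]):
        print(line, end="")
```

Because the combined object is iterated line by line like an ordinary file, a parser that only loops over its input does not need to know how many files sit behind it.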