
#13 speed up reading of large field files

Status: acknowledged
Priority: normal
Type: feature
Component: The Library (general)
Updated: 2010-12-07
Created: 2010-08-02
Reporter: Anonymous

Hi Bernhard,

as you know, I like PyFoam a lot :-) What I am missing is C++ speed for reading, parsing, and adjusting large field files in combination with pyFoamReadDictionary/pyFoamWriteDictionary. For large meshes with an internalField it takes forever to adjust the boundaries. I assume it is not possible to speed this up with plain Python!? It would be nice to know whether you have any plans to hand the heavy lifting to additional C++ code or directly to OpenFOAM functions!?
Maybe something like weave/inline could be used for this:
http://lbolla.wordpress.com/2007/04/11/numerical-computing-matlab-vs-pythonnumpyweave/
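
For illustration, this is the kind of thing weave.inline does (a toy example, not PyFoam-specific; scipy.weave only exists for Python 2):

    from scipy import weave

    a = 10
    # the C snippet is compiled once and then runs at C speed;
    # 'a' is passed in from the Python scope by name
    result = weave.inline("return_val = a * a;", ['a'])
    print(result)  # 100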

What do you think?

Regards!
Fabian

Discussion

  • Bernhard Gschaider

    The problem is not the manipulation of the data once it is in memory (for that, the benchmarks in the link are applicable). What takes so long is getting the data into memory in the first place: the parser. The flexibility of the parser takes its toll here; a simpler parser would be faster.

    A possible approach would be to use the option "listLengthUnparsed" of the ParsedParameterFile class. It lets you specify: "If you encounter a list that is longer than 42 elements and announces its length with a prefix, don't parse it; just keep the text as is and write it out verbatim when asked to." This option is already used (if I remember correctly) in pyFoamCaseReport.py and it sped things up tremendously (try the tool with --internal on a real case, then raise the threshold above the number of cells). A minimal sketch of the approach follows at the end of this comment.

    The downside of this approach is that long fields cannot be manipulated on a per-element basis (only overwritten as a whole).

    Would this be sufficient?

    I don't want to add any C++ code, as this would complicate the deployment of PyFoam.

    Another option would be the "other" PyFoam project (the Python bindings for OpenFOAM).
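
    A minimal sketch of the listLengthUnparsed approach (untested; the file name 0/U, the boundary name "inlet", and the threshold of 100 are only examples):

    from PyFoam.RunDictionary.ParsedParameterFile import ParsedParameterFile

    # lists longer than 100 elements that announce their length with a
    # numeric prefix are kept as raw, unparsed text
    field = ParsedParameterFile("0/U", listLengthUnparsed=100)

    # the boundary conditions are still fully parsed and can be edited ...
    field["boundaryField"]["inlet"]["type"] = "zeroGradient"

    # ... while the unparsed internalField is written back verbatim
    field.writeFile()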

  • Anonymous

    Anonymous - 2010-08-04

    Thanks for the info! I will try the pyFoamCaseReport option.
    I am not too familiar with the other bindings project yet, but an optional combination of both looks quite interesting to me :-)

  • R A Smith

    R A Smith - 2010-10-31

    Much of the problem comes before the parser; idiomatic Python uses line-oriented file access with lots of extra logic (at least through Python 2.5). I accelerated one of my PyFoam-based scripts dramatically by reading in the entire field file at once:

    import gzip
    from PyFoam.RunDictionary.ParsedParameterFile import ParsedParameterFile

    # parse only the header with PyFoam (noBody skips the big field data)
    pFile = ParsedParameterFile(filename, backup=False, noBody=True)
    # slurp the whole compressed file at once instead of line by line
    ff = gzip.open(filename + ".gz")
    lines = ff.read().splitlines()
    ff.close()
    for line in lines:
        do_simple_parsing_leaving_internal_field_in_Array(line)  # pseudocode
    get_results(pFile.content, Array)  # pseudocode

    Thanks to the administrator for saving me from the need to write get_results() in C++.

  • Anonymous

    Anonymous - 2010-10-31

    As I said in the note above: the flexibility of the parser has its price, and that price is performance.

    I don't quite understand your example: the file has already been read and parsed, so how does reading it again speed things up? Or are you suggesting that it is faster to read the file into memory and THEN let the parser work on it?

  • R A Smith

    R A Smith - 2010-11-04

    Sorry I wasn't clear. In the example I first parse the header with a PyFoam call, since this makes it easy to do input validation and to branch based on structures known to PyFoam. The bulk read with secondary parsing is a hack to deal quickly with the big internalField structure. (My application loops over hundreds of files with millions of cells.) I wanted to indicate that (1) today a user can do hacks like this to get more out of PyFoam, and (2) in the future you might consider using bulk reads (and perhaps streamlined parsing of large structures) to speed up unhacked PyFoam processing. A quick-and-dirty parser did not solve my problem without the bulk read; a sketch of the secondary-parsing step follows below.
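
    For the secondary parsing, a hypothetical stand-in for do_simple_parsing_leaving_internal_field_in_Array() might look like this for a nonuniform scalar field (untested sketch; assumes one value per line between the parenthesis lines of the internalField):

    def parse_internal_field(lines):
        # collect the values of a nonuniform scalar internalField,
        # skipping the size prefix and the surrounding parentheses
        values = []
        inField = False
        for line in lines:
            stripped = line.strip()
            if stripped.startswith("internalField"):
                inField = True
            elif inField and stripped == ")":
                break
            elif inField and stripped not in ("(", "") and not stripped.isdigit():
                values.append(float(stripped))
        return values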

  • Anonymous

    Anonymous - 2010-12-07

    Hello to you both,

    I still do not get it :-( As briefly mentioned in Munich, using the proposed option (maybe I am not using the right setting) does not help:
    pyFoamCaseReport.py --short-bc-report --long-field-threshold=10 .

    A 'small' case with around 2.4 million cells takes forever...
    Do you have a hint as to what I am doing wrong?

    In addition, I would like to try Smithra's example; do you have a somewhat longer script so that I can understand your approach better!?

    Best Regards!
    Fabian

