
#225 Reading binary files (Python3)

acknowledged
nobody
None
normal
major
HaveNotTried
none
0.6.10
library
2019-08-15
2019-07-18
No

Reading binary files fails on Python 3. The error is raised in PyFoam/RunDictionary/FileBasis.py, where the entire input file is decoded as a UTF-8 string. Binary files should be parsed as bytes instead.

To reproduce: open any binary output file using either ParsedParameterFile directly or through SolutionDirectory, and call the getContent() method on it.

system: Linux 4.19 with Python 3.7 and OpenFOAM (ESI) version 18.12.

trace:

~/.local/share/workon/windfarms/lib/python3.7/site-packages/PyFoam/RunDictionary/FileBasis.py in readFile(self)
     97         """ read the whole File into memory"""
     98         self.openFile()
---> 99         txt=self.fh.read()
    100         if PY3 and self.zipped:
    101             txt=str(txt,"utf-8")

/usr/lib/python3.7/codecs.py in decode(self, input, final)
    320         # decode input (taking the buffer into account)
    321         data = self.buffer + input
--> 322         (result, consumed) = self._buffer_decode(data, self.errors, final)
    323         # keep undecoded input until the next call
    324         self.buffer = data[consumed:]

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc1 in position 898: invalid start byte
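
The failure can be illustrated without PyFoam. The following is a minimal sketch, not PyFoam code: the header text, the `3(` ... `)` list layout, and the helper names `first_bad_byte` and `parse_on_bytes` are assumptions made for illustration. It builds a small buffer with an ASCII header followed by raw doubles, shows that decoding the whole buffer as UTF-8 fails on a 0xc1 byte, and shows that the same data can be read when only the header is decoded and the payload stays as bytes.

```python
import struct

# Hypothetical stand-in for a binary field file: ASCII header, then
# "<count>(" + raw little-endian doubles + ")". Not real OpenFOAM output.
HEADER = b"FoamFile { format binary; }\n"
VALUES = (-131072.0, 1.0, 0.5)  # -131072.0 packs to seven 0x00 bytes and a 0xc1
RAW = HEADER + b"3(" + struct.pack("<3d", *VALUES) + b")\n"


def first_bad_byte(data):
    """Decode everything as UTF-8; return the offending byte, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return data[exc.start]
    return None


def parse_on_bytes(data):
    """Decode only the ASCII header; unpack the payload from raw bytes."""
    header, _, body = data.partition(b"\n")
    count_text, _, rest = body.partition(b"(")
    count = int(count_text)
    payload = rest[: count * 8]  # 8 bytes per double
    return header.decode("ascii"), struct.unpack("<%dd" % count, payload)


print(hex(first_bad_byte(RAW)))
print(parse_on_bytes(RAW)[1])
```

The byte 0xc1 is never a valid UTF-8 start byte, so any payload containing it will fail to decode no matter what surrounds it.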

Discussion

  • Bernhard Gschaider

    • status: new --> acknowledged
     
  • Bernhard Gschaider

    Binary support was developed with Python 2.x, and the nature of str has changed since then. I will build a test and fix it next week.

     
  • Bernhard Gschaider

    Fixed it in my development version. It should be in the next release, which was due last week but, as usual, will be delayed by another week.

     
  • Johan Hidding

    Johan Hidding - 2019-08-15

    Thanks for your fix! Is there any way I can get access to the development version so that I can play around with it? On a second note: I intend to do some arithmetic on the content of these binary files, and parsing them to a plain list and then to a numpy array is slowing down my code considerably. Is there a way to read the data into numpy arrays directly?

     
    • Bernhard Gschaider

      About the numpy question: there is no support for this in PyFoam yet. I was thinking about using Cython to write a parser for large files, but I never went through with it because I never really needed it, it is a lot of work, and it is probably not as portable as pure Python. And using pure Python to convert the "encoded binary" of OF to "real binary" in memory and then pointing numpy to it saying "Here are your 600 doubles. Treat them like a 200x3 (vector) array" isn't very fast either (the converting part).
      To be honest: the current binary support wasn't my idea but something a customer requested.
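
      The last step described above, treating 600 doubles as a 200x3 array, can be sketched with the standard library alone. This is only an illustration under stated assumptions: the data is made up, the helper name `as_vectors` is hypothetical, and the buffer is assumed to already hold "real binary" doubles in the machine's native byte order, so the conversion cost mentioned above is not addressed. With numpy, `numpy.frombuffer(raw, dtype=numpy.float64).reshape(200, 3)` would play the same role.

```python
import struct


def as_vectors(raw):
    """View a bytes buffer of native doubles as an N x 3 table, without copying."""
    count = len(raw) // struct.calcsize("d")
    return memoryview(raw).cast("d", [count // 3, 3])


# 600 doubles standing in for 200 three-component vectors (made-up data).
flat = [float(i) for i in range(600)]
raw = struct.pack("600d", *flat)  # native byte order, matching the cast

vectors = as_vectors(raw)
print(vectors.shape)
print(vectors[1, 0], vectors[1, 1], vectors[1, 2])
```

      The view shares memory with the original buffer, so no per-value Python loop runs when it is created.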

       
  • Bernhard Gschaider

    Until recently, the development version of PyFoam was on an internal server, intermixed with "not for publication" stuff. I recently started a new Mercurial repo for it that will push to SourceForge (and BitBucket for redundancy) when I do the next release in the next 2 weeks.

     
