Reading binary files fails on Python 3. The error is triggered in PyFoam/RunDictionary/FileBasis.py, where the entire input file is read as text and decoded as UTF-8. Binary files should be parsed as bytes instead.
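A minimal sketch of the underlying problem, independent of PyFoam: the file written here is a made-up stand-in for a binary field file (a short ASCII prefix followed by raw doubles), not a real OpenFOAM file. Reading it in text mode with UTF-8 decoding fails, while reading it in binary mode returns the bytes untouched.

```python
import os
import struct
import tempfile

# Hypothetical stand-in for a binary field: ASCII prefix plus two raw doubles.
# The little-endian encoding of -2.5 ends in byte 0xC0, which is never valid UTF-8.
raw = b"List<scalar> 2\n(" + struct.pack("<2d", -2.5, 1.0) + b")\n"

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(raw)
    path = tmp.name

# Text mode: Python 3 decodes while reading and raises on the raw doubles.
try:
    with open(path, "r", encoding="utf-8") as fh:
        fh.read()
    text_mode_failed = False
except UnicodeDecodeError:
    text_mode_failed = True

# Binary mode: no decoding takes place, so the content comes back as bytes.
with open(path, "rb") as fh:
    data = fh.read()

os.remove(path)
```

Any parser working on `data` would then have to decode only the ASCII parts explicitly and leave the numeric payload as bytes.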
To reproduce: open any binary output file using ParsedParameterFile, either directly or through SolutionDirectory, and call the getContent() method on it.
System: Linux 4.19 with Python 3.7 and OpenFOAM (ESI) version 18.12.
Trace:

~/.local/share/workon/windfarms/lib/python3.7/site-packages/PyFoam/RunDictionary/FileBasis.py in readFile(self)
     97         """ read the whole File into memory"""
     98         self.openFile()
---> 99         txt=self.fh.read()
    100         if PY3 and self.zipped:
    101             txt=str(txt,"utf-8")

/usr/lib/python3.7/codecs.py in decode(self, input, final)
    320         # decode input (taking the buffer into account)
    321         data = self.buffer + input
--> 322         (result, consumed) = self._buffer_decode(data, self.errors, final)
    323         # keep undecoded input until the next call
    324         self.buffer = data[consumed:]

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc1 in position 898: invalid start byte
Binary support was developed with Python 2.x, and the nature of str has changed since then. I will build a test and fix it next week.
Fixed it in my development version. It should be in the next release, which was due last week but, as usual, will be delayed by another week.
Thanks for your fix! Is there any way I can get access to the development version so that I can play around with it? On a second note: I intend to do some arithmetic on the content of these binary files, and parsing them to a plain list and then to a numpy array is slowing down my code considerably. Is there a way to read the data into numpy arrays directly?
About the numpy question: there is no support for this in PyFoam yet. I was thinking about using Cython to write a parser for large files, but I never really needed it, it is a lot of work, and it would probably not be as portable as pure Python, so I never went through with it. Using pure Python to convert the "encoded binary" of OF to "real binary" in memory and then pointing numpy at it, saying "Here are your 600 doubles. Treat them like a 200x3 (vector) array", isn't very fast either (the converting part is what is slow).
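For illustration only, the "pointing numpy at it" step could look roughly like the sketch below. It is not part of PyFoam. It assumes the raw doubles have already been located inside a bytes object and that they are little-endian 64-bit floats; the helper name `vectors_from_bytes`, the offset, and the sample data are all hypothetical. Finding the payload inside a real file (the conversion part mentioned above) is not shown.

```python
import numpy as np


def vectors_from_bytes(data: bytes, count: int, offset: int = 0) -> np.ndarray:
    """View count*3 doubles starting at `offset` as a (count, 3) array.

    Hypothetical helper: assumes little-endian float64 values stored
    contiguously, which may not hold for every platform or file.
    """
    flat = np.frombuffer(data, dtype="<f8", count=count * 3, offset=offset)
    return flat.reshape(count, 3)


# Made-up payload: six doubles wrapped in parentheses, so the data starts at byte 1.
values = np.arange(6, dtype="<f8")
blob = b"(" + values.tobytes() + b")"

arr = vectors_from_bytes(blob, count=2, offset=1)
```

`np.frombuffer` does not copy the data, so the resulting array is read-only; calling `arr.copy()` gives a writable array if the values need to be modified in place.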
To be honest: the current binary support wasn't my idea but something a customer requested.
Until recently, the development version of PyFoam was on an internal server, intermixed with "not for publication" stuff. I recently started a new Mercurial repo for it that will push to SourceForge (and BitBucket for redundancy) when I do the next release in the next 2 weeks.