Hi everyone,
I'm currently writing a small script in Python 3 for importing .mi files from our AFM machine. If I open the .mi files in a text editor, the values are in the binary32 format.
As far as I know, there are multiple options to decode this in Python (https://docs.python.org/3/library/struct.html), but I can't figure out which is the right one.
Which decoding format does Gwyddion use when reading these .mi files?
An example .mi file is attached.
Thanks!
What do you mean by ‘decoding format’? It is just a flat array of values.
We read the text header. It says what type of data it is – images, spectra or curve maps – how many channels there are, what the dimensions are and how the data are represented – 16bit, 32bit, floats or text (given on the last line of the header as BINARY, BINARY_32, ASCII, etc.). And then we just read the data. Gwyddion is written in C, so we read binary data directly.
Last edit: David Nečas 2024-08-09
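For readers following along in Python, the "read the text header, then read the data" step could be sketched roughly as below. This is only a sketch under assumptions that are not confirmed in this thread: that each header line is a key followed by whitespace and a value, that the header ends with the line whose key is `data`, that BINARY means signed 16-bit and BINARY_32 means signed 32-bit integers, and that the bytes are little-endian. The function names `split_mi_file` and `decode_integers` are made up for illustration.

```python
import struct

# Assumed mapping from the "data" header value to a struct format code
# (signed 16-bit and signed 32-bit integers).
FORMATS = {"BINARY": "h", "BINARY_32": "i"}


def split_mi_file(raw: bytes):
    """Split file contents into (header dict, binary payload).

    Assumes the text header ends with a line whose key is "data",
    for example "data BINARY_32", and that the payload follows it.
    Raises ValueError if no such line is found.
    """
    header = {}
    pos = 0
    while True:
        end = raw.index(b"\n", pos)
        line = raw[pos:end].decode("latin-1").rstrip("\r")
        pos = end + 1
        key, _, value = line.partition(" ")
        # Keys may repeat (e.g. once per channel), so collect values in lists.
        header.setdefault(key, []).append(value.strip())
        if key == "data":
            return header, raw[pos:]


def decode_integers(payload: bytes, data_type: str, count: int):
    """Decode `count` signed integers, assuming little-endian byte order."""
    code = FORMATS[data_type]
    size = struct.calcsize("<" + code)
    return list(struct.unpack(f"<{count}{code}", payload[: count * size]))
```

With a real file one would read the bytes with `open(path, "rb").read()`, take `count` from the pixel dimensions and channel count in the header, and check the result against Gwyddion before trusting any of the assumptions above.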
Sorry for the late reply. I'm not that experienced when it comes to data formats and how to read them in properly. Therefore, I'm still stuck on how to read this file in Python 3.
I managed to kind of read the values of the topography channel, but somehow the values are in the range from 10^-34 to 10^-32, which is unphysical and does not match the values read in by Gwyddion.
The file is the same one attached to my first message, and the code is attached below.
Any help is welcome!
Except for some spectroscopy files, the raw data are integers, not floats.
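To illustrate why reading integers as floats gives such tiny numbers, here is a small sketch that interprets the same four bytes both ways with Python's `struct` module. The integer value is arbitrary and chosen only for illustration; it is not taken from the attached file, and little-endian byte order is assumed.

```python
import struct

# An arbitrary signed 32-bit integer (167772160), stored little-endian.
raw_int = 0x0A000000
payload = struct.pack("<i", raw_int)

# Correct interpretation: a signed 32-bit integer.
as_int = struct.unpack("<i", payload)[0]

# Wrong interpretation: the same bytes read as an IEEE 754 binary32 float.
# The high bits of the integer land in the exponent field, so an integer
# of this size turns into a value many orders of magnitude below 1.
as_float = struct.unpack("<f", payload)[0]

print(as_int, as_float)
```

For this particular value the float interpretation lands between 10^-34 and 10^-32, the same range as the unphysical values reported above.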
Thanks for your answer. What scaling factor is then applied if the values are stored as integers?
You need to read bufferUnit and bufferRange and combine them to get both the factor and the correct power of 10. And first divide the integers by the corresponding power of 2 to transform them to the interval [0, 1], I think.
Okay, if I understood correctly, the values first have to be normalized to [0, 1] and then multiplied by the corresponding bufferRange factor. After doing that, the values seem to be roughly twice as large, but not exactly twice.
I checked the conversion and we are actually dividing by 2^{n-1}, i.e. pre-transforming to [-1,1] as the integers are signed. Which would make values in Gwyddion 2× larger – not 2× smaller – compared to what I wrote above. In any case, you just need to get the power of 2 right…
But that should be all. We certainly do not do any other transformation.
If you are comparing images with other software, keep in mind the false colour range can differ from the full data range.
Thanks for clarifying. What's the value of n in this case?
Moreover, after checking the values in a bit more detail, ~bufferRange/2.5 instead of bufferRange/2 seems to be the factor necessary for scaling, which is quite weird.
In your case n = 32 because the file is 32bit. Older files are often 16bit.
As for the range, you should be getting the same values as in Gwyddion now. If you are, but the values differ from MI, I have no more advice to give, but I am interested in looking into it.
Somehow the [0,1] and the [-1,1] normalizations are not working perfectly for me and give results that are ~2.5 times too large.
Nonetheless, it works perfectly when the integer values are multiplied by bufferRange/2^(n-1) with n=32.
This yields exactly the same results as Gwyddion.
Thanks a lot for your help and especially your hints regarding the factorization!
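For anyone landing here later, the scaling that ended up matching Gwyddion, i.e. multiplying the signed integers by bufferRange/2^(n-1), could be written as the short sketch below. The function name `scale_raw` and its parameters are made up for illustration. The results are expressed in whatever unit bufferUnit specifies, so a unit prefix such as nm still has to be applied separately.

```python
def scale_raw(raw_values, buffer_range, bits=32):
    """Scale signed raw integers to physical values.

    Multiplies by buffer_range / 2**(bits - 1), which maps the signed
    integer range onto [-buffer_range, buffer_range). Use bits=16 for
    16-bit files.
    """
    factor = buffer_range / 2 ** (bits - 1)
    return [value * factor for value in raw_values]


# Example with made-up numbers: bufferRange = 10.0, 32-bit data.
print(scale_raw([-2**31, 0, 2**30], 10.0))  # [-10.0, 0.0, 5.0]
```

Dividing by 2^n instead of 2^(n-1) would give values exactly half as large, which is the factor of 2 discussed earlier in the thread.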