Re: [PyMca] EDF file format


Dear David,

The simplest way to read and write EDF files is to use the EdfFile.py
module. There are several versions of it in the wild. You can use the one
shipped with PyMca or any other one:

https://github.com/vasole/pymca/blob/master/PyMca5/PyMcaIO/EdfFile.py

It is a standalone file, so it can be added to any project without
adding a dependency on PyMca.

Basically you just need to create the file:

from PyMca5.PyMcaIO import EdfFile
edf = EdfFile.EdfFile("yourfile.edf", access="ab+")
edf.WriteImage({}, your_numpy_array)
edf = None  # drop the reference to force the file to be closed

You should aim to generate one EDF file per row of measured spectra.

If your raster experiment is n_rows x n_columns x n_channels you could
do something like:

import numpy
from PyMca5.PyMcaIO import EdfFile
# buffer holding one row of spectra
data = numpy.zeros((n_columns, n_channels), dtype=numpy.float32)
# OPTIONAL: a, b and c are the coefficients of your calibration
# (c is expected to be 0.0)
ddict = {"Mca a": a,
         "Mca b": b,
         "Mca c": c}
for i in range(n_rows):
    for j in range(n_columns):
        data[j, :] = your_spectrum_data_for_pixel_i_j
    # one EDF file per row
    edf = EdfFile.EdfFile("root_name_%05d.edf" % i, access="wb")
    edf.WriteImage(ddict, data)
    edf = None  # drop the reference to force the file to be closed

BTW, what format are your original data in? It would not be surprising
if they were already in a format supported by PyMca. Furthermore, if
that is not the case but you know how to read them from Python, it would
cost almost nothing to add native support for them to PyMca.

However, if you are thinking about saving stacks, I would strongly
recommend using HDF5, or even TIFF, instead of EDF (PyMca can read
uncompressed TIFF files as if they were EDF files).

I guess you are referring to the case where you collect spectra of
N_CHANNELS channels on a regular grid of N_ROWS by N_COLUMNS. In its
simplest form, just a dataset with shape [N_ROWS, N_COLUMNS, N_CHANNELS]
would do the job. That is the strict minimum. Your user should use the
latest PyMca (v5.3.1 at this point) to be sure everything works properly.

You can make life simpler for your user by adding some additional
conventions. The simpler you make life for your user, the harder it
will be for you. I recommend adding the attribute "interpretation" set to
"spectrum" to indicate that the dataset is a stack of spectra and not a
stack of images. The latter would correspond to a stack of the form
[N_IMAGES, N_ROWS, N_COLUMNS] and would be indicated by an
attribute "interpretation" set to "image". For XRF, the
[N_ROWS, N_COLUMNS, N_CHANNELS] arrangement is recommended.
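
As a minimal sketch of that strict minimum plus the "interpretation"
attribute, assuming you write the file with h5py: the dataset name "data",
the file name "map.h5", the helper name write_spectrum_stack and the array
sizes are placeholders chosen for illustration, not something PyMca
prescribes.

```python
import os
import tempfile

import h5py
import numpy


def write_spectrum_stack(path, stack):
    """Write a [n_rows, n_columns, n_channels] array as a stack of spectra."""
    with h5py.File(path, "w") as h5:
        # "data" is a hypothetical dataset name
        dset = h5.create_dataset("data", data=stack)
        # Mark the dataset as a stack of spectra (last axis = channels)
        dset.attrs["interpretation"] = "spectrum"


# Hypothetical raster: 3 rows x 4 columns, 2048 channels per spectrum
stack = numpy.zeros((3, 4, 2048), dtype=numpy.float32)
path = os.path.join(tempfile.mkdtemp(), "map.h5")
write_spectrum_stack(path, stack)
```

In a real acquisition you would fill the array (or write slices of the
dataset) as the spectra come in, instead of writing zeros.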

If you encapsulate the dataset for each MCA device in a container group,
then you can add more information like the calibration, the live_time or
the elapsed_time of that MCA. For instance:

/whatever_name
/whatever_name/mca_0
/whatever_name/mca_0/data
    # dataset with shape [nrows, ncolumns, nchannels] (regular grid)
    # or [nspectra, nchannels] (non-regular grid)
/whatever_name/mca_0/calibration
    # three values corresponding to a, b, c in
    # energy = a + b * channels + c * channels * channels
    # (with c expected to be 0 anyway)
/whatever_name/mca_0/channels
    # dataset with the channel numbers. If not supplied, it will be
    # taken as 0, 1, 2, ..., nchannels-1
/whatever_name/mca_0/live_time
    # dataset of shape nrows * ncolumns with the actual live time
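
The layout above could be written along these lines. This is only a sketch,
assuming h5py: the helper names write_mca_group and channels_to_energy, the
file name and all numeric values are made up for illustration, and storing
live_time as a flat array with nrows * ncolumns entries is my reading of
"shape nrows * ncolumns".

```python
import os
import tempfile

import h5py
import numpy


def channels_to_energy(channels, a, b, c=0.0):
    """Apply energy = a + b * channels + c * channels * channels."""
    channels = numpy.asarray(channels, dtype=numpy.float64)
    return a + b * channels + c * channels * channels


def write_mca_group(path, data, calibration, live_time):
    """Store one MCA device under /whatever_name/mca_0."""
    n_channels = data.shape[-1]
    with h5py.File(path, "w") as h5:
        # The intermediate group /whatever_name is created automatically
        mca = h5.create_group("whatever_name/mca_0")
        dset = mca.create_dataset("data", data=data)
        dset.attrs["interpretation"] = "spectrum"
        mca.create_dataset(
            "calibration", data=numpy.asarray(calibration, dtype=numpy.float64)
        )
        mca.create_dataset("channels", data=numpy.arange(n_channels))
        mca.create_dataset("live_time", data=live_time)


# Hypothetical values, for illustration only
n_rows, n_columns, n_channels = 2, 3, 1024
data = numpy.zeros((n_rows, n_columns, n_channels), dtype=numpy.float32)
calibration = (0.0, 0.01, 0.0)  # a, b, c
# Assumption: one live time per pixel, stored as a flat array
live_time = numpy.ones(n_rows * n_columns, dtype=numpy.float64)

path = os.path.join(tempfile.mkdtemp(), "raster.h5")
write_mca_group(path, data, calibration, live_time)
energy = channels_to_energy(numpy.arange(n_channels), *calibration)
```

The channels_to_energy helper is not needed to write the file; it only
spells out how the three calibration values are meant to be applied.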

I have started to improve the available documentation. It is a work in
progress, but if you go the HDF5 route, the link below might be of help:

 http://www.silx.org/doc/PyMca/dev/hdf5/index.html

Best regards, Armando


On 09/05/2018 10:49, PyMca general purpose mailing list. wrote:
> Dear Armando,
>
> We have written in our lab an acquisition software for our XRF system.
> I would like to directly write the data in EDF format but I can not
> find how to do it. Do you have a routine?
>
> Thanks for your help,
>
> David