Hi, I am writing a Fortran code that reads XML files using the FoX XML Fortran Library http://www1.gly.bris.ac.uk/~walker/FoX/
The problem is that, as the FoX API says, "It is impossible to implement IO of non-ASCII documents in a portable fashion using standard Fortran 95, and it is impossible to handle non-ASCII data internally using standard Fortran strings. A fully unicode-capable FoX version is under development, but requires Fortran 2003." The consequence is that
"FoX will only process documents consisting of nothing but US-ASCII data. It will accept documents labelled with any single byte character set which is identical to US-ASCII in its lower 7 bits (for example, any of the ISO-8859 charsets, or UTF-8) but an error will be generated as soon as any character outside US-ASCII is encountered. (This includes non-ASCII characters present only by character entity reference)"
It turns out that I can't read files produced or saved by XCE! They have three non-ASCII bytes at the beginning: ef bb bf. If I open the files with, say, Notepad (or XMLSpy) and save them again, these bytes are gone. But the code I wrote needs to accept input from any other code that uses the same format (a new universal interchange data format for Ion Beam Analysis, see http://idf.schemas.itn.pt/ for the schema and some still basic documentation).
Can anyone tell me what these characters are, please?
Hi nunoni,
What you are seeing is the byte order mark (BOM, U+FEFF), which UTF-8 encodes as the three-byte sequence EF BB BF.
To work around this problem, either go to XML>Encoding and select US-ASCII, or disable byte-order-mark generation for UTF-8 under Tools>Options.
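If the files come from other tools whose settings you cannot change, another option is to strip the BOM before the file reaches FoX. Here is a minimal sketch in Python, assuming a separate pre-processing step is acceptable in your workflow. The function names strip_utf8_bom and strip_bom_from_file are hypothetical, and this is not part of FoX or XCE. It only removes a leading EF BB BF, so any other non-ASCII content in the document would still be rejected by FoX.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 byte order mark, if present."""
    # codecs.BOM_UTF8 is the byte string b'\xef\xbb\xbf'
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def strip_bom_from_file(src_path: str, dst_path: str) -> None:
    """Copy src_path to dst_path, dropping a leading UTF-8 BOM (hypothetical helper)."""
    # Work in binary mode so no other byte of the document is altered.
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(strip_utf8_bom(data))
```

You would call something like strip_bom_from_file("input.xml", "input_nobom.xml") (file names are placeholders) and then pass the second file to the Fortran code.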
If this doesn't work for you, please give me a shout.
Best,
Gerald
Hi Gerald,
thanks, that was it!
Nuno