ooRexx does not manage encodings, but that does not prevent you from using UTF-8 (at least under Windows and Unix; I can't tell for other platforms).
You can open your UTF-8 files as text files, read them and rewrite them without any loss of bytes.

The BIFs and the String class work at the byte level:
"my UTF-8 string"~length returns the number of bytes, not the number of characters.
"my UTF-8 string"~pos(superscript1) returns the position as a byte offset, not as a character index.
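
Here is a small sketch of what "byte level" means in practice, using the superscript 1 from your question. I use hex literals so that the result does not depend on the encoding of the source file; the variable names are only for illustration.

```rexx
sup1 = 'C2B9'x                 -- the two UTF-8 bytes of superscript 1
text = "note" || sup1 || " x"  -- 4 + 2 + 2 = 8 bytes, but 7 characters
say text~length                -- 8 : bytes, not characters
say text~pos(sup1)             -- 5 : byte offset where the sequence starts
say text~pos(" x")             -- 7 : offsets after a multi-byte character are shifted
say sup1~c2x                   -- C2B9
```

As long as you only search for and compare whole byte sequences, the byte offsets are consistent with each other; they just do not match what you count on screen.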

It's OK to have a source file encoded in UTF-8, but only the string constants can contain UTF-8 characters, according to the rules described in section "1.10.4. Tokens" of rexxref.pdf.
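
For the comparison you describe, something like the following should work. This is only a sketch under assumptions: the program file is saved as UTF-8 (I assume without a BOM), and "input.txt" is a hypothetical name for your pdftotext output.

```rexx
-- Assumption: this source file is saved as UTF-8, without a BOM.
sup1 = "¹"                          -- string constant holding the bytes C2 B9
if sup1 == 'C2B9'x then say "the literal is stored as UTF-8 bytes"

-- Assumption: "input.txt" is a hypothetical file produced by pdftotext.
input = .stream~new("input.txt")
do while input~lines > 0
  line = input~linein               -- the bytes of the line, unchanged
  if line~pos(sup1) > 0 then say "found at byte" line~pos(sup1)":" line
end
input~close
```

If you prefer not to depend on the encoding of the source file, you can write the constant as 'C2B9'x everywhere instead of the quoted character.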

2013/2/13 Peter J. Farley III <pjfarley3@earthlink.net>

I have reviewed the reference manual documentation on the OPEN method, and I do not see any option to specify the encoding of the input stream.


Is it possible to specify the input stream encoding anywhere?  In particular, I have some UTF-8 text produced by Xpdf’s pdftotext utility that I want to read and be able to specify UTF-8 character constants in my Rexx program code (also encoded in UTF-8) to be compared against the UTF-8 input characters.


For instance, a superscript “1” in UTF-8 is actually encoded as two 8-bit characters, hex representation ‘C2B9’X.  In my UTF-8 Rexx code, I would code a string constant to check for that superscript “1” as a quote, a superscript “1” and another quote, which displays as if it is three characters in the UTF-8 editor I use.


If I am barking up the wrong tree here, please just guide me in the right direction.


TIA for any help or RTFM you can provide.





