Hi,
I think there is a bug in the "format" function with accented characters:
(format nil "%03s" "é")
returns "0é". Shouldn't it be "00é"?
Note: this is not new. The bug (if it is indeed a bug) also existed in 6.3, but I was able to work around it thanks to the difference between the results of (length$ "é") and (str-length "é"). Now, with 6.4.1, (length$ "é") is no longer supported.
Last edit: Chaubert Jérôme 2024-04-03
I'll look into it. You can use str-length where you previously used length$.
I checked in fixes for the format function to adjust the width and precision when multibyte characters are present in a string. I noticed when digging through the specifications for conversion characters for the printf function that the behavior for using 0 with the s conversion character isn't specified. On macOS/Xcode and Windows/Visual Studio, the padding uses 0, but on Linux it uses spaces.
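To illustrate why the width comes up short, here is a minimal sketch in Python (not CLIPS, and not the actual fix that was checked in). It assumes the original behaviour comes from counting UTF-8 bytes instead of characters when computing the padding; the function names pad_by_bytes and pad_by_chars are hypothetical and exist only for this example.

```python
# Illustrative emulation only; this is not the CLIPS implementation.

def pad_by_bytes(text: str, width: int, fill: str = " ") -> str:
    """Left-pad the way a byte-counting formatter would: width is in UTF-8 bytes."""
    byte_len = len(text.encode("utf-8"))
    return fill * max(0, width - byte_len) + text


def pad_by_chars(text: str, width: int, fill: str = " ") -> str:
    """Left-pad so the result holds at least `width` code points."""
    return fill * max(0, width - len(text)) + text


# "é" written as a single code point, which takes two bytes in UTF-8.
E_ACUTE = "\u00e9"

short_result = pad_by_bytes(E_ACUTE, 3, "0")  # only one fill character is added
full_result = pad_by_chars(E_ACUTE, 3, "0")   # two fill characters are added
```

With a width of 3, the byte-counting version yields "0é" (the result reported at the top of this thread) and the character-counting version yields "00é". The sketch counts code points, so text that uses combining characters would need extra handling.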
Thank you for the fix. It works perfectly.
About the "0" behaviour: it's not a problem for me. I used "0" to make my description more explicit, but I don't really use this pattern.
Last edit: Chaubert Jérôme 2024-05-01
Thanks. I'll leave it with you.
(In this context I can't use str-length instead of length$, because in 6.3
(str-length "é") => 1
but
(length$ "é") => 2
and I use the difference between these two results to work around the "format" problem described above.)
Last edit: Chaubert Jérôme 2024-04-04
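The workaround described in this post can be sketched as follows. This is a Python emulation under stated assumptions, not the poster's actual CLIPS code: byte_oriented_pad stands in for a formatter that measures width in bytes, and adjusted_width is a hypothetical helper that widens the requested width by the gap between the byte count (what length$ returned in 6.3) and the character count (what str-length returns).

```python
# Illustrative emulation only; the helper names are hypothetical.

def byte_oriented_pad(text: str, width: int, fill: str = " ") -> str:
    """Stand-in for a formatter whose field width is measured in UTF-8 bytes."""
    byte_len = len(text.encode("utf-8"))
    return fill * max(0, width - byte_len) + text


def adjusted_width(text: str, width: int) -> int:
    """Widen the field by the number of extra bytes used by multibyte characters."""
    byte_len = len(text.encode("utf-8"))  # plays the role of the byte length
    char_len = len(text)                  # plays the role of the character length
    return width + (byte_len - char_len)


E_ACUTE = "\u00e9"  # "é": one code point, two bytes in UTF-8

# Ask for 3 characters; the byte-counting formatter is given a width of 4.
padded = byte_oriented_pad(E_ACUTE, adjusted_width(E_ACUTE, 3), "0")
```

For "é" the adjustment is 2 - 1 = 1, so the formatter receives a width of 4 and produces "00é". For pure ASCII text the two counts match and the width is left unchanged.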
Is there any news on this problem? Have you found a solution?
I looked through the format code and I'm going to have to do some more sophisticated parsing of the format string to automatically adjust the lengths to handle multibyte UTF-8 characters. For now, since length$ served as a workaround in prior versions, I added a function called str-byte-length:
OK. Thanks a lot! I will import your new code and use str-byte-length for my workaround (for now).
The str-byte-length function (together with my workaround for the "format" function) works perfectly. Thanks a lot.