
#2577 Escaped hex code in char string incorrectly encoded

Status: closed-rejected
Milestone: None
Labels: Front-end
Priority: 5
Updated: 2017-01-16
Created: 2017-01-16
Creator: alvin
Private: No

sdcc 3.6.5 #9833 (MINGW32)

I've found that hexadecimal escape sequences in string literals cause the strings to be laid out incorrectly in memory.

Example:

const unsigned char hall_valids[42] = "\x01ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789.~ {";

sdcc -mz80 -S test.c

Result:

_hall_valids:
    .db 0xef
    .ascii "GHIJKLMNOPQRSTUVWXYZ0123456789.~ {"
    .db 0x00
    .db 0x00
    .db 0x00
    .db 0x00
    .db 0x00
    .db 0x00
    .db 0x00

The 0x01 byte that should appear at the front has been changed to 0xef, and the initial part of the array, A-F, is missing.

If I encode that byte in octal:

const unsigned char hall_valids[42] = "\001ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789.~ {";

The string is correctly encoded.

Discussion

  • alvin

    alvin - 2017-01-16

A little further research, and it turns out this is not a bug.

The C standard specifies that a hexadecimal escape sequence consumes the longest run of hexadecimal digits following the initial \x, while an octal escape consumes at most three octal digits following the backslash. This is exactly what is happening above: \x01ABCDEF is parsed as a single escape sequence.

I was under the impression that exactly two hex digits are read after \x and exactly three octal digits after \0, but that only works in a world where characters are 8 bits. Now I'm wondering if this changed at some point or if I always had this wrong :P

     
  • Philipp Klaus Krause

    • status: open --> closed-rejected
    • assigned_to: Philipp Klaus Krause
     
  • Philipp Klaus Krause

I checked the ISO C90 and ISO C11 standards, and they both agree. So if this ever changed in C, it must have been in pre-standard times. And if it ever changed in SDCC, it was a bugfix.

    Philipp

     
