#2070 Linker uses wrong output filename

Status: closed-fixed
Owner: Ben Shi
Labels: linker (61)
Category: sdld
Priority: 8
Updated: 2015-07-13
Created: 2012-08-22
Private: No

When I try to create Ȼyx.ihx, I get H.ihx instead. sdcc seems to create the correct .lk file and invoke the linker, but the linker uses a different output filename! I'm giving this above-average priority, since it might result in another file being overwritten.

I see this problem in the linker that comes with current sdcc 3.2.1 #8079. I use Debian GNU/Linux.

Philipp

Discussion

  • .lk file from sdcc that the linker doesn't process correctly

     
    Attachments
    • labels: --> linker
    • summary: Linker uses different output filename --> Linker uses wrong output filename
     
  • Maarten Brock
    2012-08-25

    What is the box character in front of yx.ihx? Have you tried to use a different name with regular ASCII characters only?

     
  • U+023B LATIN CAPITAL LETTER C WITH STROKE

    When I use a plain C instead of Ȼ, the linker works as expected.

    Philipp

    P.S.: For some other non-ASCII characters such as ü and Ü I get an error message instead:

    > sdcc -mz80 testü.rel
    ?ASlink-Error-<cannot open> : "testC<.rel"

     
  • Maarten Brock
    2013-12-29

    Philipp,

    Can you test again with attached patch applied? I have no idea how to create files with Unicode characters in the filename.

    Maarten

     
    Attachments
  • This isn't a real solution, as aslink now complains about perfectly fine filenames being invalid. At least it is no longer a silent failure.

    Maybe for now change the error message to something like "aslink is so broken it can only handle a very restricted subset of file names" or so? And leave the bug report open until we have a better fix?

    However, the patch probably still allows too much: looking at fndidx() in lkmain.c, even some ASCII characters (e.g. ':') in filenames will cause problems, depending on where they occur.

    Philipp

     
  • Maarten Brock
    2013-12-29

    It's kind of funny how this stroked C is displayed as a square box by Chrome on Windows XP and as intended by Chrome on Windows 7.

    But can any C program that doesn't use wchar_t (but only char) be expected to support Unicode characters? Or is nul-termination the only rule that stays valid?

    A ':' in a filename will be treated as a directory separator, like '/' and '\', because it can denote a DOS drive (and I think it also has a special DECUS meaning).
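To illustrate why a ':' inside a filename is a problem when it counts as a separator, here is a minimal sketch of a splitter that treats ':', '/' and '\' alike. The function name `after_last_separator` is hypothetical and this is not the actual fndidx() code from lkmain.c, only an assumption about the general approach.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not taken from the sdld sources: returns a pointer
 * to the part of `path` that follows the last ':', '/' or '\\'.
 * If none of these characters occurs, the whole path is returned. */
const char *after_last_separator(const char *path)
{
    const char *start = path;
    const char *p;

    assert(path != NULL);
    for (p = path; *p != '\0'; p++) {
        if (*p == ':' || *p == '/' || *p == '\\')
            start = p + 1;   /* remember the position after the separator */
    }
    return start;
}
```

With such a rule, "C:\work\test.rel" is split as intended, but a file that is really named "test:.rel" loses everything up to and including the colon.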

     
  • Erik Petrich
    2013-12-30

    My best guess is that the non-ASCII character is UTF-8 encoded, and then something is stripping bit 7 of the characters. The Unicode character U+023B would be encoded in UTF-8 as the character 0xc8 followed by 0xbb. With bit 7 cleared, the apparent filename would be "H;yx.ihx". Since the ";" character denotes the start of a comment, the rest of the filename is ignored.
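The guess above can be checked with a small sketch. The two helpers below are hypothetical and are not the linker's code; they only assume that bit 7 is cleared byte by byte and that everything from the first ';' onward is dropped as a comment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy `in` to `out`, clearing bit 7 of every byte. */
void strip_bit7(const char *in, char *out, size_t outsz)
{
    size_t i;

    assert(in != NULL && out != NULL && outsz > 0);
    for (i = 0; in[i] != '\0' && i + 1 < outsz; i++)
        out[i] = (char)((unsigned char)in[i] & 0x7F);
    out[i] = '\0';
}

/* Hypothetical helper: truncate `s` in place at the first ';',
 * the way a lexer would when ';' starts a comment. */
void cut_at_comment(char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (*s == ';') {
            *s = '\0';
            return;
        }
    }
}
```

Feeding the UTF-8 bytes of "Ȼyx.ihx" (0xc8 0xbb followed by "yx.ihx") through `strip_bit7` gives "H;yx.ihx", and `cut_at_comment` then leaves just "H". The same stripping turns "testü.rel" (ü is 0xc3 0xbc in UTF-8) into "testC<.rel", which matches the error message reported earlier.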

    At a minimum, the get() function in lklex.c needs to not strip bit 7, and then the ctype array in lkdata.c needs to be extended to properly categorize 8-bit characters (128 LETTER entries, perhaps).
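A rough sketch of what such an extended table could look like follows. All names here (`char_class`, `CAT_LETTER`, `init_char_class`, `is_name_byte`) are made up for illustration and do not reflect the real layout of the ctype array in lkdata.c; the sketch also assumes an ASCII execution character set.

```c
#include <assert.h>

/* Hypothetical category flags, not the ones used by the linker. */
enum {
    CAT_ILLEGAL = 0,
    CAT_LETTER  = 1,
    CAT_DIGIT   = 2,
    CAT_SPACE   = 4
};

/* One entry per possible byte value, so bytes 128..255 are covered too. */
unsigned char char_class[256];

void init_char_class(void)
{
    int c;

    for (c = 0; c < 256; c++)
        char_class[c] = CAT_ILLEGAL;
    for (c = 'a'; c <= 'z'; c++)
        char_class[c] = CAT_LETTER;
    for (c = 'A'; c <= 'Z'; c++)
        char_class[c] = CAT_LETTER;
    for (c = '0'; c <= '9'; c++)
        char_class[c] = CAT_DIGIT;
    char_class[' ']  = CAT_SPACE;
    char_class['\t'] = CAT_SPACE;

    /* The 128 extra LETTER entries: every byte with bit 7 set. */
    for (c = 128; c < 256; c++)
        char_class[c] = CAT_LETTER;
}

/* Callers pass the byte as an unsigned value (0..255), never a plain
 * char that might be negative, so the index stays inside the table. */
int is_name_byte(int c)
{
    assert(c >= 0 && c <= 255);
    return (char_class[c] & (CAT_LETTER | CAT_DIGIT)) != 0;
}
```

After `init_char_class()`, both bytes of the UTF-8 sequence 0xc8 0xbb are classified as letters, while ';' is not, so a multi-byte character would pass through a name scanner unchanged.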

     
  • IMO, any program should support any filename the system supports. When ls and cd support it, so should our linker.
    But if this is a bigger issue, we might postpone it, and use your patch for the release, so it at least isn't a silent failure any more.
    Of course, IMO, the linker should also be able to deal with colons, slashes and backslashes in filenames and only treat the one that is a directory separator on the system as a directory separator.

    Also, IMO, we shouldn't mess with filenames ourselves as much as we do here. Better to just use standard functions, such as basename() and dirname(). Unfortunately, while these are POSIX functions, they are AFAIK not available on Windows. Windows has _splitpath() instead, though.
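As a sketch of the POSIX route, the wrappers below call basename() and dirname() from <libgen.h>. The wrapper names `base_of` and `dir_of` and the fixed 1024-byte scratch buffer are assumptions made for this example; the input is copied first because POSIX allows both functions to modify their argument. This does not cover Windows, where _splitpath() would be needed instead.

```c
#include <assert.h>
#include <libgen.h>
#include <stdio.h>

/* Hypothetical wrapper: writes the final path component of `path` to `out`. */
void base_of(const char *path, char *out, size_t outsz)
{
    char tmp[1024];

    assert(path != NULL && out != NULL && outsz > 0);
    snprintf(tmp, sizeof tmp, "%s", path);      /* work on a private copy */
    snprintf(out, outsz, "%s", basename(tmp));
}

/* Hypothetical wrapper: writes the directory part of `path` to `out`. */
void dir_of(const char *path, char *out, size_t outsz)
{
    char tmp[1024];

    assert(path != NULL && out != NULL && outsz > 0);
    snprintf(tmp, sizeof tmp, "%s", path);      /* work on a private copy */
    snprintf(out, outsz, "%s", dirname(tmp));
}
```

On a POSIX system only '/' is treated as a separator here, so "build/test:.rel" splits into "build" and "test:.rel", and a backslash or colon in the name is left alone.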

    Philipp

     
  • Ben Shi
    2015-03-25

    • Category: --> sdld
     
  • Ben Shi
    2015-07-13

    Fixed in revision #9284.

    testü.rel, Ȼyx.rel, test:.rel, and test\ .rel are all supported.

    test/.rel is not, since Unix does not allow '/' in a file name.
    test;.rel is not, since sdld uses ';' for comments.

     
    Last edit: Ben Shi 2015-07-13
  • Ben Shi
    2015-07-13

    • status: open --> closed-fixed
    • assigned_to: Ben Shi