On Thu, 19 May 2005, Martin Leopold wrote:
> Hi all.
> I posted a comment on problems with filename handling during debug on
> May 3rd, regarding bug no. 1068030. I asked for some comments and posted
> a hack for discussion with no response. This time my problem is in
> sdcdb, but the problem is the same and I think it would be advantageous
> to have a consistent solution.
I don't think adding the hyphen to the permitted identifier characters is
a good idea, since it is also used by the subtraction operator. It also
leaves the same problem for all the other special characters.
> The issue that has popped up a few times is the use of "special" chars in
> filenames, and by special I mean period, hyphen and other rather
> common chars. To me it doesn't seem like the code handles this
> gracefully, which is highly annoying.
> Specifically, I have two issues after compiling with --debug and trying
> to load the output in sdcdb:
> 1. sdcdb tries to load a .ihx/.asm/.c file based on the "base" of the
> name given on the commandline. If the filename has any special chars
> this fails horribly: none of the .ihx, .asm, or .c files is loaded
> properly. The filenames are constructed naively by looking for the first
> period, which fails if there is more than one period in the name.
> 2. The second problem is slightly more complicated. As far as I can tell
> the problem lies in the way the .adb/.cdb files mention file names.
> It seems to me that the filenames are abstracted into "modules" and the
> problem is that in the .cdb/.adb files the original filenames are
> mangled to suppress certain chars. So my file "app.mangle" is turned
> into a module called "app_mangle". This turns out to be a problem when
> sdcdb tries to load app_mangle.c which does not exist.
> This mangling is not consistent: the linker "L" statements use the
> original names without mangling.
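To make the first issue concrete, here is a minimal sketch in Python (not
sdcdb's actual code; the function names and the example file name are made
up for illustration) of cutting at the first period versus the last one:

```python
def base_first_period(filename):
    # Naive approach described in issue 1: cut at the first period.
    return filename.split(".", 1)[0]


def base_last_period(filename):
    # Cut at the last period only, so periods inside the name survive.
    head, sep, _ext = filename.rpartition(".")
    return head if sep else filename


name = "app.mangle.ihx"  # hypothetical file name
print(base_first_period(name))
print(base_last_period(name))
```

With the first function the base collapses to "app", so the derived
.ihx/.asm/.c names all point at files that do not exist. The second keeps
"app.mangle" intact.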
> So back to the question from the previous mail: How is this problem
> solved most elegantly? It seems to me that most of the problems with
> filenames I've run into so far have been caused by mixing filenames,
> which can contain all kinds of nasty chars, with identifiers, that
> don't. So mangling filenames into "modules" seems like a good idea, but
> this needs to be consistent through all the tools (asxxx, sdcdb, ...)
> and there must be some unique way of identifying each module.
> After reading some of the code my suggestion is to extend the .adb/.cdb
> format and to refer to real file names only once and then use the
> "module names" as handles to those file names. Specifically, I would
> extend the "M" lines to contain the module name and the corresponding
> filename.
There is still the problem that this is not an invertible mapping. If a
project had two source files, poor-name.c and poor_name.c, both files
would be labelled as module "poor_name".
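A quick Python sketch of the collision. The replacement rule here is my
assumption (anything that is not a letter, digit or underscore becomes an
underscore), and mangle_underscore is a made-up name, not a function in
sdcc:

```python
import re


def mangle_underscore(stem):
    # Assumed rule: replace every non-identifier character with "_".
    return re.sub(r"[^A-Za-z0-9_]", "_", stem)


print(mangle_underscore("poor-name"))
print(mangle_underscore("poor_name"))
```

Both calls give the same module name, so nothing downstream can tell which
source file was meant.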
I think instead of mangling by replacing special characters with
underscores, they should be escaped with a C-style backslash and octal
number. The assembler's lexer can be updated to handle the backslash in an
identifier without any ambiguity because a backslash following an
identifier is not currently allowed in its grammar. I assume sdcdb could
appropriately demangle, but this is not part of the sdcc package that I am
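As a rough sketch of the escaping idea, in Python rather than in the
assembler's own code: every byte that is not a letter, digit or underscore
is written as a backslash followed by three octal digits. The fixed
three-digit width, the UTF-8 encoding step and the names escape_name and
unescape_name are all my assumptions for illustration.

```python
import re

# Bytes allowed to pass through unchanged (assumed identifier alphabet).
IDENT = b"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_"


def escape_name(stem):
    # Escape each non-identifier byte as a backslash plus three octal digits.
    out = []
    for byte in stem.encode("utf-8"):
        if byte in IDENT:
            out.append(chr(byte))
        else:
            out.append("\\%03o" % byte)
    return "".join(out)


def unescape_name(module):
    # Reverse the escaping: each backslash-octal triple becomes one byte.
    raw = re.sub(
        rb"\\([0-7]{3})",
        lambda m: bytes([int(m.group(1), 8)]),
        module.encode("ascii"),
    )
    return raw.decode("utf-8")


for stem in ("poor-name", "poor_name", "app.mangle"):
    escaped = escape_name(stem)
    print(stem, escaped, unescape_name(escaped) == stem)
```

Because the backslash itself is not an identifier character, it is always
escaped too, so a backslash in the output can only ever start an escape and
the mapping can be inverted. The hyphen and the underscore now produce
different module names.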
> I couldn't find any documentation on the .adb/.cdb fileformats so I took
> some notes while reading the code. The following is my understanding of
> the format - maybe this should be completed and make its way to the
> documentation somewhere?
There is a link on the SDCC home page to "CDB Format" under
Documentation. Here's the URL: