From: Jan J. <je...@po...> - 2009-09-04 22:40:53
William S Fulton wrote:
> Jan Jezabek wrote:
>
>> William S Fulton wrote:
>>
>>> Hi Jan
>>>
>>> I was just looking at your case-insensitive patches for COM. This
>>> approach stores the lowercase version of the symbol name in the symbol
>>> table. I was thinking a better approach would be to store the name
>>> without changing the case as by bastardising the original name it
>>> makes it user unfriendly, eg say we have a class MyAceStuff, all
>>> usage, documentation, messages regarding this symbol will now be
>>> myacestuff, which is a lot less readable or identifiable or
>>> recognisable in the code being wrapped.
>>>
>>> I suggest instead you leave sym:name as it is and modify the code
>>> where it gets added to the symbol table to make a case insensitive
>>> symbol name check so that symbols with the same name but different
>>> case then can't be added to the symbol table. You are likely then not
>>> to have to make the changes you had to in lang.cxx nor com.cxx.
>>>
>>> One minor and rather ironical point, is to please stick with the
>>> convention of all lowercase names for attribute names. So
>>> "sym:casePreservingName" should be "sym:casepreservingname" :) Short
>>> attribute names are best for performance, but if you go ahead with my
>>> proposed changes, you won't need this attribute name anymore, so that
>>> saves thinking about a good shorter name!
>>>
>>> William
>>>
>> Hi William,
>>
>> Thanks for looking at that :). You're most probably right; I considered
>> both approaches and thought at first that the approach I've taken might
>> be the better option, but after going through all this 'lang.cxx' hell I
>> don't think so anymore :]
>>
>> One problem that I have is how to organize the 'current' symbol tables
>> (and symbol tables in general) so that it uses case insensitive
>> matching.
>> One obvious solution is to store the symbols there in lower
>> case, but this has one drawback that code like this might not work:
>> Getattr(current, Getattr(n, "sym:name")). I haven't seen code like this,
>> so I don't know if it's a big problem.
>>
>> Another option is to create a special type of a hash map, which computes
>> the hash based on a lower case version of the string and performs
>> case-insensitive comparison of the keys. If the target is case
>> insensitive, then this hash map would be used for the symtabs. I think
>> this might be the most elegant solution and unless you see some major
>> problems with it, I would like to go ahead and try it.
>>
>
> I think the right way is to handle this entirely in symbol.c when adding a
> symbol to the table. If the case insensitive symbol already exists then
> fail to add it. With regard to implementation, it is a bit trickier than
> it could be because the code uses Getattr() to see if the symbol already
> exists. I'd think about adding in a Getattrcase() to DOH which ignores
> case, like strcasecmp() does for strcmp(). Hopefully you only need to
> change a few Getattr() to Getattrcase() calls in symbol.c and that is it.
>
> William
>

Hi William,

I have looked at the possibility of creating Getattrcase. There are a
few problems here. First of all, there would also need to be a
Setattrcase, so that when an element is stored its case is not taken
into account when generating the hash. But the bigger problem is that
the DOH objects compute their hash values themselves. Getattrcase and
Setattrcase could take extra care if they get a DOH String or C string
as argument, but that special handling would be lost as soon as resize
is called on the hash map and the keys are rehashed. For a moment I was
thinking about creating a new DOH type representing a case-insensitive,
case-preserving string (that is, a string whose hashval and cmp
operations do not take case into account, but whose case is preserved),
but this might be overkill.

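For illustration, the two operations that such a case-insensitive,
case-preserving key would need can be sketched in plain C. This is a
standalone sketch under stated assumptions, not DOH code: ci_hash and
ci_cmp are hypothetical names, the hash is an FNV-1a style hash picked
only as an example, and only ASCII case folding via tolower() is
considered.

```c
#include <ctype.h>

/* Hypothetical helper (not part of DOH): FNV-1a style hash computed over
 * the lower-cased bytes of s, so spellings that differ only in case
 * produce the same hash value. */
unsigned long ci_hash(const char *s) {
  unsigned long h = 2166136261UL;
  for (; *s; ++s) {
    h ^= (unsigned long) tolower((unsigned char) *s);
    h *= 16777619UL;
  }
  return h;
}

/* Hypothetical helper (not part of DOH): compare two strings ignoring
 * case. Returns <0, 0 or >0 in the manner of strcmp(). */
int ci_cmp(const char *a, const char *b) {
  while (*a && tolower((unsigned char) *a) == tolower((unsigned char) *b)) {
    ++a;
    ++b;
  }
  return tolower((unsigned char) *a) - tolower((unsigned char) *b);
}
```

With these, "MyAceStuff" and "myacestuff" hash to the same value and
compare equal, while the stored string itself keeps its original case.
Whether such functions could be wired into a new DOH type as its hashval
and cmp operations is not something this sketch tries to answer.
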
Maybe something like this would be acceptable: the symtab would store a
lower case version of the symbol name, while sym:name would preserve the
case. This way the output would still be user-friendly (preserving the
original case). There would however be the inconsistency that I
mentioned earlier, that Getattr(current, Getattr(n, "sym:name")) would
not work. This should not be much of a problem however - the only
application of this idiom that I can think of is to find the symbol that
is being overloaded by the symbol at node n, but for this we have
"sym:overname".

What do you think?

Thanks,
Jan
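
A rough sketch of the last proposal, in case it helps. It is a toy
table in plain C, not SWIG's symbol.c: struct sym, struct symtab,
lower_key, symtab_find and symtab_add are all hypothetical names, and
the fixed-size array with a linear search stands in for a real hash.
The "key" field plays the role of the lower-cased symtab entry and the
"name" field plays the role of sym:name.

```c
#include <ctype.h>
#include <string.h>

#define MAX_SYMS 64
#define MAX_NAME 64

/* Hypothetical toy symbol entry. */
struct sym {
  char key[MAX_NAME];   /* lower-cased name, used only for lookup */
  char name[MAX_NAME];  /* original spelling, as sym:name would keep it */
};

struct symtab {
  struct sym syms[MAX_SYMS];
  int count;
};

/* Copy src into dst in lower case (at most MAX_NAME - 1 characters). */
void lower_key(char *dst, const char *src) {
  size_t i = 0;
  for (; src[i] && i < MAX_NAME - 1; ++i)
    dst[i] = (char) tolower((unsigned char) src[i]);
  dst[i] = '\0';
}

/* Look a symbol up by any spelling; returns the entry or NULL. */
const struct sym *symtab_find(const struct symtab *t, const char *name) {
  char key[MAX_NAME];
  int i;
  lower_key(key, name);
  for (i = 0; i < t->count; ++i)
    if (strcmp(t->syms[i].key, key) == 0)
      return &t->syms[i];
  return NULL;
}

/* Add a symbol. Returns 0 on success, -1 if the name is too long, the
 * table is full, or a name differing only in case is already present. */
int symtab_add(struct symtab *t, const char *name) {
  struct sym *s;
  if (strlen(name) >= MAX_NAME || t->count >= MAX_SYMS || symtab_find(t, name))
    return -1;
  s = &t->syms[t->count++];
  lower_key(s->key, name);
  strcpy(s->name, name);
  return 0;
}
```

After adding "MyAceStuff", adding "myacestuff" is rejected, and a lookup
by any spelling returns an entry whose name is still "MyAceStuff". A
direct comparison of the stored key against the original spelling would
miss, which is the same inconsistency as with
Getattr(current, Getattr(n, "sym:name")): lookups have to go through the
lower-casing helper.
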