From: R. M. <rm...@mh...> - 2005-05-13 22:37:08
On Fri, 13 May 2005 20:14:58 +0100, Christophe Rhodes wrote:
> rm...@fa... writes:
>
>> [...snip ...]
>> - first: throw an error/condition (maybe only iff the codepoints of
>>   the string's characters don't fit into 8 bits (taking advantage
>>   of the code point overlay of ASCII, latin-1 and Unicode)).
>
> OK, this is basically what will happen, apart from the DWIMish part,
> which is not really in the spirit of SBCL.  (For more information on
> this, you might take a look at the PRINCIPLES file distributed with
> the sbcl source.)
>
>> - second: create the hash of the internal representation of the string.
>>   After all the md5 algorithm is _always_ sensitive to the binary
>>   representation. Will there be a possible case in SBCL where the
>>   binary representation of two strings equal under string= will
>>   differ?
>
> Yes, there are.  BASE-STRINGs and (ARRAY CHARACTER (*)) can have the
> same contents with different in-memory representations.

How is that? Doesn't a base-string consist entirely of base-chars (with
code-points <= 127)? How _can_ i construct an array of characters with
code-point <= 127 that has a different internal representation?

>>   If not then i'd vote for this solution. One drawback of this
>>   solution: the md5 sum of a string would not necessarily match
>>   that of a file containing the same string.
>
> Right, and with the added potential confusion over different in-memory
> representations, this is a non-starter.
>
>> - third (just to make a mathematician nervous): have
>>   md5sum-sequence accept a keyword :encoding. This would actually
>>   be backward compatible and (with :default as the default
>>   encoding) would work as solution 2.
>
> Backwards compatibility, at this point, is not even remotely
> interesting to me.

Backwards compatibility isn't what i aim for, but if it can be achieved
easily i don't mind :)

> I'd much rather get an interface that we can
> collectively be happy to support in the long term, than deal with the
> headaches involved in supporting half-baked ones.

I _hope_ i don't sound stubborn but i somehow fail to see the
half-bakedness of this interface [1]. Somehow i expect

  (sb-md5:md5sum-sequence "Blah")

to act equivalent to

  (sb-md5:md5sum-sequence (string-to-octets "Blah" :encoding :default))

but i might be wrong.

> I'm sorry if that
> causes our current users problems, but our current users have made
> that choice by using a 0.x piece of software where the development
> culture is not focussed on interface stability.  (Again, see the
> PRINCIPLES file, as well as, well, five years of commit logs :-)

I never complained about non-stable interfaces. I'm actually _very_
thankful for the Unicode support. I was taken by surprise because
the semantics of a public function changed (and from what i understand
the missing warning/error was just an accident).

Thanks
  RalfD

[1] i purposely won't quote "the other" languages as a reference or guide.

> Cheers,
>
> Christophe
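
The two points under discussion (strings that are string= but stored differently, and a digest that depends on the chosen encoding) can be tried at the REPL. What follows is only a rough sketch, assuming an SBCL built with Unicode support and the sb-md5 contrib. The helper md5-of-string and the special variables are hypothetical names made up for illustration, and sb-ext:string-to-octets with its :external-format keyword stands in for the :encoding keyword proposed in the message:

```lisp
;; Sketch only: assumes SBCL with Unicode support and the sb-md5 contrib.
(require :sb-md5)

;; Hypothetical helper (not part of sb-md5): encode explicitly, then hash.
(defun md5-of-string (string external-format)
  "Encode STRING with EXTERNAL-FORMAT and return the MD5 digest of the octets."
  (sb-md5:md5sum-sequence
   (sb-ext:string-to-octets string :external-format external-format)))

;; The same four characters, requested in two different string types.
(defparameter *base* (coerce "Blah" 'simple-base-string))
(defparameter *wide* (coerce "Blah" '(simple-array character (*))))

;; The contents compare equal under STRING= ...
(string= *base* *wide*)

;; ... while the element types (and hence the storage) may differ.
(list (array-element-type *base*) (array-element-type *wide*))

;; Hashing explicitly encoded octets does not depend on the representation.
(equalp (md5-of-string *base* :utf-8)
        (md5-of-string *wide* :utf-8))

;; It does depend on the encoding once a non-ASCII character is involved.
(defparameter *non-ascii* (string (code-char 229))) ; U+00E5
(equalp (md5-of-string *non-ascii* :latin-1)
        (md5-of-string *non-ascii* :utf-8))
```

Encoding first and hashing second keeps the digest comparable with the md5 sum of a file written in the same encoding, which is the drawback raised against hashing the internal representation.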