From: Ingvar <in...@he...> - 2007-11-22 07:45:18
Paul Khuong writes:
> On 11/22/07, Richard M Kreuter <kr...@pr...> wrote:
> > "Nikodemus Siivola" writes:
> >
> > > * How does Python represent strings? The strings you are using below are
> > > full unicode strings. If the file is actually ASCII, you're wasting
> > > a lot of space -- using base-strings should do better.
> >
> > Is there a convenient way to get base-strings from a stream? It looks
> > as though specifying an element-type of BASE-CHAR to OPEN gets a stream
> > whose element-type is CHARACTER, and that ANSI-STREAM-READ-LINE has no
> > provision for constructing base-strings to return. Is this the intended
> > behavior?
> >
> > If not, the hack below seems sufficient to get READ-LINE to return
> > base-strings on streams opened with (OPEN ... :ELEMENT-TYPE 'BASE-CHAR).
> > (Of course, if the file contains extended-chars, errors will be
> > signalled.) Is this worth putting in after this month's release?
>
> This seems like a bad thing: strings returned by read-line can be
> side-effected. Having read-line sometimes return normal strings (in
> which any character can be stored) and sometimes base-strings (which
> have restrictions on the element characters) depending on the input
> stream's element-type, which is quite a dynamic property, would make
> type propagation and writing correct code harder. If it is actually
> useful, a completely separate function would be a better idea.

To be honest, I thought READ-LINE did return base-strings when reading
from a file with element-type BASE-CHAR, and ordinary strings when
reading from files filled with unrestricted characters. I don't have
any code whose correctness relies on this, but I do have code that
uses more storage than necessary as a result (of course, I only
specify element-types where I know I need them).

I think this is one of those cases where one can argue about what is
expected, what is correct, and what is best for the implementor
without reaching any firm conclusion.

//Ingvar