From: Pascal J.B. <pj...@in...> - 2004-03-15 20:54:14
Don Cohen writes:
> I think it was clear what I want.
> I want a character set with 256 characters, and an encoding that
> maps those 1-1 with all of the 8 bit combinations.
> I want no multibyte characters.

There is no encoding that I know of that maps all 256 8-bit values
1-1. For example, ISO-8859-* does not map values between 128 and 159.
Even the various IBM-PC OEM encodings leave some codes unmapped.

You could invent your own mapping though.

> > > All I really want to do is read characters without any errors.
> >
> > Then provide an error-free file!
> I consider a file containing 256 different bytes to be error free.
>
> > > Somehow I got the impression from impnotes that this could be done
> > > with
> > > (with-open-file
> > >  (f "/root/http/log" :external-format
> > >     (make-encoding :input-error-action :ignore)) ...)
> > > but I still get
> > > *** - READ from #<INPUT BUFFERED FILE-STREAM CHARACTER #P"/root/http/log" @2>: \illegal character #\U008D
> > >
> > > It would be even better if I could then figure out what the bytes had
> > > been from the characters (1-1 mapping).
> >
> > You cannot because it depends on the encoding used.
>
> But make-encoding should be able to make an encoding that does
> whatever I want, shouldn't it?

You did not pass it a :charset argument with a set of 256 characters.
Why should it not return an error?

Note that since there exists no character with a code equal to 141,
YOU will have to choose an existing character to put in slot 141 of
the :charset argument.

Remember that in C, char is actually an integer type! So it's not
surprising that you can store 141 in a char variable. But that does
not mean that there exists any character that has a code equal to 141
in any official encoding.

You can still invent your own encoding, but you have to specify it
(with the :charset key)!

> > I have the feeling that what you want to do is to actually read binary
> > bytes instead of reading characters.
>
> That's what I have been forced to do so far.
> But that seems absurd. Why not create the obvious encoding and use
> characters?

A good reason not to would be that it is not portable across
Common-Lisp implementations... Reading bytes is pure Common-Lisp.

-- 
__Pascal_Bourguignon__                     http://www.informatimago.com/
There is no worse tyranny than to force a man to pay for what he doesn't
want merely because you think it would be good for him.--Robert Heinlein
http://www.theadvocates.org/
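
As an illustration of the "reading bytes is pure Common-Lisp" point above, here is a minimal sketch of reading a file as raw octets, with no external format involved. The function names (read-file-octets, octets-to-string, string-to-octets) are hypothetical, made up for this example. The two conversion helpers additionally assume that the implementation has a character for every code from 0 to 255; the standard does not guarantee that, so treat that part as implementation-dependent.

```lisp
;; Sketch only: the function names below are hypothetical, not part of
;; any library.

(defun read-file-octets (path)
  "Return the contents of PATH as a vector of (unsigned-byte 8)."
  (with-open-file (in path :element-type '(unsigned-byte 8))
    ;; On a binary stream, FILE-LENGTH is measured in stream elements,
    ;; which are octets here.
    (let ((octets (make-array (file-length in)
                              :element-type '(unsigned-byte 8))))
      (read-sequence octets in)
      octets)))

(defun octets-to-string (octets)
  "Map each octet to the character with the same code.
ASSUMPTION: CODE-CHAR returns a character for every code in 0..255."
  (map 'string #'code-char octets))

(defun string-to-octets (string)
  "Inverse of OCTETS-TO-STRING, for strings whose codes are all below 256."
  (map '(vector (unsigned-byte 8)) #'char-code string))
```

With these, (octets-to-string (read-file-octets "/root/http/log")) would give one character per byte, and string-to-octets recovers the original bytes from it. Only the first function relies on nothing beyond the standard; the round trip through characters depends on the assumption stated above.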