From: jcm@SDF.ORG - 2011-12-04 15:48:44
|
Is there a tutorial somewhere on writing non-ascii characters to a file?

I have latin-1 encoded characters in a db, and I'm not sure how to write those out. I can use flexi-streams:string-to-octets and back, but how can I use with-open-file to write the correct characters to a text file?

Sorry if this is a simple question, but this is my first foray into encoding, and I'm sticking with SBCL to do it.
From: <ded...@ya...> - 2011-12-04 15:54:38
|
>Is there a tutorial somewhere on writing non-ascii characters to a file?
>
>I have latin-1 encoded characters in a db, and I'm not sure how to write
>those out. I can use flexi-streams:string-to-octets and back, but how can
>I use with-open-file to write the correct characters to a text file?
>
>Sorry if this is a simple question, but this is my first foray into
>encoding, and I'm sticking with SBCL to do it.
>

You can either pass :external-format or use :element-type. In the first case you specify the encoding; in the second you simply use (unsigned-byte 8) and write octets.
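A minimal sketch of both options, assuming SBCL. The sample string and the /tmp paths are made up for illustration, and SB-EXT:STRING-TO-OCTETS stands in here for flexi-streams:string-to-octets.

```lisp
;; Hypothetical sample data: "Jos" followed by e-acute (code point 233).
(defparameter *name* (format nil "Jos~C" (code-char 233)))

;; Option 1: a character stream. WITH-OPEN-FILE does the encoding,
;; selected with :EXTERNAL-FORMAT.
(with-open-file (out "/tmp/name-latin1.txt"
                     :direction :output
                     :if-exists :supersede
                     :external-format :latin-1)
  (write-string *name* out))

;; Option 2: a binary stream. We encode the string ourselves and
;; write the resulting octets.
(with-open-file (out "/tmp/name-octets.txt"
                     :direction :output
                     :if-exists :supersede
                     :element-type '(unsigned-byte 8))
  (write-sequence (sb-ext:string-to-octets *name* :external-format :latin-1)
                  out))
```

Both files should end up with the same four bytes, since the same encoding is applied in each case; only the place where the encoding happens differs.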
From: Nikodemus S. <nik...@ra...> - 2011-12-04 16:23:04
|
On 4 December 2011 17:48, <jc...@sd...> wrote:

> Is there a tutorial somewhere on writing non-ascii characters to a file?
>
> I have latin-1 encoded characters in a db, and I'm not sure how to write
> those out. I can use flexi-streams:string-to-octets and back, but how can
> I use with-open-file to write the correct characters to a text file?

So, first you get stuff from the DB.

If it arrives as raw octets, you can just pass :ELEMENT-TYPE '(UNSIGNED-BYTE 8) to WITH-OPEN-FILE to write it to file using its original (in this case latin-1) encoding. If you must write it in, say, :UTF-8 instead, you need to first convert it to a string using e.g. SB-EXT:OCTETS-TO-STRING with :EXTERNAL-FORMAT :LATIN-1 and then proceed as in the next paragraph.

If it arrives as a correctly decoded string, pass :EXTERNAL-FORMAT :UTF-8 to WITH-OPEN-FILE instead. (Or :LATIN-1, or whatever encoding you want to write it in.)

If it arrives as an /incorrectly/ decoded string, then you need to figure out where it goes wrong, and make sure the right external format is used there.

Cheers,

 -- nikodemus
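The raw-octets-to-UTF-8 path described above might look roughly like this. It is a sketch only: the function name, the sample octets and the /tmp path are hypothetical, and it assumes the DB really hands back latin-1 bytes.

```lisp
;; Hypothetical helper: decode latin-1 octets into a Lisp string, then
;; write that string to PATH encoded as UTF-8. Returns the decoded string.
(defun latin-1-octets-to-utf-8-file (octets path)
  (let* ((vector (coerce octets '(vector (unsigned-byte 8))))
         (string (sb-ext:octets-to-string vector :external-format :latin-1)))
    (with-open-file (out path
                         :direction :output
                         :if-exists :supersede
                         :external-format :utf-8)
      (write-string string out))
    string))

;; Sample input: "Jos" plus e-acute as latin-1 bytes (made-up data).
(latin-1-octets-to-utf-8-file #(74 111 115 233) "/tmp/name-utf8.txt")
```

The COERCE is there because SB-EXT:OCTETS-TO-STRING expects a vector of (UNSIGNED-BYTE 8), which a literal like #(74 111 115 233) is not. If the string is already correctly decoded, only the WITH-OPEN-FILE part is needed.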
From: jcm@SDF.ORG - 2011-12-06 00:04:25
|
Thank you both for your answers. The initial issue I had was quickly solved.

I'm trying to use this knowledge for a web app as well. I have to present a web page in a number of languages, and accept input. I might even have to allow for (as an example) someone at an Arabic-configured machine to enter their name in Mandarin. I would assume I need everything in UTF-8 at that point. But how can I get those characters to appear correctly in the terminal, file or web browser?

For latin-1, it's fine. I can see the Brazilian characters in my user's name. But will the same approach handle non-roman characters just as easily?