From: Sam S. <sd...@gn...> - 2002-09-11 13:33:39
> * In message <A159B2B20DBCD211B722006008964A7501AB1D77@EXCH-SVR>
> * On the subject of "RE: RE: [clisp-list] CLISP an German charcters"
> * Sent on Wed, 11 Sep 2002 09:59:57 +0200
> * Honorable jen...@hm... writes:
>
> I have a function:
> (defun get-content-from-file (filename)
>   (with-open-file
>       (stream filename)
>     (let ((lines))
>       (do ((line (read-line stream) (read-line stream nil 'eof)))
>           ((eq line 'eof) 'eof)
>         (setq lines (append lines (list (trim line)))))
>       (reduce #'(lambda (s1 s2) (concatenate 'string s1 " " s2)) lines))))
>
> (defun trim (s)
>   "Spaces, Tabs and newlines stripped from the beginning and the end"
>   (string-trim '(#\Space #\Tab #\Newline) s))

replace APPEND with PUSH and then NREVERSE for improved performance.

> and get a text with german chars like this: 'L\204nge'. If I type I get
> a 'Länge' in this emacs-buffer.
>
> With some texts I get an error:
> *** - Character #\u2022 cannot be represented in the character set
> CHARSET:CP850
> May be it's this character: '\225'

when do you get the error?
during input from the file? (type "help" at the break prompt).
if yes, you should look at CUSTOM:*DEFAULT-FILE-ENCODING*
<http://clisp.cons.org/impnotes.html#def-file-enc>
or supply :EXTERNAL-FORMAT argument to WITH-OPEN-FILE.

if the error occurs when the output of GET-CONTENT-FROM-FILE is printed
to the *STANDARD-OUTPUT*, you need to modify CUSTOM:*TERMINAL-ENCODING*
<http://clisp.cons.org/impnotes.html#term-enc>.

See also <http://clisp.cons.org/clisp.html#opt-enc>.

> Because I see #\u I guess unicode and try the following function:
> (defun get-content-from-file (filename)
>   (with-open-file
>       (stream filename :external-format (ext:make-encoding :charset "UTF-8"
>                                          :input-error-action :ignore))

never, ever use :*-ERROR-ACTION :IGNORE unless you know for certain that
this is what you really want.
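
For illustration, the PUSH/NREVERSE suggestion and the :EXTERNAL-FORMAT advice might be combined roughly as follows. This is only a sketch: the optional EXTERNAL-FORMAT parameter, the NIL argument on the first READ-LINE and the empty-file guard are additions that are not in the original function.

```lisp
(defun trim (s)
  "Strip spaces, tabs and newlines from both ends of S."
  (string-trim '(#\Space #\Tab #\Newline) s))

;; EXTERNAL-FORMAT is an added, illustrative parameter; :DEFAULT is the
;; standard value that leaves the choice to the implementation.
(defun get-content-from-file (filename &optional (external-format :default))
  "Return the trimmed lines of FILENAME joined by single spaces."
  (with-open-file (stream filename :external-format external-format)
    (let ((lines '()))
      ;; PUSH adds to the front of the list in constant time, whereas
      ;; APPEND copies the whole accumulated list on every iteration.
      (do ((line (read-line stream nil 'eof) (read-line stream nil 'eof)))
          ((eq line 'eof))
        (push (trim line) lines))
      ;; The lines were collected in reverse, so restore file order once.
      (setq lines (nreverse lines))
      ;; REDUCE without an initial value would call the two-argument
      ;; lambda with no arguments on an empty list, hence the guard.
      (if lines
          (reduce (lambda (s1 s2) (concatenate 'string s1 " " s2)) lines)
          ""))))
```

In CLISP the second argument could be, for example, (ext:make-encoding :charset "ISO-8859-1"), assuming the file really is Latin-1; with no :INPUT-ERROR-ACTION given, a wrong guess is not silently swallowed the way it is with :IGNORE.
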
>     (let ((lines))
>       (do ((line (read-line stream) (read-line stream nil 'eof)))
>           ((eq line 'eof) 'eof)
>         (setq lines (append lines (list (trim line)))))
>       (reduce #'(lambda (s1 s2) (concatenate 'string s1 " " s2)) lines))))
>
> Now I don't get an error, but I don't get any non-ascii character. They
> are ignored (I think because :input-error-action :ignore)

are you sure your file is encoded with UTF-8 and not, say, ISO-8859-1?

-- 
Sam Steingold (http://www.podval.org/~sds) running RedHat7.3 GNU/Linux
<http://www.camera.org> <http://www.iris.org.il> <http://www.memri.org/>
<http://www.mideasttruth.com/> <http://www.palestine-central.com/links.html>
Who is General Failure and why is he reading my hard disk?