From: Matt K. <kau...@cs...> - 2013-05-19 22:20:57
Thank you very much for getting back to me so quickly.

That helps, but I'd like to be able to read in code 13 using the function
READ-CHAR, and I don't see how to do that in CLISP, even though I can do it
in Allegro CL, CCL, CMUCL, LispWorks, SBCL, and GCL.

My sample file contains six characters as follows, where the line break
consists of #\Return followed by #\Newline:

"x
y"

Below is a log showing how I get #\Return (code 13) using read-char in
those other lisps, but not CLISP. Any suggestions?

But first, I should mention that I tried the following in CLISP (though the
error probably won't surprise you) -- maybe you can suggest an alternative?

[1]> (deftype octet () '(unsigned-byte 8))
OCTET
[2]> (with-open-file (in #P"test0" :element-type 'octet)
       (read-char in))

*** - READ-CHAR on #<INPUT BUFFERED FILE-STREAM (UNSIGNED-BYTE 8) #P"test0"> is illegal
The following restarts are available:
ABORT          :R1      Abort main loop
Break 1 [3]>

Anyhow, here is the log promised above.

dunnottar:~/temp% acl9
International Allegro CL Enterprise Edition
9.0 [64-bit Linux (x86-64)] (Jul 11, 2012 14:33)
Copyright (C) 1985-2012, Franz Inc., Oakland, CA, USA.  All Rights Reserved.

This development copy of Allegro CL is licensed to:
   [TC20122] University of Texas

;; Optimization settings: safety 1, space 1, speed 1, debug 2.
;; For a complete description of all compiler switches given the
;; current optimization settings evaluate (EXPLAIN-COMPILER-SETTINGS).
CL-USER(1): (setq *locale* (find-locale "C"))
#<locale "C" [:LATIN1-BASE] @ #x100004067b2>
CL-USER(2): (let (ch)
              (with-open-file (in #P"test0")
                (loop while (setq ch (read-char in nil))
                      collect (char-code ch))))
(34 120 13 10 121 34 13 10)
CL-USER(3): (exit)
; Exiting
dunnottar:~/temp% ccl
Starting 64-bit CCL
Welcome to Clozure Common Lisp Version 1.9-dev-r15542M-trunk  (LinuxX8664)!
? (setq ccl:*default-file-character-encoding* :iso-8859-1)
:ISO-8859-1
? (let (ch)
    (with-open-file (in #P"test0")
      (loop while (setq ch (read-char in nil))
            collect (char-code ch))))
(34 120 13 10 121 34 13 10)
? (quit)
dunnottar:~/temp% cmucl
CMU Common Lisp snapshot-2013-05 (20D Unicode), running on dunnottar
With core: /v/filer4b/v11q001/acl2/lisps/cmucl-snapshot-2013-05-20D-Unicode/lib/cmucl/lib/lisp-sse2.core
Dumped on: Sat, 2013-05-11 11:18:42-05:00 on lorien2
See <http://www.cmucl.org/> for support information.
Loaded subsystems:
    Unicode 1.29 with Unicode version 6.2.0
    Python 1.1, target Intel x86/sse2
    CLOS based on Gerd's PCL 2010/03/19 15:19:03
* (setq *default-external-format* :iso-8859-1)
:ISO-8859-1
* (let (ch)
    (with-open-file (in #P"test0")
      (loop while (setq ch (read-char in nil))
            collect (char-code ch))))
(34 120 13 10 121 34 13 10)
* (quit)
;
dunnottar:~/temp% lispworks
Starting 64-bit Lispworks
LispWorks(R): The Common Lisp Programming Environment
Copyright (C) 1987-2012 LispWorks Ltd.  All rights reserved.
Version 6.1.1
Saved by kaufmann as lw-terminal-only, at 26 Nov 2012 15:23
User kaufmann on dunnottar

CL-USER 1 > (setq stream::*default-external-format* '(:LATIN-1 :EOL-STYLE :LF))
(:LATIN-1 :EOL-STYLE :LF)

CL-USER 2 > (defun our-file-encoding (pathname ef-spec buffer length)
              (system:merge-ef-specs ef-spec '(:LATIN-1 :EOL-STYLE :LF)))
OUR-FILE-ENCODING

CL-USER 3 > (setq system::*file-encoding-detection-algorithm*
                  '(our-file-encoding))
(OUR-FILE-ENCODING)

CL-USER 4 > (let (ch)
              (with-open-file (in #P"test0")
                (loop while (setq ch (read-char in nil))
                      collect (char-code ch))))
(34 120 13 10 121 34 13 10)

CL-USER 5 > (quit)
dunnottar:~/temp% sbcl
Starting 64-bit SBCL
This is SBCL 1.1.4, an implementation of ANSI Common Lisp.
More information about SBCL is available at <http://www.sbcl.org/>.

SBCL is free software, provided as is, with absolutely no warranty.
It is mostly in the public domain; some portions are provided under
BSD-style licenses.  See the CREDITS and COPYING files in the
distribution for more information.
* (setq sb-impl::*default-external-format* :iso-8859-1)
:ISO-8859-1
* (let (ch)
    (with-open-file (in #P"test0")
      (loop while (setq ch (read-char in nil))
            collect (char-code ch))))
(34 120 13 10 121 34 13 10)
* (quit)
dunnottar:~/temp% gcl
GCL (GNU Common Lisp)  2.6.8 CLtL1    May 11 2013 16:43:51
Source License: LGPL(gcl,gmp), GPL(unexec,bfd,xgcl)
Binary License:  GPL due to GPL'ed components: (XGCL READLINE UNEXEC)
Modifications of this banner must retain notice of a compatible license
Dedicated to the memory of W. Schelter

Use (help) to get some basic information on how to use GCL.
Temporary directory for compiler files set to /tmp/

>(let (ch)
   (with-open-file (in #P"test0")
     (loop while (setq ch (read-char in nil))
           collect (char-code ch))))
(34 120 13 10 121 34 13 10)
>(quit)
dunnottar:~/temp% clisp
  i i i i i i i       ooooo    o        ooooooo   ooooo   ooooo
  I I I I I I I      8     8   8           8     8     o  8    8
  I  \ `+' /  I      8         8           8     8        8    8
   \  `-+-'  /       8         8           8      ooooo   8oooo
    `-__|__-'        8         8           8           8  8
        |            8     o   8           8     o     8  8
  ------+------       ooooo    8oooooo  ooo8ooo   ooooo   8

Welcome to GNU CLISP 2.49 (2010-07-07) <http://clisp.cons.org/>

Copyright (c) Bruno Haible, Michael Stoll 1992, 1993
Copyright (c) Bruno Haible, Marcus Daniels 1994-1997
Copyright (c) Bruno Haible, Pierpaolo Bernardi, Sam Steingold 1998
Copyright (c) Bruno Haible, Sam Steingold 1999-2000
Copyright (c) Sam Steingold, Bruno Haible 2001-2010

Type :h and hit Enter for context help.

[1]> (setq custom:*default-file-encoding*
           (ext:make-encoding :charset 'charset:iso-8859-1
                              :line-terminator :unix))
#<ENCODING CHARSET:ISO-8859-1 :UNIX>
[2]> (let (ch)
       (with-open-file (in #P"test0")
         (loop while (setq ch (read-char in nil))
               collect (char-code ch))))
(34 120 10 121 34 10)
[3]> (quit)
Bye.
dunnottar:~/temp%

Thanks --
-- Matt

From: "Pascal J. Bourguignon" <pj...@in...>
Date: Sun, 19 May 2013 23:16:14 +0200
Organization: Informatimago

Matt Kaufmann <kau...@cs...> writes:

> Hi --
>
> I maintain an application that is built on top of Common Lisp, which
> expects iso-8859-1 for the character encoding.  I'd like to set things
> up so that on a linux system, my application reads characters from a
> file exactly as they were written.  But my attempt to do so failed,
> dropping a #\Return character, as illustrated by the log below.  Is
> there something simple I can do to accomplish my goal, or else might
> that be the case in future CLISP releases?  Note that I did see the
> following note at http://www.clisp.org/impnotes/clhs-newline.html:
>
>   Justification. Unicode Newline Guidelines say: “Even if you know
>   which characters represents NLF on your particular platform, on
>   input and in interpretation, treat CR, LF, CRLF, and NEL the
>   same. Only on output do you need to distinguish between them.”
>
> However, I'm hoping that since I'm using iso-8859-1 rather than a utf
> encoding, maybe that justification doesn't need to apply.

No, it still applies.

Since you want to read codes such as 13 and 10, you should specify an
element type of (unsigned-byte 8):

[pjb@kuiper :0.0 ~]$ clisp -ansi -norc -q
[1]> (deftype octet () '(unsigned-byte 8))
OCTET
[2]> (with-open-file (in #P"~/tmp/misc/wang.dos" :element-type 'octet)
       (let ((buffer (make-array 256 :element-type 'octet)))
         (read-sequence buffer in)
         (search #(13 10) buffer)))
29
[3]> (quit)
[pjb@kuiper :0.0 ~]$

--
__Pascal Bourguignon__                     http://www.informatimago.com/
A bad day in () is better than a good day in {}.
You can take the lisper out of the lisp job, but you can't take the lisp
out of the lisper (; -- antifuchs

_______________________________________________
clisp-list mailing list
cli...@li...
https://lists.sourceforge.net/lists/listinfo/clisp-list
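
One possible way to reconcile the two messages above is to open the file
with an element type of (unsigned-byte 8), as Pascal suggests, and then turn
each octet into a character with CODE-CHAR. The following is only a minimal
sketch and is not taken from the thread. READ-FILE-AS-LATIN-1 is a
hypothetical helper name, and the sketch assumes that CODE-CHAR maps the
codes 0..255 to the corresponding ISO-8859-1 characters, which is
implementation-dependent (it holds where character codes are Unicode code
points).

```lisp
;; Sketch only: READ-FILE-AS-LATIN-1 is a hypothetical helper,
;; not part of CLISP or of the application discussed above.
(defun read-file-as-latin-1 (pathname)
  "Return the contents of PATHNAME as a string, one character per octet."
  (with-open-file (in pathname :element-type '(unsigned-byte 8))
    (let* ((octets (make-array (file-length in)
                               :element-type '(unsigned-byte 8)))
           ;; READ-SEQUENCE returns the index of the first element
           ;; that was not filled in.
           (end (read-sequence octets in)))
      ;; Assumption: CODE-CHAR of 0..255 yields the ISO-8859-1 character.
      ;; No newline conversion happens, since the stream is binary.
      (map 'string #'code-char (subseq octets 0 end)))))
```

With this, (map 'list #'char-code (read-file-as-latin-1 #P"test0")) should
report every octet of the file, including each 13, because the binary stream
bypasses the character-level newline handling. The cost is that the whole
file is held in memory; a per-character variant would call READ-BYTE and
CODE-CHAR in a loop instead.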