From: Vidar L. <vi_...@ya...> - 2007-01-24 08:44:21
Hi, Klaus.

Although the Wikipedia article http://en.wikipedia.org/wiki/ISO-2022-JP indicates (in some parts) that ISO-2022 uses 8 bits (also linking to http://en.wikipedia.org/wiki/JIS_X_0201), RFC 1468 (http://www.ietf.org/rfc/rfc1468.txt) describes the 1-byte Roman characters as 7-bit. Thus, 168 in 1-byte mode is not allowed, and even the command-line 'iconv' will complain about it.

The correct sequence of bytes in ISO-2022-JP for your example, using the two-byte encoding of the katakana character, would be as follows (taking care to end the string in ASCII). Here 27 36 66 switches to the two-byte JIS X 0208 set, 37 35 is the katakana character, and 27 40 66 switches back to ASCII:

(map 'list 'char-code
  (convert-string-from-bytes
    #(65 92 27 40 74 92 27 36 66 37 35 27 40 66)
    (make-encoding
      :charset "ISO-2022-JP"
      :line-terminator :unix
      :input-error-action #\Null
      :output-error-action #\Null)))

Giving

(65 92 165 12451)

/vidar

On 23 Jan 2007, at 15:50, Klaus Ebbe Grue wrote:

> Hi cli...@li...,
>
> If I type
>
> (map 'list 'char-code
>   (convert-string-from-bytes #(65 92 27 40 74 92 168)
>     (make-encoding
>       :charset "ISO-2022-JP"
>       :line-terminator :unix
>       :input-error-action #\Null
>       :output-error-action #\Null)))
>
> then I get
>
> (65 92 165 0)
>
> The input #(65 92 27 40 74 92 168) consists of an ascii A (65), an
> ascii backslash (92), a shift to Japanese sequence (27 40 74), a
> yen sign (92) and a katakana glyph (168).
>
> The output (65 92 165 0) consists of an ascii A (65), an ascii
> backslash (92), a yen sign (165) and an invalid character (0).
>
> So the shift to Japanese sequence (27 40 74) seems to be
> recognized, but katakana (code 161-223) does not seem to be
> recognized.
>
> Any suggestions?
>
> Cheers,
> Klaus
>
> ---
>
> [grue@thor grue]$ clisp --version
> GNU CLISP 2.41 (2006-10-13) (built 3370084031) (memory 3370084648)
> Software: GNU C 3.2 20020903 (Red Hat Linux 8.0 3.2-7)
> gcc -g -O2 -W -Wswitch -Wcomment -Wpointer-arith -Wimplicit -Wreturn-type
> -Wmissing-declarations -Wno-sign-compare -O2 -fexpensive-optimizations
> -falign-functions=4 -DUNICODE -DDYNAMIC_FFI -I. -x none libcharset.a
> libavcall.a libcallback.a
> /usr/local/lib/libreadline.so -Wl,-rpath -Wl,/usr/local/lib -lncurses -ldl
> -L/usr/local/lib -lsigsegv -lc -L/usr/X11R6/lib
> SAFETY=0 HEAPCODES LINUX_NOEXEC_HEAPCODES GENERATIONAL_GC SPVW_BLOCKS
> SPVW_MIXED TRIVIALMAP_MEMORY
> libsigsegv 2.4
> libreadline 5.1
> Features:
> (READLINE REGEXP SYSCALLS I18N LOOP COMPILER CLOS MOP CLISP ANSI-CL
> COMMON-LISP LISP=CL INTERPRETER SOCKETS GENERIC-STREAMS LOGICAL-PATHNAMES
> SCREEN FFI GETTEXT UNICODE BASE-CHAR=CHARACTER PC386 UNIX)
> C Modules: (clisp i18n syscalls regexp readline)
> Installation directory: /usr/local/lib/clisp/
> User language: ENGLISH
> Machine: I686 (I686) thor.yoa.dk [127.0.0.1]
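
For anyone who wants to cross-check the two byte sequences from this thread outside CLISP, here is a minimal sketch in Python. It assumes Python's standard "iso2022_jp" codec is a reasonable stand-in for the CLISP "ISO-2022-JP" charset; the helper names decode_codes and is_seven_bit are made up for this illustration and are not part of either message. The corrected sequence is expected to decode to the same four code points as the CLISP call, while the sequence containing 168 should be rejected because it is not 7-bit.

```python
ESC = 27  # escape byte that introduces each ISO-2022-JP charset switch

# Corrected sequence from the reply: ASCII, JIS X 0201 Roman, JIS X 0208, back to ASCII.
valid = bytes([65, 92, ESC, 40, 74, 92, ESC, 36, 66, 37, 35, ESC, 40, 66])

# Sequence from the original question: ends with the 8-bit byte 168.
invalid = bytes([65, 92, ESC, 40, 74, 92, 168])


def decode_codes(data):
    """Decode ISO-2022-JP bytes and return the list of code points
    (the Python analogue of (map 'list 'char-code ...))."""
    return [ord(ch) for ch in data.decode("iso2022_jp")]


def is_seven_bit(data):
    """True if every byte fits in 7 bits, as RFC 1468 requires."""
    return all(b < 0x80 for b in data)


print(decode_codes(valid))
print(is_seven_bit(valid), is_seven_bit(invalid))

try:
    decode_codes(invalid)
except UnicodeDecodeError as err:
    # A strict decoder refuses the 8-bit byte instead of substituting a character.
    print("rejected:", err.reason)
```

Unlike the CLISP call with :input-error-action #\Null, this sketch uses strict decoding, so the bad byte raises an error rather than turning into a 0.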