From: Brian W. <war...@lo...> - 2003-07-17 02:09:50
> First I found that the failure of the CS.latin1.s1v testcase is due to a
> trivial error in the file. A patch is attached below. With this the
> 'make onetest' as suggested by Brian a bit earlier in the thread should
> succeed. Unfortunately only without my charset.patch. So we are where we
> started.

Ah, thanks for the catch. I forgot that clearsigned messages must *always*
end with a newline, and if necessary GPG will add one to your message
before signing it. Fixed.

> The function which causes all the trouble is (standard-display-european).

Ahhhh. That helps immensely. The docstring says this:

  Semi-obsolete way to toggle display of ISO 8859 European characters.

  This function is semi-obsolete; if you want to do your editing with
  unibyte characters, it is better to `set-language-environment' coupled
  with either the `--unibyte' option or the EMACS_UNIBYTE environment
  variable, or else customize `enable-multibyte-characters'.

  With prefix argument, this command enables European character display
  if arg is positive, disables it otherwise. Otherwise, it toggles
  European character display.

  When this mode is enabled, characters in the range of 160 to 255
  display not as octal escapes, but as accented characters. Codes 146
  and 160 display as apostrophe and space, even though they are not the
  ASCII codes for apostrophe and space.

  Enabling European character display with this command noninteractively
  from Lisp code also selects Latin-1 as the language environment, and
  selects unibyte mode for all Emacs buffers (both existing buffers and
  those created subsequently). This provides increased compatibility for
  users who call this function in `.emacs'.

I think it's the "selects unibyte mode for all emacs buffers" that's the
big difference between your environment and what the test cases were
doing. The \201 bytes are what emacs uses in multibyte buffers to mark
latin-1 characters.
If that same bytestream were interpreted in unibyte mode, you'd probably
see the spurious \201 bytes that you get.

The "right" fix will probably involve handling unibyte buffers in some
special manner (perhaps when the language environment is set to something
like Latin-1), or maybe dealing specially with the conversion from unibyte
to multibyte and back. The temporary buffer used to read the output of GPG
is probably the critical point. I'll try to look into this over the next
week.

You might also see if there is a less-obsolete replacement for
standard-display-european that meets your needs. If you use 'C-x RET l' to
set-language-environment to Latin-1 (which would leave emacs in multibyte
mode), does Gnus blow up? Do your latin-1 encoded documents look correct?
I have a feeling it will be easier to make everything work correctly if we
can keep the buffers in multibyte mode. Unibyte mode loses the meta-data
that declares which character set is being used.

cheers,
 -Brian
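
The \201 effect described in this message can be illustrated with a small
sketch. It is written in Python rather than Emacs Lisp and is not code from
Emacs or from the package under discussion. It assumes the internal
multibyte encoding of Emacs 20/21, in which each Latin-1 character in the
range 160 to 255 is stored as a leading byte \201 (0x81) followed by the
character's own byte. The names `to_multibyte` and `show_unibyte` are
hypothetical helpers made up for this illustration.

```python
# Assumption: the Emacs 20/21 internal encoding marks Latin-1 characters
# with a leading byte of octal 201 (0x81). Illustration only.
LEADING_BYTE = 0o201


def to_multibyte(latin1_bytes: bytes) -> bytes:
    """Mimic the multibyte representation: prefix bytes 160..255 with \\201."""
    out = bytearray()
    for b in latin1_bytes:
        if b >= 0xA0:
            out.append(LEADING_BYTE)
        out.append(b)
    return bytes(out)


def show_unibyte(raw: bytes) -> str:
    """Render bytes roughly as a unibyte Latin-1 display would.

    Bytes 128..159 have no Latin-1 glyph, so they are shown as octal
    escapes; every other byte is shown as its Latin-1 character.
    """
    parts = []
    for b in raw:
        if 0x80 <= b < 0xA0:
            parts.append("\\%03o" % b)
        else:
            parts.append(chr(b))
    return "".join(parts)


original = "caf\u00e9".encode("latin-1")  # one accented character
internal = to_multibyte(original)          # the multibyte-style bytestream
shown = show_unibyte(internal)             # same bytes read as unibyte
print(internal)  # b'caf\x81\xe9'
```

Here `shown` contains a literal \201 in front of the accented character,
which is the kind of spurious byte that appears when multibyte data ends up
in a unibyte buffer.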