I've spent most of the afternoon trying to figure out where the possible
failure cases are, and I have to admit that I'm more confused than when I
started.
One possibility is during clearsigning: the bytes sent to GPG for signing
must match the bytes sent to the recipient when they verify the signature.
Variables like coding-system-for-write (and functions like
set-buffer-process-coding-system) control any potential conversion from the
internal representation used in the plaintext buffer to the bytes sent to GPG.
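To illustrate why the bytes must match exactly, here's a small Python
sketch (my illustration, not mailcrypt code): the same text encoded two
different ways produces different byte strings and therefore different
digests, so a signature made over one encoding will never verify against
the other.

```python
import hashlib

text = "Grüße"  # sample text containing non-ASCII characters

latin1_bytes = text.encode("latin-1")  # one byte per accented letter
utf8_bytes = text.encode("utf-8")      # two bytes per accented letter

# A signature covers a digest of the exact bytes; different encodings
# give different digests, so verification fails on the other encoding.
print(latin1_bytes != utf8_bytes)                         # True
print(hashlib.sha1(latin1_bytes).hexdigest() ==
      hashlib.sha1(utf8_bytes).hexdigest())               # False
```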
One well-known problem is the MUA being clever about determining character
sets and performing some kind of conversion after the signature has been
made. Quoted-printable is the usual culprit here, and it gets worse if some
MTA along the way decides it needs to convert the text somehow. The usual
fix is to preemptively encode the message down to some kind of
least-common-denominator level first, so that no gateway will have a reason
to alter it. And/or to use a MIME sub-part so you get a faithful 8-bit
reproduction of the plaintext at the far end. But I don't think that's what
your bug is.
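To show what that quoted-printable damage looks like, here's a small
Python sketch (again, not mailcrypt code): a latin-1 body containing a
u-umlaut gets its 0xfc byte rewritten as the escape sequence =FC, so the
bytes on the wire no longer match the bytes that were signed.

```python
import quopri

# A latin-1 body containing a u-umlaut (the single byte 0xfc)
body = "Für".encode("latin-1")

# A MUA or gateway re-encoding the body as quoted-printable after the
# signature was made changes the transmitted bytes.
qp = quopri.encodestring(body)
print(b"=FC" in qp)   # True: the 0xfc byte became the sequence =FC
print(qp != body)     # True: the wire bytes differ from the signed bytes
```

The content survives (quoted-printable decodes back to the original), but
a signature computed over the raw 0xfc byte won't match what a verifier
sees unless its MUA decodes first.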
Best I can determine, emacs' native character representation gets most
latin1 accented letters as 8-bit characters. When you tell emacs to use
"no-conversion" or "raw-text" as a process-write coding system, a u-umlaut
comes out as 0xfc. In contrast, if you use "C-x RET c utf-8 M-| hd RET" to
write out the buffer with a UTF-8 coding system, the u-umlaut is written as
two bytes, 0xc3 0xbc.
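Those two byte sequences are easy to reproduce outside of emacs; a small
Python sketch (my illustration) shows the same character under the two
encodings:

```python
# The same u-umlaut as a raw latin-1 byte versus UTF-8 bytes, mirroring
# what the raw-text and utf-8 coding systems would emit:
u = "\u00fc"  # LATIN SMALL LETTER U WITH DIAERESIS
print(u.encode("latin-1").hex())  # fc    (one byte, like raw-text)
print(u.encode("utf-8").hex())    # c3bc  (two bytes)
```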
Now, if the buffer that you're doing the mc-sign command from has somehow
gotten its coding-system-for-write set to use something other than the
default no-conversion, then message will get translated into some other
coding system before being handed off to GPG, and the wrong data will be
signed. I haven't been able to figure out how it could get set this way. I
don't think set-language-environment would change the value used for
process output, though.
Now I understand why some people push Unicode so strongly :).
Can you give me a detailed description of what you did to cause the invalid
signature? It sounds like it is happening on the sending side instead of the
receiving end, which is outside the current unit test framework (it can only
test decryption and verification right now).
PS: I've made a list of 5 places where non-ascii text might interact badly
with mc-gpg.el. Two are in key lookup, two are in clearsigned messages, and
one is in normal encrypted messages. I'd be grateful if anyone who regularly
uses keys which are named with non-ascii characters, or
clearsigns/encrypts/verifies messages with non-ascii characters, could take
a look at these and let me know if mailcrypt does the right thing.
1: Try using non-ascii key names as search strings. Might need to set
coding-system-for-write in mc-gpg-lookup-key.
2: Verify that non-ascii key names are found and displayed properly, even
when searching with ascii strings.
3: Verify that clearsigned messages with non-ascii characters (in buffers
marked correctly, with a non-null coding system) are passed correctly to
GPG.
4: Verify that clearsigned messages with non-ascii characters (in buffers
marked incorrectly, with a raw-text coding system) are passed correctly.
5: Verify that encrypted messages with non-ascii plaintext are retrieved
correctly.