From: Peter P. <ro...@ri...> - 2020-05-26 13:57:47
On Tue, May 26, 2020 at 11:36:59AM +0100, Richard Kimber via Fetchmail-users wrote:
> I'm running fetchmail -> procmail -> local directories read by Claws on
> Ubuntu 20.04
>
> I'm having some problems filtering email messages by Subject using
> procmail
>
> "Normal" text Subjects of the form:
> Subject: News on Coronavirus
> are no problem. But there are a lot of messages nowadays that are
> displayed in Claws as:
>
> Subject: Update from GOV.UK – Coronavirus Statutory Sick Pay ... etc
>
> but where the subject in the message itself is actually in the form:
>
> Subject:
> =?UTF-8?Q?Update_from_GOV.UK_=E2=80=93_Coronavirus_Statutory_Sick_P?=
> =?UTF-8?Q?ay_Rebate_Scheme:_service_availability_and_issues?=
> (the above is all one line)
>
> Does anyone know if there is some generic way of ensuring that messages
> are passed to procmail in the "normal" format so that I may reliably and
> straightforwardly construct procmail matching rules (i.e. without
> having to inspect the message source each time)?

Unfortunately, this is the most "normal" of the formats that you are going to get with e-mail. None of the software that generates or reformats e-mail messages along the way has any idea what other software will be used to accept, transfer, or process the message, so none of it can rely on anything else being able to handle more than "simple" plain text in the US-ASCII character set.

Thus, if somebody wants to include, say, a pretty long dash or chevrons in a header, the only way to do that is to use a special encoding format, the MIME "encoded-word" syntax. It says, in effect: "this here is a character that is outside the US-ASCII character set, I'll tell you how it is represented in UTF-8, but since its UTF-8 representation itself consists of bytes that are outside US-ASCII, here's a marker that says that the following text is an ASCII-safe encoding of those bytes". In your example the marker is =?UTF-8?Q?, where Q selects a quoted-printable-style encoding (=E2=80=93 is the three UTF-8 bytes of the en dash); a B in the same position would mean base64.
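To make the encoding concrete, here is a minimal sketch of decoding the Subject quoted above with Python's standard email.header module. The helper name decode_subject is made up for this illustration; it is not something procmail or fetchmail provides, and this is only one way of doing the decoding.

```python
from email.header import decode_header, make_header

# The raw Subject value from the quoted message: two adjacent
# encoded-words separated by a single space.
raw = (
    "=?UTF-8?Q?Update_from_GOV.UK_=E2=80=93_Coronavirus_Statutory_Sick_P?= "
    "=?UTF-8?Q?ay_Rebate_Scheme:_service_availability_and_issues?="
)


def decode_subject(value: str) -> str:
    """Hypothetical helper: turn a MIME-encoded header value into text.

    decode_header() splits the value into (bytes, charset) parts and
    make_header() joins them back into a single readable string.
    """
    return str(make_header(decode_header(value)))


if __name__ == "__main__":
    print(decode_subject(raw))
```

A filter that matches on the decoded text rather than on the raw header then no longer depends on how the sender's software happened to split and encode the Subject.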
Anything that wants to process e-mail messages the way they are to be displayed to the end-user should be able to decode MIME-encoded content, including MIME-encoded header fields. And... unfortunately, here we come once again to the "procmail is kind of outdated, it does not really support a couple of things that are sort of essential in the current world, is there any way you might switch to other message-filtering software?"

For example, it looks like at least courier-maildrop may be configured to parse MIME-encoded headers:

http://courier-mail-server.10983.n7.nabble.com/Filtering-mails-with-UTF-8-headers-td18177.html

(and yes, I know that switching the mail filtering program is not trivial at all, but procmail also has other problems, some of which people have become accustomed to working around for decades...)

G'luck,
Peter

-- 
Peter Pentchev ro...@ri... ro...@de... pp...@st...
PGP key: http://people.FreeBSD.org/~roam/roam.key.asc
Key fingerprint 2EE7 A7A5 17FC 124C F115 C354 651E EFB0 2527 DF13