From: deiva s. <chi...@gm...> - 2009-03-10 04:00:10
Hi,

When a header containing control characters like form feed/page eject (^L), vertical tab (^K) etc. is canonicalized using the relaxed canonicalization algorithm, the vertical tab is converted to a space.

But the RFC specifies only stripping/reducing the WSP, unfolding of headers etc. It says nothing about control characters. So, in that case, shouldn't canonicalization leave the control characters undisturbed?

Header passed to the canonicalization:

Content-Type:*^L*multipart/alternative;*^K*
boundary=^M"----=_NextPart_000_0000_JLZRZLJO.OULDWYUC"

Result of relaxed canonicalization using sendmail's dkim code:

content-type:multipart/alternative;
boundary=^M"----=_NextPart_000_0000_JLZRZLJO.OULDWYUC"

So, what is the procedure that needs to be followed when control characters are encountered?

Thanks,
Deiva Shanmuagm
From: Murray S. K. <ms...@se...> - 2009-03-10 04:22:55
On Tue, 10 Mar 2009, deiva shanmugam wrote:
> When an header containing control characters like form feed, page
> eject(^L) , vertical tab(^K) etc. is canonicalized using relaxed
> canonicalization algorithm, vertical tab is converted to a space.

^L and ^K are illegal in headers according to RFC5322. See section 2.2.

> But, according to the RFC, stripping/reducing the WSP, unfolding of headers
> etc. alone are being specified. Nothing regarding the special characters is
> being told. So, in that case, canonicalization shouldn't disturb the control
> characters?
>
> Header passed to the canonicalization:
>
> Content-Type:*^L*multipart/alternative;*^K*
> boundary=^M"----=_NextPart_000_0000_JLZRZLJO.OULDWYUC"

The ^L and ^K are not allowed in headers (see above). The ^M is only allowed if it folds headers, which means it would have to be followed by whitespace (section 2.2.3). In your example, it's not.

> Result of relaxed canonicalization using sendmail's dkim code:
>
> content-type:multipart/alternative;
> boundary=^M"----=_NextPart_000_0000_JLZRZLJO.OULDWYUC"
>
> So, what is the procedure that need to be followed, when control
> characters are encountered?

It's unspecified, since the standard doesn't allow them. I imagine the MTA is using the ^L as whitespace separating the header field name from its value so it just gets dropped; it's discarding the ^K since it's not allowed; and it's keeping the ^M as the next line isn't really a new header so it must be a continuation of the one previous.
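
To make the "unspecified" case concrete, here is a minimal Python sketch of the relaxed header steps as listed in RFC 4871, section 3.4.2, under one possible reading: only SP and HTAB count as WSP, so ^L, ^K and a bare ^M pass through untouched. This is not sendmail's dkim code, and the function names `relaxed_header` and `illegal_chars`, as well as the shortened boundary string, are hypothetical names chosen for illustration. The second helper flags the characters that RFC 5322 section 2.2 does not allow in a header, which a signer or verifier might want to detect before canonicalizing.

```python
import re

WSP = " \t"  # WSP is space and horizontal tab only; ^K and ^L are not WSP


def relaxed_header(raw):
    """Relaxed header canonicalization (RFC 4871, section 3.4.2).

    `raw` is one header field, possibly folded, without its final CRLF.
    Assumption: characters outside WSP are left exactly as they are.
    """
    name, _, value = raw.partition(":")
    # Unfold: a CRLF followed by WSP is a fold, so drop the CRLF.
    value = re.sub(r"\r\n(?=[ \t])", "", value)
    # Collapse each run of WSP into a single SP.
    value = re.sub(r"[ \t]+", " ", value)
    # Lowercase the name; drop WSP around the colon and at the end of the value.
    return name.strip(WSP).lower() + ":" + value.strip(WSP) + "\r\n"


def illegal_chars(raw):
    """Characters that are neither printable US-ASCII nor WSP after unfolding."""
    unfolded = re.sub(r"\r\n(?=[ \t])", "", raw)
    return [c for c in unfolded if c not in WSP and not (33 <= ord(c) <= 126)]


# Header modelled on the one in this thread (boundary shortened for the example).
raw = 'Content-Type:\x0cmultipart/alternative;\x0b\r\n boundary=\r"----=_NextPart"'
print(repr(relaxed_header(raw)))
print(illegal_chars(raw))  # ['\x0c', '\x0b', '\r']
```

With this reading the canonicalized header keeps all three control characters, which differs from the output reported above. Since the input is not a valid RFC 5322 header, neither result can be called correct; checking with something like `illegal_chars` first and refusing to sign may be the safer choice.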