When an HTTP request contains Unicode characters (for
example, in the content body payload of the request),
LWP::Protocol::http (and probably others) fails to
compute the content body length correctly.
I have found a public post from Sept 2001 that seems to
clearly describe the problem, though apparently no
permanent remedy was ever made to LWP:
http://archive.develooper.com/perl5-porters@perl.org/msg63478.html
The problem usually shows up as the Content-Length
header of the submitted request containing the number
of characters (not bytes). Yet the entire body still
gets sent, which amounts to more bytes than the
Content-Length header promises.
The remote webserver then only allows the promised
number of bytes to be read by the CGI script (or
whatever), causing the rest of the body to be discarded.
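The mismatch between the two counts can be reproduced outside LWP. The following is only a minimal sketch, not code taken from LWP: the sample string is made up, and it assumes a Perl with the Encode module available (core since 5.8, so not the 5.6 installation mentioned in this report).

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical request body: 6 characters, two of them non-ASCII
# (U+00E9 and U+263A).
my $body = "caf\x{e9} \x{263A}";

# length() on a character string counts characters. This is the
# value that ends up in Content-Length when the bug is triggered.
my $char_count = length $body;

# Encoding to UTF-8 first gives the octets that actually go over
# the wire; their length is what Content-Length should contain.
my $octets     = encode('UTF-8', $body);
my $byte_count = length $octets;

print "characters: $char_count\n";   # 6
print "bytes:      $byte_count\n";   # 9
```

Here U+00E9 takes two bytes and U+263A takes three in UTF-8, so the body is three bytes longer than the character count suggests.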
I've verified that this issue exists in LWP version 5.53
and also the current version (LWP version 5.65);
presumably it exists in all versions in between.
One real-world example of when this problem occurs is
when using Frontier::Client (an XMLRPC client that
uses LWP for sending requests and receiving responses),
and at least one of the RPC method's arguments contains
a Unicode/UTF-8 string. The remote XMLRPC server
interprets the operation as an invalid request because it
does not read the full body, causing it to miss
some of the closing XML tags of the request.
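The effect on the server side can be simulated without Frontier::Client or a network connection. This is a rough sketch under stated assumptions: the XML document and the method name "echo" are invented for illustration, the Encode module is assumed to be available, and the substr() call merely stands in for a server that stops reading after Content-Length bytes.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical XML-RPC call whose single argument contains
# non-ASCII text (U+00EF and U+263A).
my $xml = '<?xml version="1.0"?><methodCall><methodName>echo</methodName>'
        . "<params><param><value><string>na\x{ef}ve \x{263A}</string></value>"
        . '</param></params></methodCall>';

my $octets   = encode('UTF-8', $xml);  # what is sent on the wire
my $promised = length $xml;            # character count: the wrong Content-Length
my $actual   = length $octets;         # byte count: the right Content-Length

# Stand-in for the server: read only the promised number of bytes.
my $received = substr($octets, 0, $promised);
my $lost     = $actual - $promised;

print "bytes dropped: $lost\n";
print "closing tag intact: ",
      ($received =~ m{</methodCall>\z} ? "yes" : "no"), "\n";
```

With this input the two non-ASCII characters add three extra bytes, so the last three bytes of the closing tag never reach the XML parser.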
A patch that seems to resolve the problem for me (against /usr/lib/perl5/site_perl/5.6.*/LWP/Protocol/http.pm):