When annotations (revision id and author name) are returned in UTF-8, as Git does, their width must be measured in characters, not in bytes.
UTF-8 originally allowed a single character to be encoded in up to 6 bytes. RFC 3629 later restricted the encoding to a maximum of 4 bytes, but this does not change the following problem.
The width allotted to annotations by the source script is rather narrow, to leave maximum screen space for the source line. This means the annotations must be truncated to fit in their column. The present algorithm, reflecting the limited features of older VCSs like CVS, simply counts bytes and chops.
This is incorrect with UTF-8 because a character may span several bytes, and blind chopping can cut through the middle of a character's byte sequence, leaving an invalid UTF-8 sequence.
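
The failure can be reproduced with a few lines of Perl. This is only an illustrative sketch: the author name, the 5-column width and the helpers hex_bytes() and is_well_formed_utf8() are made up for the example and are not taken from the existing code.

```perl
use strict;
use warnings;

# Byte string as a VCS might return it: "André", where "é" is the
# two-byte UTF-8 sequence 0xC3 0xA9 (6 bytes for 5 characters).
my $author = "Andr\xC3\xA9";

# Byte-oriented truncation to a 5-column field, which is what
# length()/substr() do on an undecoded byte string.
my $chopped = substr($author, 0, 5);

# Illustrative helper: render a byte string as hexadecimal.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', ord } split //, $bytes;
}

# Illustrative helper: 1 if the bytes are well-formed UTF-8, else 0.
# utf8::decode() works in place, so it only touches the local copy.
sub is_well_formed_utf8 {
    my ($bytes) = @_;
    return utf8::decode($bytes) ? 1 : 0;
}

my $hex = hex_bytes($chopped);
print "$hex\n";                      # 41 6E 64 72 C3

my $status = is_well_formed_utf8($chopped) ? 'well-formed' : 'malformed';
print "$status\n";                   # malformed
```

The chopped string ends with the lead byte 0xC3 without its continuation byte, so anything that later treats the column as UTF-8 receives a broken sequence.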
A replacement for the length() function is needed, one that takes the value of the 'encoding' parameter into account. Unicode-related pragmas are not deemed appropriate, since they would create a dependency on Perl 5.12 and the output stream is not always Unicode.
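
One possible shape for such a replacement is sketched below. It is a suggestion under several assumptions, not the chosen fix: it relies on the Encode module (bundled with Perl since 5.8, so no pragma is involved), it assumes the 'encoding' parameter holds a name that Encode recognizes, and the names char_length() and char_truncate() are hypothetical.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical replacement for length(): number of characters in
# $bytes, given the encoding name configured for the VCS output.
sub char_length {
    my ($bytes, $encoding) = @_;
    # Unknown or missing encoding: fall back to the byte count.
    return length($bytes)
        unless defined $encoding && Encode::find_encoding($encoding);
    # Decode into Perl characters, then count them.
    return length(Encode::decode($encoding, $bytes));
}

# Hypothetical companion: keep at most $width characters and return
# the result as bytes in the same encoding, so the output stream
# still receives what it expects.
sub char_truncate {
    my ($bytes, $width, $encoding) = @_;
    return substr($bytes, 0, $width)
        unless defined $encoding && Encode::find_encoding($encoding);
    my $chars = Encode::decode($encoding, $bytes);
    return Encode::encode($encoding, substr($chars, 0, $width));
}

# Usage: "André Dupont" as UTF-8 bytes, cut to a 5-character column.
my $author = "Andr\xC3\xA9 Dupont";
my $count  = char_length($author, 'UTF-8');        # 12 characters, 13 bytes
my $column = char_truncate($author, 5, 'UTF-8');   # "Andr\xC3\xA9", 6 bytes
print "$count\n";
```

Note that this counts code points, not display columns: combining marks and double-width characters would still need separate handling if exact alignment matters.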