#47 DefaultLogger reading StandardOut with UTF8 encoding

Status: open
Owner: nobody
Labels: Core (11)
Priority: 5
Updated: 2009-05-29
Created: 2009-05-29
Creator: Austin Rappa
Private: No

The DefaultLogger uses the system locale's default encoding when reading the standard output of an executed task's process. In most cases this is perfectly acceptable, but when the captured output contains Cyrillic or East Asian characters and the system uses a US code page, the log's text can become corrupted. The alternatives are to change the code page on the command line first (under Vista) or to run NAnt using Microsoft's AppLocale (under XP and earlier), which has its own set of challenges.

For example, if we were working in a US code page and building something in another language such as Japanese, Simplified or Traditional Chinese, or Brazilian Portuguese, we might have an output event with the text:
"Iniciar a instalação do El Programo"
but since our US code page does not recognize the c with the cedilla or the a with the tilde, NAnt reads the process's standard output as:
"Iniciar a instala├˚±■║o do El Programo"
which, as you can see, could cause problems while reading, and in practice caused issues when attempting to save the log to a SQL database.
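
The corruption can be reproduced in isolation. The following is a minimal sketch, assuming the child process emits UTF-8 bytes. ISO-8859-1 stands in for the mismatched system code page (an assumption made so the snippet runs on any .NET runtime, which means the garbled characters differ from the log line above), and MojibakeDemo and DecodeUtf8BytesWith are hypothetical names, not NAnt code:

```csharp
using System;
using System.Text;

public static class MojibakeDemo {
    // Encodes the text as UTF-8 (what the child process is assumed to write)
    // and decodes those bytes with the encoding the reader happens to use.
    public static string DecodeUtf8BytesWith(string text, Encoding readerEncoding) {
        byte[] raw = Encoding.UTF8.GetBytes(text);
        return readerEncoding.GetString(raw);
    }

    public static void Main() {
        string original = "Iniciar a instala\u00E7\u00E3o";

        // Stand-in for the wrong code page: each UTF-8 byte becomes one character,
        // so the two-byte sequences for the accented letters split apart.
        Encoding wrong = Encoding.GetEncoding("iso-8859-1");
        string garbled = DecodeUtf8BytesWith(original, wrong);
        Console.WriteLine(garbled == original);   // False

        // Reading with the matching encoding restores the text.
        string restored = DecodeUtf8BytesWith(original, Encoding.UTF8);
        Console.WriteLine(restored == original);  // True
    }
}
```

The same text survives intact only when the reader's encoding matches the writer's, which is what the change described below relies on.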

To resolve this I made a tiny change to ExternalProgramBase.cs to read the process's standard output as UTF-8. Ideally, the desired encoding would be passed in via a command-line argument for the DefaultLogger, much like the MailLogger's 'MailLogger.body.encoding'. The changes were made in the StreamReaderThread_Output() method:

Line: 437
From: StreamReader reader = _stdOut;
To: StreamReader reader = new StreamReader(_stdOut.BaseStream, Encoding.UTF8);

Line: 450
From: StreamWriter writer = new StreamWriter(Output.FullName, doAppend);
To: StreamWriter writer = new StreamWriter(Output.FullName, doAppend, Encoding.UTF8);
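
An alternative to wrapping the base stream, on frameworks that provide it (.NET 2.0 and later), is to set the output encoding when the process is configured. This is a different technique from the patch above and is shown only as a sketch; Utf8ProcessSketch and CreateStartInfo are hypothetical names and are not part of NAnt:

```csharp
using System;
using System.Diagnostics;
using System.Text;

public static class Utf8ProcessSketch {
    // Builds start info whose redirected standard output is decoded as UTF-8.
    public static ProcessStartInfo CreateStartInfo(string fileName, string arguments) {
        ProcessStartInfo info = new ProcessStartInfo(fileName, arguments);
        info.UseShellExecute = false;        // redirection requires this to be false
        info.RedirectStandardOutput = true;  // expose the output via Process.StandardOutput
        info.StandardOutputEncoding = Encoding.UTF8;
        return info;
    }

    public static void Main() {
        // "tool.exe" is a placeholder; the process is not started here.
        ProcessStartInfo info = CreateStartInfo("tool.exe", "");
        Console.WriteLine(info.StandardOutputEncoding.WebName);
    }
}
```

Either way, the text is only read correctly if the external program really writes UTF-8.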

Here is the whole updated method:

/// <summary>
/// Reads from the stream until the external program is ended.
/// </summary>
private void StreamReaderThread_Output() {
    StreamReader reader = new StreamReader(_stdOut.BaseStream, Encoding.UTF8);
    bool doAppend = OutputAppend;

    while (true) {
        string logContents = reader.ReadLine();
        if (logContents == null) {
            break;
        }

        // ensure only one thread writes to the log at any time
        lock (_lockObject) {
            OutputWriter.WriteLine(logContents);
            if (Output != null) {
                StreamWriter writer = new StreamWriter(Output.FullName, doAppend, Encoding.UTF8);
                writer.WriteLine(logContents);
                doAppend = true;
                writer.Close();
            }
        }
    }
    OutputWriter.Flush();
}
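
The configurable encoding suggested above could be resolved from a name along these lines. This is only a sketch: EncodingOption and Resolve are hypothetical names, how the name would reach the logger (property or command-line argument) is not defined in this ticket, and falling back to the system default for a missing or unknown name is an assumption chosen to preserve today's behaviour:

```csharp
using System;
using System.Text;

public static class EncodingOption {
    // Maps an encoding name such as "utf-8" to an Encoding instance,
    // falling back to the system default when the name is missing or unknown.
    public static Encoding Resolve(string name) {
        if (string.IsNullOrEmpty(name)) {
            return Encoding.Default;
        }
        try {
            return Encoding.GetEncoding(name);
        } catch (ArgumentException) {
            // GetEncoding rejects names it does not recognize
            return Encoding.Default;
        }
    }

    public static void Main() {
        Console.WriteLine(Resolve("utf-8").WebName);
        Console.WriteLine(Resolve(null).WebName == Encoding.Default.WebName);
    }
}
```

The resolved value would then replace the hard-coded Encoding.UTF8 in both the StreamReader and StreamWriter constructors.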

I hope this helps!

Discussion

  • Austin Rappa
    2009-05-29

    Updated ExternalProgramBase.cs to read process standard output as UTF8