#325 Let's not binary normalise text

open
nobody
None
5
2012-04-11
2012-04-11
Michael Carden
No

At the moment, when Xena binary normalises anything, the source material is base64 encoded. This mostly makes sense when the source is undifferentiated binary data, but less so when the source is already text: XML metadata can be wrapped around the text itself, and the result remains readable in any text editor or viewer.

So how about we get the plaintext plugin to skip the base64 step when binary normalising text?

The same might be said of other ASCII or Unicode content, such as HTML or XML, but this may require further thought: there is something not quite right about having XML wrapped in XML, quite apart from the fact that the embedded content would cease to be well formed XML.
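To illustrate the difference being proposed, here is a minimal sketch in Python. The element names and attributes are illustrative only, not Xena's actual schema; the point is simply the contrast between a base64 payload and an XML-escaped plaintext payload that stays human-readable.

```python
import base64
from xml.sax.saxutils import escape

source = "Hello, <world> & friends"

# Current behaviour: base64-encode the source regardless of type.
# (Element names here are hypothetical, not Xena's real markup.)
payload_b64 = base64.b64encode(source.encode("utf-8")).decode("ascii")
wrapped_b64 = '<content encoding="base64">{}</content>'.format(payload_b64)

# Proposed behaviour for plain text: XML-escape the text and embed it
# directly, so any text editor or viewer can still read the payload.
wrapped_text = '<content>{}</content>'.format(escape(source))
```

Note that even the plaintext case needs escaping of `&`, `<` and `>`, which is part of why wrapping HTML or XML this way feels awkward: the embedded markup survives only in escaped form.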

Discussion

  • John
    2012-04-11

    Part of the reason behind binary normalising everything that comes through the door is to be able to show that the original data object has not been changed by anything we do to it.

    If we implement this suggestion, we would have to ensure that we can continue to show that the data objects are authentic.

    Would building a mechanism for showing this authenticity be more expensive/complex/risky than just maintaining the status quo?

    I am not suggesting that this feature request is a bad idea; I just want to make sure we remember the need to prove that we are doing the right thing.
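One way to show authenticity without base64, sketched below under stated assumptions: record a cryptographic digest of the original bytes at ingest and verify it on export. This is not Xena's actual mechanism, just an illustration of the idea; it only holds if the embedded text round-trips byte for byte (same character encoding, same line endings).

```python
import hashlib

def digest(data: bytes) -> str:
    # Digest recorded at ingest time (SHA-256 chosen for illustration).
    return hashlib.sha256(data).hexdigest()

original = b"the ingested text, byte for byte"
recorded = digest(original)

# At verification time, recompute the digest of the exported bytes and
# compare it with the recorded value. Any change to the data object,
# including whitespace or encoding drift, makes the check fail.
assert digest(original) == recorded
```

The cost question then becomes whether guaranteeing an exact byte round-trip through the XML wrapper is cheaper and less risky than keeping the base64 status quo.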