I also ran a test related to my earlier suggestion of a way to
preserve invalid input intact at output, later referred to as "UTF-8B"
by Andy Hefner, and it seems possible.

There seems to be little support for these encodings, and even less literature about them. I found a couple of references in the Unicode mailing lists and a few blog entries.
Searching for DC80 also turns up similar entries, since DC80-DCFF seems to be the favorite range of code points for encoding invalid sequences. I think I could easily code this, but there should be some consensus on its utility.
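
To make the DC80-DCFF idea concrete, here is a minimal sketch in Python, chosen only for illustration. It relies on Python's built-in "surrogateescape" error handler (PEP 383), which maps each undecodable byte 0x80-0xFF to the code point U+DC80-U+DCFF and reverses the mapping on output. The function names decode_utf8b and encode_utf8b are hypothetical, and this is not necessarily identical to the scheme tested above.

```python
# Sketch only: uses Python's "surrogateescape" handler as a stand-in for a
# UTF-8B style codec. The function names are hypothetical.

def decode_utf8b(data: bytes) -> str:
    """Decode UTF-8, mapping each invalid byte 0xNN to code point U+DCNN."""
    return data.decode("utf-8", errors="surrogateescape")


def encode_utf8b(text: str) -> bytes:
    """Encode to UTF-8, turning U+DC80..U+DCFF back into the raw bytes."""
    return text.encode("utf-8", errors="surrogateescape")


# 0xFF and a lone 0x80 are invalid in UTF-8; 0xC3 0xA9 is a valid "e acute".
raw = b"ok \xff\xc3\xa9\x80"
text = decode_utf8b(raw)

# The invalid bytes survive as lone surrogates and round-trip unchanged.
assert text == "ok \udcff\u00e9\udc80"
assert encode_utf8b(text) == raw
```

The appeal of this range is that well-formed UTF-8 never decodes to a surrogate code point, so the escaped bytes cannot collide with characters that came from valid input.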


