From: Dmitry V. <dv...@ro...> - 2004-04-12 18:17:26
Hi Joe,

Joe Piolunek wrote:
> Comparing the iso8859-5 chart I found (if it's correct) with Latin1 shows
> several of the "special" characters missing. By declaring
> characterTranslationMap and characterMap as type wchar_t instead of
> 'unsigned char', I had some success doing the character substitutions
> using the characters' unicode designation. It would only work, though, if
> the needed characters are available on the user's system.
>
> For example, a section something like this in XojPanel::buildCharacterMap
> could be added for each new region where a native user is willing to
> suggest substitute characters.
>
> // Due to differences in the encodings, some
> // characters need to be remapped for iso8859-5.
> if (qstrcmp (deviceEncoding, "ISO8859-5") == 0) {
>     characterMap[0x10] = 0x3c; // '<'
<skipped>
> // (test) remaps 'A' to one of the Katakana chars.
> // characterMap[0x41] = 0x30b7;
>
> All of the tests above worked for me. It could be possible to use unicode
> for all of the character substitutions, allowing special remaps for
> (hopefully just a few) different regions.
>
> What I've described sounds a little too easy. Do you know of any problems
> with doing the remapping using unicode designations?

Can you describe how exactly you implemented the translation of special
characters _and_ the encoding conversion?

    QChar(characterTranslationMap[ (unsigned char)string[i] ])

in the patch I sent you does that job pretty well, converting special chars
to unicode from latin1 rather than from e.g. iso8859-5. And having '>>' and
such in the resulting unicode string doesn't pose a problem at all. Mapping
special chars from a wchar_t characterTranslationMap, whose contents depend
on the selected deviceEncoding, seems a bit too much to me.

I'm reposting the patch to the mailing list for anyone else who may be
interested in it. It works without a hitch for iso8859-1 and iso8859-5,
showing all special characters correctly.
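[For readers of the archive: a minimal, self-contained sketch of the per-encoding translation-table approach discussed above. The names (characterMap, buildCharacterMap, translate) follow the thread, but this is illustrative code, not the actual xojpanel source, and it uses std::strcmp in place of Qt's qstrcmp.]

```cpp
#include <cstring>
#include <string>

// 256-entry table indexed by the raw byte from the device; its
// contents depend on the selected device encoding.
static wchar_t characterMap[256];

void buildCharacterMap(const char *deviceEncoding) {
    // Default: identity mapping (fine for plain ASCII/Latin-1 bytes).
    for (int i = 0; i < 256; ++i)
        characterMap[i] = static_cast<wchar_t>(i);

    // Due to differences in the encodings, some characters need to be
    // remapped for ISO 8859-5 (example values from the thread).
    if (std::strcmp(deviceEncoding, "ISO8859-5") == 0) {
        characterMap[0x10] = 0x3c;       // '<'
        // characterMap[0x41] = 0x30b7;  // (test) 'A' -> a Katakana char
    }
}

// Translate a raw byte string from the device into a wide string,
// one byte per character.
std::wstring translate(const std::string &raw) {
    std::wstring out;
    for (unsigned char c : raw)
        out += characterMap[c];
    return out;
}
```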
What bothers me is that the current conversion procedure will probably not
work for multibyte encodings if HP devices ever make use of them. I would
appreciate any information and ideas about that.

> I don't know which would be better - having the end-user specify the
> encoding ("xojpanel -devenc ISO8859-5"), or regional charset ("xojpanel
> -cyrillic"). What's your opinion on this?

The original idea was that -devenc could be useful not only with cyrillic
but with other non-latin charsets as well. We might make -cyrillic (or
-charset cyrillic) an alias for -devenc ISO8859-5 for convenience, assuming
that it's the only encoding used by HP for that purpose.

--
Dmitry Vukolov
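[Archive note: the multibyte concern above can be demonstrated with a short, hypothetical example. A byte-indexed table assumes one byte equals one character; in a multibyte encoding such as UTF-8, one character can span several bytes, so byte-wise mapping splits it into garbage.]

```cpp
#include <string>

// Byte-wise decoding, as a byte-indexed translation table would do it:
// each input byte becomes exactly one output character.
std::wstring byteWiseDecode(const std::string &raw) {
    std::wstring out;
    for (unsigned char c : raw)
        out += static_cast<wchar_t>(c);
    return out;
}
```

For example, the UTF-8 encoding of Cyrillic U+0416 is the two-byte sequence 0xD0 0x96; byte-wise decoding yields two separate characters (0x00D0 and 0x0096) instead of the single intended one, so a real multibyte conversion would need a stateful decoder rather than a lookup table.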