#86 Multi character-sets support...

open
nobody
None
5
2005-07-29
2005-07-29
Olivier Mascia
No

I have given the current beta, as available from the SF
download area, a good test. I find it superb and
appealing, though I can't do AnyThing with AnyEdit. At
the simplest, it doesn't understand my UTF-8 source code
files, while even Microsoft compilers (which are not my
favourites) do, as do many (though not all, I agree)
editors. How could you drop that support? This is the
first editor or GUI based on Scintilla which does not
have any (at least partial) support for multiple
character sets.

Supporting multiple encodings and character sets is
something which can affect many core parts of the
software. One would have expected this to be straight
and correct before anything fancier.

At the very least, reading and writing files in UTF-8
encoding, while converting to the current system
character set, would be a good first step. All you would
have to do when loading a UTF-8 file and finding that
some characters can't be mapped to the current system
character set is warn the user and ask if it is okay to
continue and replace unmappable characters with a
default one. The reverse action (saving from the editor
to file) has no such limitation: every character set
that could be used on screen converts losslessly to
UTF-8.
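
The load and save steps described above can be sketched
as follows. This is only a minimal illustration in
Python, not AnyEdit code: the function names load_utf8
and save_utf8 are hypothetical, "cp1252" is assumed as a
stand-in for the current system character set, and "?"
is assumed as the default replacement character.

```python
# Hypothetical sketch, not AnyEdit code.
# Assumption: "cp1252" stands in for the current system character set.
SYSTEM_CHARSET = "cp1252"


def load_utf8(raw, charset=SYSTEM_CHARSET, default="?"):
    """Convert UTF-8 file bytes to `charset`.

    Returns (converted_bytes, unmappable), where `unmappable` lists
    every character that had to be replaced with `default`.
    """
    text = raw.decode("utf-8")
    out = bytearray()
    unmappable = []
    for ch in text:
        try:
            out += ch.encode(charset)
        except UnicodeEncodeError:
            # No equivalent in the system character set:
            # remember the character and substitute the default.
            unmappable.append(ch)
            out += default.encode(charset)
    return bytes(out), unmappable


def save_utf8(buffer, charset=SYSTEM_CHARSET):
    """Convert editor bytes in `charset` back to UTF-8 for saving."""
    return buffer.decode(charset).encode("utf-8")


if __name__ == "__main__":
    data, lost = load_utf8("café € 日本".encode("utf-8"))
    if lost:
        # A real editor would ask the user here before continuing.
        print("Cannot map:", lost)
    print(save_utf8(data).decode("utf-8"))
```

A non-empty second return value from load_utf8 is the
point at which the editor would warn the user and ask
whether to continue. Saving needs no such check, because
any text that decodes in the system character set can be
encoded as UTF-8.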

Supporting Unicode on screen would be better of course,
but the simple solution above would at least help
people who develop on multiple platforms and have to
work on source code edited by AnyEdit on Windows and by
other editors on other platforms. UTF-8 is the
'esperanto' of the Unicode file encodings. Hint: the
conversions suggested above for loading and saving are
*very* easily done using the Windows API functions
MultiByteToWideChar() and WideCharToMultiByte(). True,
they won't work with UTF-8 on all Windows versions;
Windows 2000 or XP might be required to benefit from
this. But even with this "limitation" (how many
developers still actually do their development work
today on anything less than Windows 2000/XP?), that
would be a nice and decent first step.

Now, if I have missed that feature in the current
Beta, please tell me, and please make it more evident in
the next releases if at all possible: I could not find
it.

Again, AnyEdit seems to have big potential. We don't
have the resources here to help immediately with
coding, but might do so in the future. But
multi-character-set support (including UTF-8 encoding
on I/O) is a must in the year 2005+.

Thanks,

Discussion