Re: UTF8 encoding

On Tue, 2005-07-05 at 12:37 -0700, Ethan Merritt wrote:

> As I understand it, a single character glyph may have more than one legal
> representation in UTF-8.  So ä ö ü each have a one-byte representation
> (\344 \366 \374 if the encoding is iso8859-1 or iso8859-15) and also have
> a multibyte unicode representation (C3A4 C3B6 C3BC).

I think you are confused. In UTF-8 only characters 0-127 are encoded in
one byte. UTF-8 and iso8859-* produce the same bytes only for the basic
ASCII part.

That is, any character with an accent or in a non-Latin script is always
multibyte in UTF-8.
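
To make the difference concrete, here is a minimal Python sketch that
encodes the same characters both ways. The helper name `encodings` is
made up for this illustration; the codec names are the standard Python
ones.

```python
def encodings(ch):
    """Return (ISO 8859-1 bytes, UTF-8 bytes) for a single character."""
    return ch.encode("iso-8859-1"), ch.encode("utf-8")


if __name__ == "__main__":
    for ch in ["a", "ä", "ö", "ü"]:
        latin1, utf8 = encodings(ch)
        # Show the bytes in hex so they can be compared with E4/C3A4 etc.
        print(f"{ch!r}: iso8859-1={latin1.hex().upper()} "
              f"utf-8={utf8.hex().upper()}")

    # The lone Latin-1 byte for 'ä' (0xE4, octal 344) is not valid UTF-8.
    try:
        b"\xe4".decode("utf-8")
    except UnicodeDecodeError:
        print("b'\\xe4' on its own cannot be decoded as UTF-8")
```

Only the plain "a" comes out as the same single byte in both encodings;
the accented characters are one byte in iso8859-1 and two bytes in UTF-8.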

See http://en.wikipedia.org/en/utf-8/

Interestingly gnuplot fails to detect my locale. I have
LANG=en_GB.UTF-8, but gnuplot falls back to C. I will investigate
further when I have the time.
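
One thing worth ruling out first is another locale variable overriding
LANG. The Python sketch below applies the usual POSIX precedence (LC_ALL,
then LC_CTYPE, then LANG) to the environment. The helper names are made up
for this illustration, and it says nothing about what gnuplot does
internally; a C program also has to call setlocale(LC_ALL, "") before it
picks up any of these settings.

```python
import os


def effective_ctype_locale(env):
    """Return the locale name POSIX precedence selects for LC_CTYPE.

    The first non-empty value among LC_ALL, LC_CTYPE and LANG wins;
    if none is set, fall back to "C".
    """
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var, "")
        if value:
            return value
    return "C"


def codeset(locale_name):
    """Extract the codeset part, e.g. "en_GB.UTF-8@euro" -> "UTF-8"."""
    _, dot, rest = locale_name.partition(".")
    # Drop an optional "@modifier" suffix; no dot means no codeset given.
    return rest.partition("@")[0] if dot else ""


if __name__ == "__main__":
    name = effective_ctype_locale(os.environ)
    print("effective LC_CTYPE locale:", name)
    print("codeset:", codeset(name) or "(none given)")
```

If this reports something other than the LANG value, LC_ALL or LC_CTYPE
is set and takes priority.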

Rob

-- 
Robert Hart <en...@no...>
University of Nottingham

