Re: [Passwordsafe-devel] Enhancement Topics (Format and Usage Bugs)

On Tue, 08 May 2007 02:53:27 -0400, Wolfgang Keller <91...@gm...> wrote:
>
> The definition is still too weak or misleading. Without being an expert
> in this, I know that UTF-8 may come with something they call "BOM" as a
> prefix token. Moreover, UTF-8 encoding is a bit-format, not a text
> format and I would not expect that null bytes are to be excluded from
> being a valid part of it. So I suggest the following text as definition
> for "Text":
>

Hi Wolfgang and all,

BOMs ("Byte Order Marks") may be used with UTF-16 or UTF-32 to indicate
that a code unit sequence is serialized in either big or little endian
order.  UTF-8 is byte oriented, and has no need for BOMs.
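
As a rough illustration of the byte-order point, here is a small Python
sketch (the helper name encode_forms is made up for this example and has
nothing to do with the Password Safe sources). It serializes the same
character in three encoding schemes: the two UTF-16 variants differ in
byte order, which is what a BOM disambiguates, while the UTF-8 bytes have
only one possible order.

```python
import codecs


def encode_forms(text):
    """Serialize `text` in three encoding schemes (hypothetical helper)."""
    return {
        "utf-16-le": text.encode("utf-16-le"),  # little endian, no BOM added
        "utf-16-be": text.encode("utf-16-be"),  # big endian, no BOM added
        "utf-8": text.encode("utf-8"),          # byte oriented, one ordering
    }


forms = encode_forms("A")

# The same 16-bit code unit is laid out in two different byte orders.
print(forms["utf-16-le"], forms["utf-16-be"], forms["utf-8"])

# The BOM is U+FEFF serialized in the corresponding byte order.
print(codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)
```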

Let's just delegate the responsibility to the appropriate party, with the
appropriate references.

    Text fields are stored using the UTF-8 encoding scheme (see definition
    D-39 of the Unicode Standard 4.0 at
    http://www.unicode.org/versions/Unicode4.0.0/ch03.pdf).  Note that a
    Unicode string (D-29a) does not contain any terminating NUL character
    that might exist in a C language implementation; consequently, no
    NUL character is stored or counted as part of the field length.  For
    example, the ASCII string "Hello World" is stored as a single block,
    with the field length set to 11.

We could strike the second sentence, because it is implied by the first,
but given that I was recently bitten by an incompatibility involving the
null character, I'd like to see it included.
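
To make the length rule in the proposed wording concrete, here is a
minimal Python sketch. The function name encode_text_field is
hypothetical, and the sketch only computes the length and the payload
bytes; it does not reproduce the actual on-disk field layout. The length
is the number of UTF-8 bytes, with no terminating NUL appended or counted.

```python
def encode_text_field(text):
    """Return (length, data) for a text field (hypothetical helper).

    The data is the UTF-8 encoding of `text`; no terminating NUL is
    appended, so the length counts only the encoded bytes.
    """
    data = text.encode("utf-8")
    return len(data), data


length, data = encode_text_field("Hello World")
print(length, data)
```

Note that for non-ASCII text the length is a byte count, not a character
count, so it can be larger than the number of characters in the string.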

Frank

-- 
Frank Pilhofer, fp...@fp...
