objecticon / Tickets / #18 wcs

Anonymous - 2014-03-31

Originally posted by: grshiplett

For a defense of wide-char types such as UTF-16, see various sources from India; output in UTF-8 is often seen as a separate issue from internal processing as UTF-16, where storage as UTF-16 is not an issue.
There is of course resistance: one example might be the excellent Notepad++ for Windows, which remains ANSI/UTF-8 and endian-oriented only.

Anonymous - 2014-04-02

Originally posted by: r.parl...@gmail.com

Hello Robert

Thanks for the suggestion. I think utf-16 would be a poor choice for an internal format, since it can't represent the entire unicode range (at least not without giving up random access, which is the only thing going for it). Also, the nice thing about utf-8 is that it provides a bridge between unicode and conventional string types - string(u), for ucs u, is a more-or-less instant operation.
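The point about random access can be illustrated with a small sketch. It is written in Python rather than Object Icon, and the helper name `utf16_units` is made up for this example. It counts how many 16-bit code units a single code point needs in UTF-16; any code point above U+FFFF needs a surrogate pair, so the n-th code unit is no longer guaranteed to be the n-th character.

```python
# Illustration only: utf16_units is a hypothetical helper, not part of any library.
def utf16_units(ch: str) -> int:
    """Return the number of 16-bit code units needed to encode one code point."""
    # "utf-16-le" emits no byte order mark, so the byte count is exactly 2 per unit.
    return len(ch.encode("utf-16-le")) // 2


# Latin letter, Devanagari letter KA, and an emoji outside the Basic Multilingual Plane.
samples = ["A", "\u0915", "\U0001F600"]

for ch in samples:
    print(
        f"U+{ord(ch):04X}: "
        f"{utf16_units(ch)} UTF-16 unit(s), "
        f"{len(ch.encode('utf-8'))} UTF-8 byte(s)"
    )
```

The first two samples fit in one code unit each, while the third needs two, which is the case that breaks constant-time indexing by code unit.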

Could one not simply write a ucs string to utf-16 conversion procedure? There are some conversion procs in the file lib/main/text.icn which may show the correct outline to follow - they have to be done quite carefully in order to be efficient for long strings.

Kind regards
R
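As a rough outline of what such a conversion procedure involves, here is a minimal sketch in Python. It is not the code in lib/main/text.icn, which is not reproduced in this thread, and the names `to_utf16_units` and `to_utf16_le` are hypothetical. It assumes the input is a sequence of Unicode code points (a Python `str`) and collects all code units in a list before packing them once, to avoid repeated concatenation on long strings.

```python
import struct
from typing import List


def to_utf16_units(text: str) -> List[int]:
    """Convert a string of code points to a list of UTF-16 code units."""
    units: List[int] = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x10000:
            # Basic Multilingual Plane: one code unit, value unchanged.
            # (Lone surrogates in the input are passed through unchecked.)
            units.append(cp)
        else:
            # Supplementary planes: split the 20-bit offset into a surrogate pair.
            offset = cp - 0x10000
            units.append(0xD800 | (offset >> 10))    # high (lead) surrogate
            units.append(0xDC00 | (offset & 0x3FF))  # low (trail) surrogate
    return units


def to_utf16_le(text: str) -> bytes:
    """Pack the code units as little-endian bytes in a single call."""
    units = to_utf16_units(text)
    return struct.pack(f"<{len(units)}H", *units)


if __name__ == "__main__":
    sample = "A\u0915\U0001F600"
    print([hex(u) for u in to_utf16_units(sample)])
    # Cross-check against Python's built-in encoder.
    print(to_utf16_le(sample) == sample.encode("utf-16-le"))
```

In practice Python's built-in `str.encode("utf-16-le")` does the same job; the manual version only shows the surrogate-pair arithmetic that a hand-written procedure would need.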

Anonymous - 2014-04-03

Originally posted by: grshiplett

Hi,

thanks for the suggestion ... I'll take a look.

I use a few languages that do use UTF-16, but I don't know what the future
will bring. One of them accepts a very wide variety of source encodings
(some programmers in India were not happy to use UTF-32 for their local
languages, and I recall their enthusiasm for UTF-8).
