Re: [GD-General] Unicode
From: Thatcher U. <tu...@tu...> - 2003-11-19 19:42:21
On Nov 19, 2003 at 07:55 +0100, Nicolas Romantzoff wrote:
>
> - MBCS: "Multi Byte Character Sets", using a variable number of
>   characters depending on the first one.
> That's exactly the kind of things that drives me nuts: inventing a
> stupid thing for badly engineered older things to continue working. But
> hey, that's life. There, you cannot tell the size of a character,
> however, the system is providing you with functions for that.
> Basically you are ALWAYS pointing to the first byte of the character
> (otherwise everything is broken). Given that byte, you can tell the size
> of the character (mbclen or something like that), incrementing the
> pointer will then give you the next character. Last char is 0.
> Note that it is IMPOSSIBLE to go backward unless you know the string
> first character address.

I believe this is wrong, w/r/t UTF-8. This is one of its design
features. You can safely start in the middle of a string, as well as go
backwards. Though counting characters is not as simple as with wchar_t
or single-byte encodings.

-- 
Thatcher Ulrich
http://tulrich.com
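
To make the UTF-8 point concrete, here is a minimal sketch in C. It relies on one property of the encoding: every continuation byte has the bit pattern 10xxxxxx, and no lead byte does. The function names (utf8_is_continuation, utf8_prev, utf8_sync, utf8_count) are made up for illustration and are not from any library, and the code assumes the input is valid UTF-8. The start pointer is only a guard against running off the front of the buffer; the resynchronization itself needs nothing but the bytes at hand.

```c
#include <assert.h>
#include <stddef.h>

/* A UTF-8 continuation byte always looks like 10xxxxxx. */
int utf8_is_continuation(unsigned char b)
{
    return (b & 0xC0) == 0x80;
}

/* Step back from p to the first byte of the previous character.
   Assumes valid UTF-8 and start < p. */
const char *utf8_prev(const char *start, const char *p)
{
    assert(p > start);
    do {
        --p;
    } while (p > start && utf8_is_continuation((unsigned char)*p));
    return p;
}

/* Given a pointer that may land in the middle of a character,
   back up to that character's first byte. */
const char *utf8_sync(const char *start, const char *p)
{
    while (p > start && utf8_is_continuation((unsigned char)*p))
        --p;
    return p;
}

/* Counting characters means walking the whole string and counting
   the bytes that are not continuation bytes. */
size_t utf8_count(const char *s)
{
    size_t n = 0;
    for (; *s; ++s)
        if (!utf8_is_continuation((unsigned char)*s))
            ++n;
    return n;
}
```

Note that utf8_count has to touch every byte, which is the sense in which counting is less simple than with wchar_t or a single-byte encoding.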