gamedevlists-general Mailing List for gamedev (Page 36)
Brought to you by:
vexxed72
From: Brett B. <res...@ga...> - 2003-11-27 03:07:31
|
I forgot to include a useful link to anybody interested in looking at this problem if it tickles your fancy. Steve Summit, who maintains the C Programming FAQ, has a page with a few links on discussions on this over the years. http://www.eskimo.com/~scs/C-faq/varargs/wackyideas.html I also forgot to state the obvious: I posted this to the game development forum because this is the place to find solutions to problems, no matter how ugly, rather than debate pedantic issues. That is different than seeking solutions on software engineering forums such as sweng where portability, standards, etc. are the focus. So if anybody has a working solution that has been used, or even some ideas, I would love to hear them. Surely somebody has done this? :-) Brett |
From: Brett B. <res...@ga...> - 2003-11-27 02:21:47
|
Hiya,

I'm trying to bind our scripting engine to our game engine and am having trouble with variadic functions. Namely, I can't figure out how to call a C-implemented variadic function from our scripting engine when the parameters are not known at the time the engine is compiled. Ideally I would implement this in C itself, but it seems I need some assembly code to push the parameters correctly. Maybe I shouldn't worry about the whole implementation, but I would like to understand how this works. For purposes of discussion, I'll assume a GCC-compatible implementation.

Basically, variadic functions are implemented in two steps:

1. The caller (calling the variadic function) simply pushes the parameters and calls the variadic function.
2. The callee (the variadic function) uses the following four basic macros to access the variable arguments (GCC compatibility style):

void va_start(va_list ap, last);
type va_arg(va_list ap, type);
void va_end(va_list ap);
void va_copy(va_list dest, va_list src);

The compiler strips the "..." off the back and makes a normal function out of the variadic one, and the preceding four macros are then used to access the unknown parameters. va_start takes the last known parameter, so at first glance it seems the function could work out how many parameters there are by comparing the position of the last known argument with the stack pointer. The problem is that the parameters were pushed in reverse order, so how would it know how many there are? If they were pushed in direct order, you could take the stack pointer minus the stack position of the last known parameter (which is provided), and the difference would give the number of arguments. But since they are reversed, how does va_arg know when to stop?

I wrote a test sample for the PlayStation 2, GameCube and PC (Intel), and it seems that the caller does nothing more than push the values onto the stack as a normal function would, making sure not to use registers.

Has anybody else run into this before? How did you solve it?

Thanks,
Brett |
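On the question of how va_arg knows when to stop: in standard C it doesn't. The callee has no portable way to count the variadic arguments, so the caller has to communicate the count through a named parameter, a format string, or a sentinel value. A minimal sketch of the explicit-count convention follows; sum_ints is a hypothetical name used only for illustration, and this does not address the harder part of the question (building the call at run time).

```c
#include <stdarg.h>

/* Sums `count` int arguments. The named `count` parameter is the only
   way this function learns how many variadic arguments were passed. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;
    int i;

    va_start(ap, count);            /* ap now refers to the first unnamed argument */
    for (i = 0; i < count; ++i)
        total += va_arg(ap, int);   /* the caller promised `count` ints */
    va_end(ap);

    return total;
}
```

Calling sum_ints(3, 1, 2, 3) reads exactly three arguments and returns 6. Passing a count larger than the number of arguments actually supplied is undefined behaviour, which is why printf-style functions depend entirely on the format string being right.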
From: <cas...@ya...> - 2003-11-20 23:23:42
|
Hi,

there seems to be some confusion around Unicode that I will try to clear up.

Unicode is a character set standard that provides several encoding formats. There are two kinds of encodings: some represent the full Unicode character set and others don't. The "lossless" formats are utf8, utf16 and utf32.

utf8 and utf16 are multibyte character sets, which means that a character can be represented by multiple bytes. For example, in utf8 a character may take 1, 2, 3 or 4 bytes (the original definition allowed sequences of up to 6 bytes). The nice thing about utf8 is that it does not contain embedded zeros, so you can still use strlen, strcpy, strdup, etc. However, in this case strlen does not return the length of the string in characters but its size in bytes.

utf16 usually takes one 16-bit word per character; however, some characters need two words. A character encoded this way is called a surrogate pair, and it is only needed for some unusual characters, mostly from old scripts that are no longer in use. Windows NT and Java only support a subset of Unicode called UCS2, that is, utf16 without the surrogates. Windows XP, on the other hand, is supposed to support surrogates.

Finally, the last encoding is utf32 (or ucs4), which uses 32 bits per character and represents the full Unicode character set.

Which representation you choose mainly depends on your application. Web applications usually use utf8, because you can reuse the existing code and most of the net is written using ASCII characters, so utf8 turns out to be the most efficient. I currently use ucs2 internally in my applications, and that's probably what most games will need.

This is an oversimplification, so check this out for more info: http://www.unicode.org/faq/

Hope that helps,

-- Ignacio Castaño cas...@ya... |
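The strlen point can be made concrete. In UTF-8 every byte of a multi-byte character after the first has the bit pattern 10xxxxxx, so characters can be counted by skipping those bytes. A minimal sketch, assuming the input is valid UTF-8; utf8_count is a hypothetical helper name, not part of any library:

```c
#include <stddef.h>

/* Counts code points in a NUL-terminated UTF-8 string by skipping
   continuation bytes (bit pattern 10xxxxxx). Assumes valid UTF-8. */
size_t utf8_count(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;                    /* lead byte or plain ASCII byte */
    }
    return n;
}
```

For the string "h\xC3\xA9llo" ("héllo" with the é encoded as two bytes), strlen reports 6 bytes while utf8_count reports 5 characters.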
From: Thatcher U. <tu...@tu...> - 2003-11-20 06:06:36
|
My opinion on the auto* tools: very unintuitive, very hacky, very awkward to develop with, very very very (almost magically) portable amongst Unix-like systems. So it's fantastic for distributing source tarballs. But that stuff isn't really compatible with MS's Windows tools, let alone consoles.

POSH seems much more game-developer-friendly, in that you just drop it in to virtually any build environment. Plus it's Public Domain, so no license sweating.

-Thatcher

On Nov 20, 2003 at 02:35 +1100, Aaron Drew wrote:
> I haven't really had much experience with autoconf/automake/configure it but
> it seems like this sort of stuff would fit nicely into that system rather
> than compete with it. Maybe someone more familiar with this system could
> provide some more insight?
>
> On Thu, 20 Nov 2003 05:04 am, Crosbie Fitch wrote:
> > > From: Garett Bass
> > > I'm not sure I understand what you mean by "Unsized" here. If you're
> > > defining char8 to be uint8, then its size is 8 bits.
> >
> > 'Unsized' as in "The code accommodates any size, but requires storage for
> > something of a particular type"
> >
> > int is unsized in the sense its size is not known implicitly.
> >
> > In my example char_utf8 is unsized (even though defined in terms of the
> > sized types char8 and uint8).
> >
> > -------------------------------------------------------
> > This SF.net email is sponsored by: SF.net Giveback Program.
> > Does SourceForge.net help you be more productive? Does it
> > help you create better code? SHARE THE LOVE, and help us help
> > YOU! Click Here: http://sourceforge.net/donate/
> > _______________________________________________
> > Gamedevlists-general mailing list
> > Gam...@li...
> > https://lists.sourceforge.net/lists/listinfo/gamedevlists-general
> > Archives:
> > http://sourceforge.net/mailarchive/forum.php?forum_id=557
>
> --
> - Aaron
>
> "Today's mighty oak is just yesterday's nut that held its ground."

-- Thatcher Ulrich http://tulrich.com |
From: Brian H. <ho...@py...> - 2003-11-20 05:31:26
|
On Thu, 20 Nov 2003 14:35:51 +1100, Aaron Drew wrote: > I haven't really had much experience with > autoconf/automake/configure it but it seems like this sort of stuff > would fit nicely into that system rather than compete with it. > Maybe someone more familiar with this system could provide some > more insight? The problem is that those require an external tool and configuration process. POSH is specifically designed to be drop-in, compile and go with minimal configuration. Brian |
From: J C L. <cl...@ka...> - 2003-11-20 04:24:55
|
On Wed, 19 Nov 2003 11:13:16 -0600 Garett Bass <gt...@st...> wrote:
> I'm not sure I understand what you mean by "Unsized" here. If you're
> defining char8 to be uint8, then its size is 8 bits.

Ahem. Bytes are not required to be 8 bits, they merely usually are.

> Not being too familiar with unicode, I find this confusing. I thought
> that "Unicode" was a multibyte format with no set number of bytes per
> character, ie. a single asian character may be represented by four
> bytes while the subsequent character is represented by two.

Correct, but there is a maximum width, and in order to be able to handle indexed aggregates like arrays, characters in non-string form are assumed to be of constant width. It's in string form that they are packed.

-- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. cl...@ka... He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. |
From: J C L. <cl...@ka...> - 2003-11-20 04:21:58
|
On Wed, 19 Nov 2003 11:58:18 -0600 Garett Bass <gt...@st...> wrote: > Which leaves me wondering, how do you figure out where one character > ends and the next begins? Recursive descent. -- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. cl...@ka... He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. |
From: J C L. <cl...@ka...> - 2003-11-20 04:19:10
|
On Wed, 19 Nov 2003 11:49:22 -0500 Brian Hook <ho...@py...> wrote:
> On Wed, 19 Nov 2003 09:05:54 -0500, J C Lawrence wrote:
>> You are familiar with glibC's inttypes.h and stdint.h? They would
>> seem to cover this space moderately well.
> Yes, unfortunately they are not universally available.

True. However, they address and solve most of the basic space in a well-defined way whose implications are thoroughly understood. If your license is LGPL-friendly, it would seem to make sense to take that as a starting point.

-- J C Lawrence ---------(*) Satan, oscillate my metallic sonatas. cl...@ka... He lived as a devil, eh? http://www.kanga.nu/~claw/ Evil is a name of a foeman, as I live. |
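For readers who have not used them, stdint.h and inttypes.h are C99 headers, so they exist wherever the compiler supports C99 and not only with glibc. A minimal sketch of what they provide, which also ties in the earlier point that a byte is CHAR_BIT bits wide rather than necessarily 8. BITS_OF and struct packet_header are hypothetical names used only for illustration:

```c
#include <limits.h>   /* CHAR_BIT: number of bits in a byte on this platform */
#include <stdint.h>   /* uint8_t, uint16_t, uint32_t (C99) */

/* Width of a type in bits, computed rather than assumed. */
#define BITS_OF(type) (sizeof(type) * CHAR_BIT)

/* Hypothetical record built from exact-width fields. The field widths are
   fixed by the types; padding between fields is still up to the compiler. */
struct packet_header {
    uint8_t  version;
    uint16_t length;
    uint32_t checksum;
};
```

On a platform that provides these exact-width types, BITS_OF(uint32_t) is 32 regardless of how wide int or long happen to be. A platform whose bytes are wider than 8 bits simply does not define uint8_t, which turns the assumption into a compile error instead of a silent bug.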
From: Aaron D. <ri...@in...> - 2003-11-20 03:37:41
|
I haven't really had much experience with autoconf/automake/configure, but it seems like this sort of stuff would fit nicely into that system rather than compete with it. Maybe someone more familiar with this system could provide some more insight?

On Thu, 20 Nov 2003 05:04 am, Crosbie Fitch wrote:
> > From: Garett Bass
> > I'm not sure I understand what you mean by "Unsized" here. If you're
> > defining char8 to be uint8, then its size is 8 bits.
>
> 'Unsized' as in "The code accommodates any size, but requires storage for
> something of a particular type"
>
> int is unsized in the sense its size is not known implicitly.
>
> In my example char_utf8 is unsized (even though defined in terms of the
> sized types char8 and uint8).

-- - Aaron "Today's mighty oak is just yesterday's nut that held its ground." |
From: Jani K. <ka...@ga...> - 2003-11-19 19:46:55
|
wchar_t is platform dependent, but on the other hand you probably don't need Unicode string literals at all. You can use ASCII-7 for the source strings and read the (Unicode) translations of specific strings into different languages from a file. That way you can store the strings internally in whatever UTF format you want and still use only ASCII-7 in source code, something like:

String translatedUnicodeString = translator->translate( "File {0} not found", filename );

For a simple UTF-8/16/32/ASCII-7 converter, have a look at the http://catmother.sourceforge.net source package and its source file lang/UTFConverter.cpp. It's very simple and doesn't handle special cases, such as malformed UTF data, the way the Unicode standard prescribes, but it should serve as a simple encoder/decoder.

For a more complete Unicode implementation, ICU is the weapon of choice: a very complete and very high-quality library, but also very heavyweight and probably overkill for a typical game. (Just my opinion, of course.)

Regards, Jani |
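The lookup approach described above (ASCII keys in source, translated text stored as data) can be sketched in a few lines of C. Everything here is an assumption made for illustration: the table would normally be loaded from a file rather than compiled in, and struct translation, k_french and translate are hypothetical names unrelated to the catmother code. Placeholder substitution such as {0} is left out.

```c
#include <stddef.h>
#include <string.h>

/* One entry: 7-bit ASCII source string -> UTF-8 encoded translation. */
struct translation {
    const char *key;
    const char *utf8;
};

/* Hypothetical in-memory table; a real game would load this from a file. */
static const struct translation k_french[] = {
    { "Quit",     "Quitter" },
    { "Settings", "Param\xC3\xA8tres" },   /* "Paramètres" in UTF-8 */
};

/* Returns the translation for `key`, or `key` itself when none exists,
   so untranslated strings still show up in the source language. */
const char *translate(const char *key)
{
    size_t i;

    for (i = 0; i < sizeof k_french / sizeof k_french[0]; ++i) {
        if (strcmp(k_french[i].key, key) == 0)
            return k_french[i].utf8;
    }
    return key;
}
```

Falling back to the key keeps the game usable while translations are incomplete, and the source code never contains anything but ASCII.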
From: Thatcher U. <tu...@tu...> - 2003-11-19 19:44:51
|
On Nov 19, 2003 at 11:01 -0800, Paul Reynolds wrote: > > UTF-8 [...] > > UTF-16 [...] > > UTF-32 [...] > > All three encoding forms need at most 4 bytes (or 32-bits) of data for each > character." Apparently UTF-8 uses up to 6 bytes for certain characters. -- Thatcher Ulrich http://tulrich.com |
From: Thatcher U. <tu...@tu...> - 2003-11-19 19:42:21
|
On Nov 19, 2003 at 07:55 +0100, Nicolas Romantzoff wrote:
> - MBCS: "Multi Byte Character Sets", using a variable number of
> characters depending on the first one.
> That's exactly the kind of things that drives me nuts: inventing a
> stupid thing for badly engineered older things to continue working. But
> hey, that's life. There, you cannot tell the size of a character,
> however, the system is providing you with functions for that.
> Basically you are ALWAYS pointing to the first byte of the character
> (otherwise everything is broken). Given that byte, you can tell the size
> of the character (mbclen or something like that), incrementing the
> pointer will then give you the next character. Last char is 0.
> Note that it is IMPOSSIBLE to go backward unless you know the string
> first character address.

I believe this is wrong with respect to UTF-8. Being able to resynchronize is one of its design features: you can safely start in the middle of a string, as well as go backwards. Counting characters, though, is not as simple as with wchar_t or single-byte encodings.

-- Thatcher Ulrich http://tulrich.com |
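The going-backwards property follows from the byte layout: continuation bytes always look like 10xxxxxx, and no lead byte or ASCII byte does. Stepping back therefore only means skipping continuation bytes. A minimal sketch, assuming valid UTF-8; utf8_prev is a hypothetical helper name:

```c
#include <stddef.h>

/* Returns a pointer to the start of the code point that precedes `p`,
   never moving before `start`. Works even if `p` points into the middle
   of a multi-byte character. Assumes the buffer holds valid UTF-8. */
const char *utf8_prev(const char *start, const char *p)
{
    if (p <= start)
        return start;

    do {
        --p;                        /* step back one byte */
    } while (p > start && ((unsigned char)*p & 0xC0) == 0x80);

    return p;
}
```

With s pointing at the bytes 'a', 0xC3, 0xA9, 'z', calling utf8_prev(s, s + 3) lands on s + 1, the lead byte of the two-byte character. The start pointer is only used as a bound so the loop cannot run off the front of the buffer.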
From: Thatcher U. <tu...@tu...> - 2003-11-19 19:06:40
|
You might want to read these after (or instead of) the Joel article:

http://www.cl.cam.ac.uk/~mgk25/unicode.html (the first few sections are packed with info and apply to any OS)
http://www.utf-8.com

-T

On Nov 19, 2003 at 11:58 -0600, Garett Bass wrote:
> Paul,
>
> It was after reading Joel's article that I understood Unicode to use an
> indeterminate number of bytes per character. Specifically:
>
> "In UTF-8, every code point from 0-127 is stored in a single byte. Only code
> points 128 and above are stored using 2, 3, in fact, up to 6 bytes."
>
> Which leaves me wondering, how do you figure out where one character ends
> and the next begins?
>
> Thanks in advance,
> Garett

-- Thatcher Ulrich http://tulrich.com |
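For UTF-8 specifically, the question of where one character ends and the next begins has a mechanical answer: the high bits of the first byte say how many bytes the character occupies. A minimal sketch, assuming the 1 to 4 byte forms of current UTF-8; utf8_seq_len is a hypothetical helper name, and it does not check for overlong or otherwise malformed sequences:

```c
/* Returns the length in bytes of the UTF-8 sequence that begins with
   `lead`, or 0 if `lead` cannot start a sequence (a continuation byte,
   or one of the obsolete 5/6-byte lead bytes). */
int utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80)           return 1;   /* 0xxxxxxx: ASCII       */
    if ((lead & 0xE0) == 0xC0) return 2;   /* 110xxxxx              */
    if ((lead & 0xF0) == 0xE0) return 3;   /* 1110xxxx              */
    if ((lead & 0xF8) == 0xF0) return 4;   /* 11110xxx              */
    return 0;                              /* 10xxxxxx or 11111xxx  */
}
```

Walking a string is then a matter of reading a lead byte, advancing by the returned length, and treating a return of 0 as a decoding error.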
From: Paul R. <pa...@so...> - 2003-11-19 19:01:27
|
I can't claim to be an expert by any means. I've just started digging into it all myself. The actual implementation of Unicode support is extremely compiler dependent (http://oss.software.ibm.com/icu/docs/papers/unicode_wchar_t.html). GCC and VC++ both have a data type called wchar_t that you use for working with Unicode strings. A string literal is declared with a leading 'L':

wchar_t* str = L"This is my fancy string";

From what I understand so far, both compilers use a fixed size for all characters that is big enough to hold any code point (GCC is 32-bit, and VC++ is 16-bit), so pointer arithmetic and sizeof(wchar_t) are still reliable.

There's lots more info about the Unicode standard at http://www.unicode.org/standard/principles.html:

"Character encoding standards define not only the identity of each character and its numeric value, or code point, but also how this value is represented in bits.

The Unicode Standard defines three encoding forms that allow the same data to be transmitted in a byte, word or double word oriented format (i.e. in 8, 16 or 32-bits per code unit). All three encoding forms encode the same common character repertoire and can be efficiently transformed into one another without loss of data. The Unicode Consortium fully endorses the use of any of these encoding forms as a conformant way of implementing the Unicode Standard.

UTF-8 is popular for HTML and similar protocols. UTF-8 is a way of transforming all Unicode characters into a variable length encoding of bytes. It has the advantages that the Unicode characters corresponding to the familiar ASCII set have the same byte values as ASCII, and that Unicode characters transformed into UTF-8 can be used with much existing software without extensive software rewrites.

UTF-16 is popular in many environments that need to balance efficient access to characters with economical use of storage. It is reasonably compact and all the heavily used characters fit into a single 16-bit code unit, while all other characters are accessible via pairs of 16-bit code units.

UTF-32 is popular where memory space is no concern, but fixed width, single code unit access to characters is desired. Each Unicode character is encoded in a single 32-bit code unit when using UTF-32.

All three encoding forms need at most 4 bytes (or 32-bits) of data for each character."

-----Original Message-----
From: gam...@li... [mailto:gam...@li...]On Behalf Of Garett Bass
Sent: Wednesday, November 19, 2003 9:58 AM
To: gam...@li...
Subject: RE: [GD-General] Unicode

Paul,

It was after reading Joel's article that I understood Unicode to use an indeterminate number of bytes per character. Specifically:

"In UTF-8, every code point from 0-127 is stored in a single byte. Only code points 128 and above are stored using 2, 3, in fact, up to 6 bytes."

Which leaves me wondering, how do you figure out where one character ends and the next begins?

Thanks in advance,
Garett

-----Original Message-----
From: gam...@li... [mailto:gam...@li...]On Behalf Of Paul Reynolds
Sent: Wednesday, November 19, 2003 11:31 AM
To: gam...@li...
Subject: RE: [GD-General] Feedback wanted on POSH

This is a pretty good overview of text encoding*: http://www.joelonsoftware.com/articles/Unicode.html

I'd say everyone working on a shipping game should really evaluate if raw char* strings are really a good idea. If you've ever had to localize a 7-bit ascii game, you'll know what I'm talking about. Other software industries have been embracing unicode for quite some time.

* - For the record, I'm not a Joel Spolsky fanboy. I can usually take him or leave him. ;o)

-----Original Message-----
From: gam...@li... [mailto:gam...@li...]On Behalf Of Garett Bass
Sent: Wednesday, November 19, 2003 9:13 AM
To: gam...@li...
Subject: RE: [GD-General] Feedback wanted on POSH

// Crosbie Fitch wrote:
// Hmmn maybe the chars should be like this:

You will notice that POSH doesn't provide a char typedef, presumably because sizeof(char) == 1 in ANSI C, as mentioned in another post. I imagine that defining your own integer character type will require an explicit cast anytime you want to use a string manipulation function, which seems a little awkward. Of course, if you use C++ and STL, then you can always create a std::basic_string<char_utf8>, or whatever.

// typedef char8 char_ascii; // Unsized char able to contain 7bit ASCII
// typedef char8 char_utf8; // Unsized char able to contain...
// typedef char16 char_ucs2; // Unsized char able to contain...

I'm not sure I understand what you mean by "Unsized" here. If you're defining char8 to be uint8, then its size is 8 bits.

// typedef char_utf8 char_unicode; // Unsized char suitable for Unicode
// typedef char_unicode character; // Unsized char suitable for any text

Not being too familiar with unicode, I find this confusing. I thought that "Unicode" was a multibyte format with no set number of bytes per character, ie. a single asian character may be represented by four bytes while the subsequent character is represented by two.

Regards,
Garett |
From: Nicolas R. <nic...@fr...> - 2003-11-19 18:58:47
|
Hmmmm,

Looks like you misunderstood something... There are three ways of storing strings:

- SBCS: "Single Byte Character Sets", using an 8-bit character encoding only. That's the easiest one... Note that many kinds of SBCS are available, and they are only compatible with each other in the 0-127 range.
- DBCS: "Double Byte Character Sets", using a 16-bit character encoding. Unicode stored as UCS-2 is one of those...
- MBCS: "Multi Byte Character Sets", using a variable number of bytes per character, depending on the first byte. That's exactly the kind of thing that drives me nuts: inventing a stupid thing so that badly engineered older things can continue working. But hey, that's life.

With MBCS you cannot tell the size of a character in advance; however, the system provides functions for that. Basically you are ALWAYS pointing to the first byte of a character (otherwise everything is broken). Given that byte, you can tell the size of the character (mblen in standard C, _mbclen on Windows); advancing the pointer by that size then gives you the next character. The last char is 0. Note that it is IMPOSSIBLE to go backward unless you know the address of the string's first character. Note also that this is the way Windows is doing the UI/File-system.

So basically:

    #include <stdlib.h>  // mblen, MB_CUR_MAX

    // length (number of characters) of a string:
    unsigned int mb_strlen(const char* mbstr)
    {
        unsigned int ret = 0;
        int t;
        while (*mbstr) {
            t = mblen(mbstr, MB_CUR_MAX);
            if (t <= 0) break;  // invalid sequence: stop counting
            ++ret;
            mbstr += t;
        }
        return ret;
    }

    // size (in bytes) of a string (not including ending null char):
    unsigned int mb_strsize(const char* mbstr)
    {
        unsigned int ret = 0;
        int t;
        while (*mbstr) {
            t = mblen(mbstr, MB_CUR_MAX);
            if (t <= 0) break;  // invalid sequence: stop counting
            ret += (unsigned int)t;
            mbstr += t;
        }
        return ret;
    }

> -----Original Message-----
> From: gam...@li...
> [mailto:gam...@li...] On Behalf Of Garett Bass
> Sent: Wednesday, November 19, 2003 6:58 PM
> To: gam...@li...
> Subject: RE: [GD-General] Unicode
>
> Paul,
>
> It was after reading Joel's article that I understood
> Unicode to use an indeterminate number of bytes per
> character. Specifically:
>
> "In UTF-8, every code point from 0-127 is stored in a single
> byte. Only code points 128 and above are stored using 2, 3,
> in fact, up to 6 bytes."
>
> Which leaves me wondering, how do you figure out where one
> character ends and the next begins?
>
> Thanks in advance,
> Garett |
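The point about not being able to go backward holds for legacy multibyte code pages. UTF-8 is a little friendlier: every continuation byte has the bit pattern 10xxxxxx, so the start of the previous character can be found by skipping back over such bytes. The following is only a sketch of that idea; the name utf8_prev is made up for illustration, and it assumes the buffer holds well-formed UTF-8 and that the start of the buffer is known so the scan can be bounded.

```c
#include <stddef.h>

/* Step back from p to the first byte of the previous UTF-8 character.
   Hypothetical helper: assumes well-formed UTF-8 and p > start. */
const unsigned char *utf8_prev(const unsigned char *start,
                               const unsigned char *p)
{
    do {
        --p;
    } while (p > start && (*p & 0xC0) == 0x80);  /* skip 10xxxxxx bytes */
    return p;
}
```

The start pointer is only a safety bound here; in well-formed text the loop stops at a lead byte on its own.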
From: Crosbie F. <cr...@cy...> - 2003-11-19 18:00:37
|
> From: Garett Bass
> I'm not sure I understand what you mean by "Unsized" here. If you're
> defining char8 to be uint8, then its size is 8 bits.

'Unsized' as in "The code accommodates any size, but requires storage for something of a particular type".

int is unsized in the sense that its size is not known implicitly.

In my example char_utf8 is unsized (even though defined in terms of the sized types char8 and uint8). |
From: Garett B. <gt...@st...> - 2003-11-19 17:57:37
|
Paul,

It was after reading Joel's article that I understood Unicode to use an indeterminate number of bytes per character. Specifically:

"In UTF-8, every code point from 0-127 is stored in a single byte. Only code points 128 and above are stored using 2, 3, in fact, up to 6 bytes."

Which leaves me wondering, how do you figure out where one character ends and the next begins?

Thanks in advance,
Garett

-----Original Message-----
From: gam...@li... [mailto:gam...@li...]On Behalf Of Paul Reynolds
Sent: Wednesday, November 19, 2003 11:31 AM
To: gam...@li...
Subject: RE: [GD-General] Feedback wanted on POSH

This is a pretty good overview of text encoding*: http://www.joelonsoftware.com/articles/Unicode.html

I'd say everyone working on a shipping game should really evaluate if raw char* strings are really a good idea. If you've ever had to localize a 7-bit ascii game, you'll know what I'm talking about. Other software industries have been embracing unicode for quite some time.

* - For the record, I'm not a Joel Spolsky fanboy. I can usually take him or leave him. ;o) |
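On the question of where one character ends and the next begins: in UTF-8 the first byte of each sequence encodes the sequence length in its high bits, and every following byte starts with the bits 10. Below is a minimal sketch of a character counter built on that rule. The function names are invented for this example, the check is deliberately shallow (it does not reject overlong forms or surrogates), and it stops at 4-byte sequences because RFC 3629 limits UTF-8 to 4 bytes, whereas the "up to 6 bytes" in the quoted article reflects the older definition.

```c
#include <stddef.h>

/* Length in bytes of the UTF-8 sequence introduced by lead byte b,
   or 0 if b cannot start a sequence (continuation or invalid byte). */
static size_t utf8_seq_len(unsigned char b)
{
    if (b < 0x80)           return 1;  /* 0xxxxxxx: ASCII */
    if ((b & 0xE0) == 0xC0) return 2;  /* 110xxxxx */
    if ((b & 0xF0) == 0xE0) return 3;  /* 1110xxxx */
    if ((b & 0xF8) == 0xF0) return 4;  /* 11110xxx */
    return 0;
}

/* Count the characters (code points) in a NUL-terminated UTF-8 string.
   Returns (size_t)-1 on a malformed or truncated sequence. */
size_t utf8_count(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    size_t count = 0;
    while (*p) {
        size_t n = utf8_seq_len(*p), i;
        if (n == 0)
            return (size_t)-1;
        for (i = 1; i < n; ++i)          /* each trailing byte must be 10xxxxxx */
            if ((p[i] & 0xC0) != 0x80)
                return (size_t)-1;
        p += n;
        ++count;
    }
    return count;
}
```

Note the cast to unsigned char before any bit tests: with a signed plain char, bytes above 127 would otherwise compare as negative values.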
From: Mat N. (BUNGIE) <mat...@mi...> - 2003-11-19 17:41:14
|
They are when used as internal data keys.

MSN

-----Original Message-----
From: gam...@li... [mailto:gam...@li...] On Behalf Of Paul Reynolds
Sent: Wednesday, November 19, 2003 9:31 AM
To: gam...@li...
Subject: RE: [GD-General] Feedback wanted on POSH

This is a pretty good overview of text encoding*: http://www.joelonsoftware.com/articles/Unicode.html

I'd say everyone working on a shipping game should really evaluate if raw char* strings are really a good idea. If you've ever had to localize a 7-bit ascii game, you'll know what I'm talking about. Other software industries have been embracing unicode for quite some time.

* - For the record, I'm not a Joel Spolsky fanboy. I can usually take him or leave him. ;o) |
From: Paul R. <pa...@so...> - 2003-11-19 17:30:39
|
This is a pretty good overview of text encoding*: http://www.joelonsoftware.com/articles/Unicode.html

I'd say everyone working on a shipping game should really evaluate if raw char* strings are really a good idea. If you've ever had to localize a 7-bit ascii game, you'll know what I'm talking about. Other software industries have been embracing unicode for quite some time.

* - For the record, I'm not a Joel Spolsky fanboy. I can usually take him or leave him. ;o)

-----Original Message-----
From: gam...@li... [mailto:gam...@li...]On Behalf Of Garett Bass
Sent: Wednesday, November 19, 2003 9:13 AM
To: gam...@li...
Subject: RE: [GD-General] Feedback wanted on POSH

// Crosbie Fitch wrote:
// Hmmn maybe the chars should be like this:

You will notice that POSH doesn't provide a char typedef, presumably because sizeof(char) == 1 in ANSI C, as mentioned in another post. I imagine that defining your own integer character type will require an explicit cast anytime you want to use a string manipulation function, which seems a little awkward. Of course, if you use C++ and STL, then you can always create a std::basic_string<char_utf8>, or whatever.

// typedef char8 char_ascii; // Unsized char able to contain 7bit ASCII
// typedef char8 char_utf8; // Unsized char able to contain...
// typedef char16 char_ucs2; // Unsized char able to contain...

I'm not sure I understand what you mean by "Unsized" here. If you're defining char8 to be uint8, then its size is 8 bits.

// typedef char_utf8 char_unicode; // Unsized char suitable for Unicode
// typedef char_unicode character; // Unsized char suitable for any text

Not being too familiar with unicode, I find this confusing. I thought that "Unicode" was a multibyte format with no set number of bytes per character, ie. a single asian character may be represented by four bytes while the subsequent character is represented by two.

Regards,
Garett |
From: Paul R. <pa...@so...> - 2003-11-19 17:17:23
|
Actually, the signedness of a "char" can be changed with a flag:

GCC: -funsigned-char
VC: /J

So you can't always assume that when you see "char foo;" that foo is a signed value.

-----Original Message-----
From: gam...@li... [mailto:gam...@li...]On Behalf Of Aaron Hilton
Sent: Wednesday, November 19, 2003 2:16 AM
To: gam...@li...
Subject: Re: [GD-General] Feedback wanted on POSH

If I remember correctly, I believe the PS2's gcc version at the time (I don't recall the exact version) compiled a char as unsigned 8bit, while visual studio was forced to interpret a char as unsigned 8bit. Perhaps the other way around, signed 8bit, but you get the idea. It was one of those mystery bugs that appeared for one day while configuring compile options for xbox vs.net. So explicitly typecasting for goofy types, like char and int, is a good idea.

Also, beware that:

int - is fastest performing integer type for the platform. So, Opteron, G5/PPC970, and Itanium may likely nudge int from 32bits up to 64bits.
char - is the most appropriate character containing data type. So unicode char is 16bit unsigned, old-dos-days was 8bit signed, *nix was 8bit unsigned (?).
float - fastest IEEE floating point arithmetic mode. So most everything was 32bits.. now we're migrating to 64bit, so this could change as well.
void* - most appropriate pointer type. So dos's void* was 16bit for tiny and small real mode, 32bit for medium and large mode, then protected mode/flat mode rolled around and 32bit (31bit) became norm, now we've been warned not to use the 32nd bit for house keeping because that'll get busted when migrating to 64bit.

As far as I know, the only truly standard types are:

short - 16bit signed integer
long - 32bit signed integer
single - 32bit float
double - 64bit float (but can be overridden to 80bit float on x86)

Unfortunately, we're dealing with a really messy world. Hopefully type definitions have pretty well settled down. When the platform capabilities present themselves, I hope that the C/C++ type definitions will stick. New variations of type definitions could be used instead with the introduction of a compiler flag for friendly names like uint8, uint16, uint32, uint64, int8, int16, int32, int64, float16, float32, float64, float80, float128, vector2int16, vector3int16, vector4int32, vector2float16, etc.. you get the idea. However, that still leaves us with cross platform issues, which can only be solved by type-defining our own stuff.

Eventually we'll have to gracefully move C/C++ into the new world of SIMD capabilities. I just hope there is also a plan to employ a much better set of type definitions to go with it.

- Aaron.

On 19-Nov-03, at 12:38 AM, ma...@ch... wrote:

> On Tue, 18 Nov 2003, Garett Bass wrote:
>
>> Also, I find it misleading to call an unsigned char a "byte", since
>> all of
>> the char types are technically a byte
>
> Is it? Isn't the only requirement for a char, that it can contain a
> "char", that is a single character from the platforms preferred
> character
> set?
>
> Mads
>

--
Aaron Hilton
Software Developer
Adaptive Optics Research
University of Victoria |
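Since the signedness of plain char depends on the compiler and its flags, code can check it through CHAR_MIN from <limits.h> rather than assume it, and can sidestep the question by converting through unsigned char whenever a byte value is needed. A small sketch follows; the two function names are made up for illustration.

```c
#include <limits.h>

/* Nonzero if plain char is signed under the current compiler settings
   (for example, without -funsigned-char on GCC or /J on VC). */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Byte value of c in the range 0..UCHAR_MAX, whatever the signedness
   of plain char happens to be. */
int byte_value(char c)
{
    return (unsigned char)c;
}
```

Writing signed char or unsigned char explicitly in declarations avoids depending on the flag at all.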
From: Garett B. <gt...@st...> - 2003-11-19 17:12:35
|
// Crosbie Fitch wrote:
// Hmmn maybe the chars should be like this:

You will notice that POSH doesn't provide a char typedef, presumably because sizeof(char) == 1 in ANSI C, as mentioned in another post. I imagine that defining your own integer character type will require an explicit cast anytime you want to use a string manipulation function, which seems a little awkward. Of course, if you use C++ and STL, then you can always create a std::basic_string<char_utf8>, or whatever.

// typedef char8 char_ascii; // Unsized char able to contain 7bit ASCII
// typedef char8 char_utf8; // Unsized char able to contain...
// typedef char16 char_ucs2; // Unsized char able to contain...

I'm not sure I understand what you mean by "Unsized" here. If you're defining char8 to be uint8, then its size is 8 bits.

// typedef char_utf8 char_unicode; // Unsized char suitable for Unicode
// typedef char_unicode character; // Unsized char suitable for any text

Not being too familiar with Unicode, I find this confusing. I thought that "Unicode" was a multibyte format with no set number of bytes per character, i.e. a single Asian character may be represented by four bytes while the subsequent character is represented by two.

Regards,
Garett |
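The awkwardness with casts can be seen in a few lines. This sketch assumes char_utf8 ends up as a typedef for unsigned char, which is a guess at what the proposal implies rather than anything POSH defines; the wrapper name is also invented.

```c
#include <stddef.h>
#include <string.h>

typedef unsigned char char_utf8;  /* assumed definition, for illustration only */

/* strlen takes const char *, so a char_utf8 string needs an explicit cast.
   The result is a byte count, not a character count. */
size_t utf8_byte_length(const char_utf8 *s)
{
    return strlen((const char *)s);
}
```

Wrapping the cast in one helper at least keeps it out of the calling code.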
From: Gareth L. <GL...@cl...> - 2003-11-19 16:59:23
|
On a similar note: http://icl.pku.edu.cn/bswen/cpp/c++boost/libs/integer/

-----Original Message-----
From: Brian Hook [mailto:ho...@py...]
Sent: 19 November 2003 16:49
To: gam...@li...
Subject: Re: [GD-General] Feedback wanted on POSH

On Wed, 19 Nov 2003 09:05:54 -0500, J C Lawrence wrote:
> You are familiar with glibC's inttypes.h and stdint.h? They would
> seem to cover this space moderately well.

Yes, unfortunately they are not universally available. If you're using strictly GCC then that would work, but then again, you wouldn't need POSH either -- a lot of POSH's implicit goal is to make stuff portable cleanly to older versions of CodeWarrior and MSVC++.

Brian |
From: Brian H. <ho...@py...> - 2003-11-19 16:49:21
|
On Wed, 19 Nov 2003 09:05:54 -0500, J C Lawrence wrote:
> You are familiar with glibC's inttypes.h and stdint.h? They would
> seem to cover this space moderately well.

Yes, unfortunately they are not universally available. If you're using strictly GCC then that would work, but then again, you wouldn't need POSH either -- a lot of POSH's implicit goal is to make stuff portable cleanly to older versions of CodeWarrior and MSVC++.

Brian |
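For compilers without stdint.h, one common approach is to fall back to hand-written typedefs and have the compiler verify the sizes. The sketch below is not how POSH does it; the my_ names are placeholders, and the fallback branch simply assumes a 32-bit int, which the negative-array-size trick then checks at compile time.

```c
#include <limits.h>  /* CHAR_BIT */

#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#include <stdint.h>  /* C99 and later: use the standard sized types */
typedef int32_t  my_i32;
typedef uint32_t my_u32;
#else
/* Assumption: int is 32 bits on this compiler; verified below. */
typedef int          my_i32;
typedef unsigned int my_u32;
#endif

/* Compile-time checks: the array size becomes -1, which is an error,
   if a type does not have exactly 32 bits. */
typedef char my_check_i32_size[(sizeof(my_i32) * CHAR_BIT == 32) ? 1 : -1];
typedef char my_check_u32_size[(sizeof(my_u32) * CHAR_BIT == 32) ? 1 : -1];
```

A wrong assumption then fails the build on the offending platform instead of surfacing as a runtime bug.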
From: <ma...@ch...> - 2003-11-19 14:33:19
|
On Wed, 19 Nov 2003 ma...@ch... wrote:

> On Tue, 18 Nov 2003, Garett Bass wrote:
>
> > Also, I find it misleading to call an unsigned char a "byte", since all of
> > the char types are technically a byte
>
> Is it? Isn't the only requirement for a char, that it can contain a
> "char", that is a single character from the platforms preferred character
> set?

Sorry, that was wrong. I confused byte with octet.

Mads

--
Mads Bondo Dydensborg. ma...@ch...
If you aim the gun at your foot and pull the trigger, it's UNIX's job to ensure reliable delivery of the bullet to where you aimed the gun (in this case, Mr. Foot).
- Terry Lambert, FreeBSD-Hackers mailing list. |
From: J C L. <cl...@ka...> - 2003-11-19 14:05:56
|
On Tue, 18 Nov 2003 13:07:29 -0500 Brian Hook <ho...@py...> wrote:

> I've been working on a portable open source "harness", a header file
> that (ideally) handles a lot of configuration crap for most people.
> You know what I mean, the standard stuff like defining the proper DLL
> export signature, sized types, etc. I'm still dismayed every time I
> see some open source library doing:
>
> #if WHATEVER
> typedef unsigned int my_u32_t;
> typedef long long my_i64_t;
> #endif

You are familiar with glibc's inttypes.h and stdint.h? They would seem to cover this space moderately well.

--
J C Lawrence
---------(*)                Satan, oscillate my metallic sonatas.
cl...@ka...               He lived as a devil, eh?
http://www.kanga.nu/~claw/  Evil is a name of a foeman, as I live. |