Thread: RE: [GD-General] Is it possible to detect debugging?
Brought to you by:
vexxed72
From: Mat N. \(BUNGIE\) <mat...@mi...> - 2003-10-09 21:35:32
|
No. WinDBG and kd can attach in noninvasive mode and generate a full minidump, even if the process has attached to itself as a debugger.

MSN

-----Original Message-----
From: gam...@li... [mailto:gam...@li...] On Behalf Of Colin Fahey
Sent: Thursday, October 09, 2003 9:59 AM
To: gam...@li...
Subject: [GD-General] Is it possible to detect debugging?

Can a Windows application determine if it is being debugged? Naturally it can't do anything if it has been suspended at a breakpoint, but I'm wondering if there is a way, during execution, to check for debugging. Sorry, it has been a while since I was into Windows process information stuff, so I forget all the cool things a process can do.

Anyhow, let's say the application is started. Meanwhile, something like Visual C++ has already been started, or is now started. Some time during the execution of the application, the user of Visual C++ uses the "Attach to Process..." option, and selects the application. Is there any way for the application to detect that it is being debugged, assuming it has not been immediately paused?

The method cannot depend on any particular debugger (like checking for certain states of, say, Visual Studio). But I assume that many debuggers work the same way -- although the 1337 haX0r might run the whole app in an 80x86 emulator...

I don't really have any particular objective, like popping up a dialog to embarrass the first haX0r who attempts to crack my application. I'm just making conversation. Relax! Talk amongst yourselves!

--- Colin
cp...@ea...

-------------------------------------------------------
This SF.net email is sponsored by: SF.net Giveback Program.
SourceForge.net hosts over 70,000 Open Source Projects.
See the people who have HELPED US provide better services:
Click here: http://sourceforge.net/supporters.php
_______________________________________________
Gamedevlists-general mailing list
Gam...@li...
https://lists.sourceforge.net/lists/listinfo/gamedevlists-general
Archives: http://sourceforge.net/mailarchive/forum.php?forum_id=557 |
From: Mat N. \(BUNGIE\) <mat...@mi...> - 2003-10-10 00:54:19
|
That won't stop me from running windbg in non-invasive mode and capturing a memory dump, then inspecting that. I've done it before. It's not that hard at all.

MSN

-----Original Message-----
From: gam...@li... [mailto:gam...@li...] On Behalf Of Aaron Drew
Sent: Thursday, October 09, 2003 5:09 PM
To: gam...@li...
Subject: Re: [GD-General] Is it possible to detect debugging?

I don't remember too much about it but I do recall reading a while back an ezine (phrack or similar) describing how to avoid debuggers (well, I think it was actually on how to stop crackers). They listed a few techniques such as:

- Modifying your code during execution to confuse any debugger and/or third party. Kind of like the way executable packers work.
- Small assembly language segments like "mov ax, 0x9090; jmp 0xfe" that do nothing important but then jump back into their data and execute it.
- Checking for an int 3 handler (the one that gets called at breakpoints) or replacing the int 3 handler with your own (with extreme care).

I'm sure there were more but I forget them.

- Aaron

On Fri, 10 Oct 2003 08:34 am, Colin Fahey wrote:
> [1] IsDebuggerPresent()
>
> [2] [...] WinDBG and kd can attach in noninvasive mode and generate a full
> minidump [...]
>
> As I suspected: Developers can check for obvious debugging (e.g., to handle
> errors differently), but circumventing the check requires a very modest
> effort (e.g., intercept the IsDebuggerPresent() call, or use noninvasive
> debuggers). Also, I suppose one could run the whole application within the
> context of an 80x86 emulator, so that it isn't running on the CPU as native
> instructions, in which case the state of the code is totally exposed and
> there is no way in the world the app could possibly detect that it was
> running in virtual reality! (Much as we can't tell that we're living in
> the Matrix -- without pills, I mean.)
>
> --- Colin

--
- Aaron

"Today's mighty oak is just yesterday's nut that held its ground." |
From: Colin F. <cp...@ea...> - 2003-10-10 02:25:48
|
I am convinced that a binary executable will never know how it is being executed.

>>> That won't stop me from running windbg in non-invasive mode and
>>> capturing a memory dump, then inspecting that.

I heard of "WinDbg" before (and, no, I don't mean that I heard about it from the previous post! Hee, hee!), but I've never used it. What do you mean by capturing and inspecting a memory dump? Are you simply referring to looking at a snapshot of all allocated memory buffers?

The only application hacking I ever did was on the Commodore 64, when I loaded an application and then simply dumped a memory image of RAM to a file on disk. Restoring the file to memory would bring the application back to life, in the exact state of its execution. I guess it is like bringing a laptop out of hibernation, or using VMware (or whatever the product is called) to restore various computer states. So, one crude application cracking method would be to store both the executable and stored memory buffers and somehow restore both into a process. I'm sure it's easy to protect against this kind of crack!

Is the benefit of checking memory that you can look at run-time data structures, like finding where the player's "gold pieces" or "health" value is stored? Or scanning for strings, 3D models, textures, network buffers, etc.?

As I mentioned earlier, I am just making conversation. If I were talking about hacking prevention, I would say that the components of a complete solution involve: (1) making every binary unique (custom-built per user per download from the Internet); (2) a subscription model, or at least the requirement of calling home on start-up; (3) making the application algorithm complicated, so that even a total understanding of actions at the assembly level doesn't give any hint about the high-level actions in the program. I think all three pieces are necessary to make the hacker's effort a nightmare that ends in depressing failure.

I read in a recently published book the claim that even elite hackers, despite their expertise in the low-level operations of computers, often cannot comprehend intermediate algorithms in computer science. But I think the only way that fact could be put to good use in anti-cracking efforts would be to make the entire application an intermediate-level computer science algorithm! Otherwise, the cracker just bypasses the crazy code and replaces it with a functional "equivalent" -- like just saying "yes" to the question of whether or not the product key is valid!

By the way: I hate the idea of turning all products into rentals! That just sucks. Unless it's a true *service*, with ongoing value added, a "leasing" model is just greedy exploitation of the market. I swear I don't mean to rant or bash any company in particular, but I am generally afraid of the exploitation that occurs when companies succeed in taking full control of distribution (cell phone carriers) or the target platform (consoles, or the upcoming CPU/OS combination that renders a PC into a console). I make these comments just to make sure it's clear that there's a difference between protecting a product and exploiting one's control over a channel.

--- Colin |
From: Aaron D. <ri...@in...> - 2003-10-10 03:46:26
|
On Fri, 10 Oct 2003 10:53 am, Mat Noguchi (BUNGIE) wrote:
> That won't stop me from running windbg in non-invasive mode and
> capturing a memory dump, then inspecting that.
>
> I've done it before. It's not that hard at all.

True. But that isn't exactly debugging... Nor could you restore a memory snapshot and expect the application state and graphics card state to also magically return to an operational condition.

Of course there is no way to stop a determined cracker/debugger, but at least these mechanisms will provide some sort of road block to make things harder. (Although some may just see it as a challenge...)

I've had a few games with CD-copy protection (e.g. The Sims) refuse to load because I was running SoftICE (a debugger that runs below the OS). I'm sure that I could have stepped through the 'debugger detection' routine and skipped it if I really wanted to, but the hassle was enough of a deterrent for me (I just rebooted without it...), and I guess a large majority of would-be crackers would just get stumped by your super-37173 crack detection and not be able to figure out which jump command to modify. ): |
From: Colin F. <cp...@ea...> - 2003-10-10 08:08:04
|
> [...] Nor could you restore a memory
> snapshot and expect the application state and graphics card state to also
> magically return to an operational condition.

It worked on the Commodore 64 for the most part because a 64 KB memory image really was (almost) the complete state of the computer!

On the PC, I guess it's one thing to take a snapshot of EVERYTHING (operating system, all applications) to be restored later, and a very different thing to take a snapshot of a single application. Even ignoring the problem of hardware devices having state (like graphics cards and sound cards), restoring the code and memory of an application would not actually claim any resources from the operating system (memory allocations, open file handles, sockets, semaphores, windows, various DirectX interfaces, properly loaded DLLs). I suppose one could attempt to record and play back requests for such resources after the memory image had been restored, but one would probably need to fix up random memory locations according to the ACTUAL acquired window handle and various other handles and values. Even if the original snapshot was taken when the application was JUST starting up, before initializing sound and graphics, it might be quite tricky.

...but, hey, that's how the nu-B haX0r bec0mes a PN3637!

> I've had a few games with CD-copy protection (e.g. The Sims) refuse to load
> because I was running SoftICE (a debugger that runs below the OS). I'm sure
> that I could have stepped through the 'debugger detection' routine and
> skipped it if I really wanted to, but the hassle was enough of a deterrent
> for me (I just rebooted without it...) and I guess a large majority of
> would-be crackers would just get stumped by your super-37173 crack
> detection and not be able to figure out which jump command to modify. ):

Maybe 1,000 totally diverse checks will tucker out even the determined cracker.

The saved game file for GTA3 seemed to have a simple CRC code in it -- and that was enough to make me give up on messing with it! I think it's definitely worthwhile to put in really simple checking, because it must greatly reduce the number of people with the patience and ability to continue hacking the application. Any effort beyond that depends on how long one wants to delay piracy or cheating.

--- Colin |
From: <phi...@pl...> - 2003-10-10 17:26:21
|
> I guess a large majority of would-be
> crackers would just get stumped by your super-37173 crack detection and not
> be able to figure out which jump command to modify. ):

...and for the rest of them, it would just be a more interesting challenge. Doing this sort of thing on the Spectrum (cracking Speedlock, hand-tracing the garbage code that actually set up the R register ready for an unmodifiable decryption loop) is what got me into machine code.

Anyway, nuff nostalgia. Some required reading:

http://www.gamasutra.com/features/20011017/dodd_01.htm

Cheers,
Phil |
From: Colin F. <cp...@ea...> - 2003-10-10 23:23:21
|
> [...] Anyway, nuff nostalgia, some recquired reading: > > http://www.gamasutra.com/features/20011017/dodd_01.htm > > Cheers, > Phil Great link. I remember reading that article when it was published (October, 2001). I like the links to the cracker sites; crazy! > Doing this sort of thing on the spectrum (cracking speedlock, > hand tracing the garbage code that actually set up the R register > ready for an unmodifiable decryption loop) is what got me into > machine code. Say... You wouldn't lead a double-life, by day an upstanding member of the programming community -- but by night, 1337 haX0r!? Ah, the old flame! Lower the lights, fire up the classified equipment, and, Mwoohahahaaa, let the cracking begin! Anyhow, I love the confidence of the editors of the http://www.paradogs.com cracker site when they describe "The competition": IV - The competition: Paradox loves competition, because at the ends, it always win. :) Many groups challenged Paradox through time, and once more there is a famous Paradox slogan to resume it all: 'Mess with the Best, Die like the Rest!' Consequence of a too strong competition, many groups got killed. Some of the famous wars between Paradox and teams like Angels, Classic, The Company, Prodigy, Hoodlum, Rom Kids ended with the same results, and you never found any Paradox blood on the floor. Some were a little wiser, like Crystal. They listen to the saying 'if you can't beat them, join them', but not every group were offered such a chance. We must also name here TRSI, our friends, whose ranks supplied some great people to Paradox. If Paradox has the best team around, TRSI has the best friends around. We must also talk about Fairlight, the group that gave Paradox the best competition it ever faced, and survived it. The groups are on different machines now, the challenge that can only be found on consoles got Paradudes in love with these machines. But you can expect one day that Paradox will jump back on P.C. 
to give to their respected competitors the challenge they need to take that scene to new heights. Today, Paradox is still facing some tough competition. But be sure of this: ten years from now, this history page will be longer, Paradox will still be ruling, and today competition will be gone. |
From: <phi...@pl...> - 2003-10-10 23:46:41
|
> > Doing this sort of thing on the spectrum (cracking speedlock,
> > hand tracing the garbage code that actually set up the R register
> > ready for an unmodifiable decryption loop) is what got me into
> > machine code.

> Say... You wouldn't lead a double-life, by day an upstanding
> member of the programming community -- but by night, 1337 haX0r!?
> Ah, the old flame! Lower the lights, fire up the classified
> equipment, and, Mwoohahahaaa, let the cracking begin!

Ha! No, I quit cracking about the same time my voice broke. Well, apart from a couple of things on the Mac in the early 90s, but that was for personal use, honest guv. Hmm, IIRC, one of them was Pathways into Darkness, the second Bungie game. So it's possible I've attacked the work of someone on this list. (Who may or may not be happy to know I wasn't completely successful ;)

Ahh, I wonder if you can get MacNosy for OS X? http://www.jasik.com/nosy.html Hmm, maybe not...

Cheers,
Phil |
From: Brian H. <ho...@py...> - 2003-11-18 18:07:29
|
I've been working on a portable open source "harness", a header file that (ideally) handles a lot of configuration crap for most people. You know what I mean, the standard stuff like defining the proper DLL export signature, sized types, etc. I'm still dismayed every time I see some open source library doing:

#if WHATEVER
typedef unsigned int my_u32_t;
typedef long long my_i64_t;
#endif

POSHLIB isn't very well tested or reviewed. It consists of a header file and an optional source file: http://www.poshlib.org

I'd appreciate it if anyone is bored and has some free time, if they could look at it and let me know their opinions or if they spot any obvious bugs.

Thanks,

Brian |
From: <cas...@ya...> - 2003-11-19 02:13:50
|
Hi Brian,

I'm using your library with gcc, mingw, cygwin and msvc compilers on win32 and linux on x86 only. The only thing I missed was the compiler definitions for mingw:

#if defined __MINGW32__ || defined FORCE_DOXYGEN
# define POSH_OS_MINGW
# define POSH_OS_WIN32
# if !defined FORCE_DOXYGEN
#   define POSH_OS_STRING "MinGW"
# endif
#endif

I should have reported that a long time ago, sorry for that. As you see I define both WIN32 and MINGW, because mingw uses the win32 api, but also fixes some small issues like the missing snprintf (_snprintf), so I think it's a good idea to define both.

Hope that helps,

-- Ignacio Castaño cas...@ya... |
From: Brian H. <ho...@py...> - 2003-11-19 02:37:48
|
> I'm using your library with gcc, mingw, cygwin and msvc compilers
> on win32 and linux on x86 only.

Cool!

> #if defined __MINGW32__ || defined FORCE_DOXYGEN
> # define POSH_OS_MINGW
> # define POSH_OS_WIN32
> # if !defined FORCE_DOXYGEN
> #   define POSH_OS_STRING "MinGW"
> # endif
> #endif

I've added this to posh.h and I'll be uploading a new version tonight. Oddly enough I had __MINGW32__ documented, I just never actually added the conditionals for it.

Thanks!

Brian |
From: <cas...@ya...> - 2003-11-19 02:25:10
|
Oh, and another thing. As you say in the documentation, the FORCE_DOXYGEN hack is a bit ugly. I think that it would have been cleaner to check it only once and define all the definitions at the same time, i.e.:

#if FORCE_DOXYGEN
#define POSH_OS_WIN32
#define POSH_OS_LINUX
#define ...
...
#endif

In this way, you could easily add other definitions for other code processors.

-- Ignacio Castaño cas...@ya... |
From: Brian H. <ho...@py...> - 2003-11-19 02:59:45
|
> As you say in the documentation the FORCE_DOXYGEN hack is a bit
> ugly.

Doxygen, while very useful, has some, uh, issues...

> I think that it would have been cleaner to check it only
> once and define all the definitions at the same time, ie:
>
> #if FORCE_DOXYGEN
> #define POSH_OS_WIN32
> #define POSH_OS_LINUX
> #define ...
> ...
> #endif

My only complaint with that is that I have to duplicate the define in two different locations, so I worry that it gets updated in one spot but not the other. Currently there is only a single definition.

Brian |
From: Garett B. <gt...@st...> - 2003-11-19 02:41:07
|
Brian,

After looking at posh.h I'm a bit confused. You have defined a whole list of basic types (posh_..._t), signed and unsigned integers of 8, 16, and 32 bits respectively. I notice, however, the only platform you support that requires these typedefs is PalmOS, and then only for the 32-bit integers. This seems like overkill, especially considering the duplication of integer/signed types. This leaves me wondering whether some platforms define char as unsigned char while others define char as signed char. Why not simply define posh types based on the number of bits and leave the signed/unsigned up to the user? For example:

typedef char my_byte;
void Foo(unsigned my_byte bar) {/*...*/};

Also, I find it misleading to call an unsigned char a "byte", since all of the char types are technically a byte and your posh "byte" has no intrinsic indication of being unsigned. Also, having the same signed type appear under two different names may leave the user wondering whether there is a need to explicitly convert a variable of type i16 to s16, which serves no purpose.

Grandmaster B, I seek enlightenment on these matters, would you mind explaining your reasoning?

Regards,
Garett Bass
gt...@st... |
From: Brian H. <ho...@py...> - 2003-11-19 03:34:41
|
> After looking at posh.h I'm a bit confused. You have defined a
> whole list of basic types (posh_..._t), signed and unsigned
> integers of 8, 16, and 32 bits respectively. I notice, however,
> the only platform you support that requires these typedefs is
> PalmOS, and then only for the 32-bit integers.

Well, that and the different 64-bit definitions. But you're right, on today's machines, ILP32 is basically the de facto standard. But this is partly misleading, since I simply haven't added support for any 16-bit processors, something I'd like to see done at some point if possible. There are also a huge range of embedded systems that I'd like to see supported, such as the Renesas/Hitachi H8 series; real-mode DOS support; etc. I just haven't got around to it. I know, I'm a freak, but hey, I figure I'll be thorough. Now, I do draw the line at FAR/NEAR stuff, but if something is a true 16-bit platform (not the 20-bit segmented stuff of DOS), then adding support for it SHOULD be pretty straightforward.

> This seems like overkill, especially considering the duplication of
> integer/signed types.

Fair enough, but it's localized and doesn't need to be used. And, more importantly, almost every portable open source code base I've seen does almost the exact same thing. Take a look at libpng, zlib, etc. and they'll often have similar types of defines but with different names. My expectation is that an application or a library that uses posh will actually just shadow the posh types, e.g.

typedef posh_byte_t mylib_byte_t;

> This leaves me wondering whether some
> platforms define char as unsigned char while others define char as
> signed char.

This is implementation dependent, and also the source of lots of fun portability bugs:

char ch = 0xFF;
if ( ch == 0xFF )
    return;
exit( 1 ); // WHY IS THIS REACHED SOMETIMES?

> Also, I find it misleading to call an unsigned char a "byte", since

While I agree a "byte" means "size", in most implementations I've seen bytes are generally considered to be 8-bit unsigned quantities. Because of this, I just tried to stick to this convention.

> variable of type i16 to s16, which serves no purpose.

Agreed. I mostly put both sXX and iXX because different people have different preferences on this. I didn't want to dictate, but this may be a case where conciseness should overrule generality. Basically I couldn't choose which I liked better, so I did both =)

Maybe I should just put in a comment that explicitly states that they are aliases of each other?

> Grandmaster B, I seek enlightenment on these matters, would you
> mind explaining your reasoning?

So you wanna go there? =)

Brian |
From: <ma...@ch...> - 2003-11-19 08:38:45
|
On Tue, 18 Nov 2003, Garett Bass wrote:
> Also, I find it misleading to call an unsigned char a "byte", since all of
> the char types are technically a byte

Is it? Isn't the only requirement for a char that it can contain a "char", that is, a single character from the platform's preferred character set?

Mads

--
Mads Bondo Dydensborg. ma...@ch...
Microsoft prints their own money. Head of MS, Bill Gates, has promised to give away $550000 worth of software to the Peru state, after Peru had debated making Open Source software mandatory in the Peru state administration. Nice photo: http://www.microsoft.com/presspass/images/misc/07-15perueducation_l.jpg - 20020717 |
From: Aaron H. <vid...@sh...> - 2003-11-19 10:16:19
|
If I remember correctly, I believe the PS2's gcc version at the time (I don't recall the exact version) compiled a char as unsigned 8bit, while visual studio was forced to interpret a char as unsigned 8bit. Perhaps the other way around, signed 8bit, but you get the idea. It was one of those mystery bugs that appeared for one day while configuring compile options for xbox vs.net. So explicitly typecasting for goofy types, like char and int, is a good idea. Also, beware that: int - is fastest performing integer type for the platform. So, Opteron, G5/PPC970, and Itanium may likely nudge int from 32bits up to 64bits. char - is the most appropriate character containing data type. So unicode char is 16bit unsigned, old-dos-days was 8bit signed, *nix was 8bit unsigned (?). float - fastest IEEE floating point arithmetic mode. So most everything was 32bits.. now we're migrating to 64bit, so this could change as well. void* - most appropriate pointer type. So dos's void* was 16bit for tiny and small real mode, 32bit for medium and large mode, then protected mode/flat mode rolled around and 32bit (31bit) became norm, now we've been warned not to use the 32nd bit for house keeping because that'll get busted when migrating to 64bit. As far as I know, the only truly standard types are: short - 16bit signed integer long - 32bit signed integer single - 32bit float double - 64bit float (but can be overridden to 80bit float on x86) Unfortunately, we're dealing with a really messy world. Hopefully type definitions have pretty well settled down. When the platform capabilities present themselves, I hope that the C/C++ type definitions will stick. New variations of type definitions could be used instead with the introduction of a compiler flag for friendly names like uint8, uint16, uint32, uint64, int8, int16, int32, int64, float16, float32, float64, float80, float128, vector2int16, vector3int16, vector4int32, vector2float16, etc.. you get the idea. 
However, that still leaves us with cross platform issues, which can only be solved by type-defining our own stuff. Eventually we'll have to gracefully move C/C++ into the new world of SIMD capabilities. I just hope there is also a plan to employ a much better set of type definitions to go with it. - Aaron. On 19-Nov-03, at 12:38 AM, ma...@ch... wrote: > On Tue, 18 Nov 2003, Garett Bass wrote: > >> Also, I find it misleading to call an unsigned char a "byte", since >> all of >> the char types are technically a byte > > Is it? Isn't the only requirement for a char, that it can contain a > "char", that is a single character from the platforms preferred > character > set? > > Mads > -- Aaron Hilton Software Developer Adaptive Optics Research University of Victoria |
From: Brian H. <ho...@py...> - 2003-11-19 11:43:23
|
> int - is fastest performing integer type for the platform. So,
> Opteron, G5/PPC970, and Itanium may likely nudge int from 32bits up
> to 64bits.

This isn't really a platform thing as much as it is a compiler thing. Many, if not most, C compilers on DEC Alpha made "int" 32-bits simply for compatibility reasons.

> float - fastest IEEE floating point arithmetic mode.
> So most everything was 32bits.. now we're migrating to 64bit, so
> this could change as well.

Very few CPUs implement 64-bit floating point as fast as 32-bit. One thing that POSH doesn't try to do is abstract between IEEE and non-IEEE floating point formats. That just gets too complicated.

> void* - most appropriate pointer type.
> So dos's void* was 16bit for tiny and small real mode, 32bit for
> medium and large mode, then protected mode/flat mode rolled around
> and 32bit (31bit) became norm, now we've been warned not to use the
> 32nd bit for house keeping because that'll get busted when
> migrating to 64bit.

This is one reason I'm specifically not supporting segmented architectures =) And technically DOS had 32-bit sized pointers, but they were actually only 20-bit addresses, much like the early Amiga had 32-bit pointers but only 24-bit addresses (the good old days of shoving extra data in the high 8 bits...)

> As far as I know, the only truly standard types are:
> short - 16bit signed integer
> long - 32bit signed integer

Nope. The only guarantee is:

sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long)

For example, on the Cray T8, ALL types are 64-bits except for char, I believe.

> Eventually we'll have to gracefully move C/C++ into the new world
> of SIMD capabilities. I just hope there is also a plan to employ a
> much better set of type definitions to go with it.

Until the SIMD world can figure out how to refactor itself cleanly, I probably won't touch this, but it's something at the back of my mind. Right now "portable SIMD code" seems a bit like an oxymoron.

Brian |
From: Paul R. <pa...@so...> - 2003-11-19 17:17:23
|
Actually, the signedness of a "char" can be changed with a flag:

GCC: -funsigned-char
VC: /J

So you can't always assume that when you see "char foo;" that foo is a signed value.

-----Original Message-----
From: gam...@li... [mailto:gam...@li...] On Behalf Of Aaron Hilton
Sent: Wednesday, November 19, 2003 2:16 AM
To: gam...@li...
Subject: Re: [GD-General] Feedback wanted on POSH

[Full quote of Aaron Hilton's earlier message trimmed.] |
From: Crosbie F. <cr...@cy...> - 2003-11-19 11:21:44
|
There are two requirements for types, I reckon:

1) Size-determined requirements with type
2) Policy-determined requirements with type

So there are things like "I'm developing a data structure with a need for 8 bits of unsigned integer" AND "I need the most efficient data type for expressing a boolean state" or "Best for representing a textual character" or "Compile-time choice of floating point number, trading off size vs precision vs performance".

So there are the sized types:

typedef signed __int8    int8;       // signed 8 bit integer
typedef signed __int16   int16;      // signed 16 bit integer
typedef signed __int32   int32;      // signed 32 bit integer
typedef signed __int64   int64;      // signed 64 bit integer
typedef unsigned __int8  uint8;      // unsigned 8 bit integer
typedef unsigned __int16 uint16;     // unsigned 16 bit integer
typedef unsigned __int32 uint32;     // unsigned 32 bit integer
typedef unsigned __int64 uint64;     // unsigned 64 bit integer
typedef float            float32;    // 32 bit floating point
typedef double           float64;    // 64 bit floating point
typedef uint8            char_utf8;  // 8 bit Unicode char UTF-8
typedef uint16           char_utf16; // 16 bit Unicode UTF-16 etc.
typedef uint8            char_ascii; // NB not char, in order to maintain consistent signing
typedef uint16           char_ucs2;  // 16 bit Unicode UCS-2 etc.
typedef uint8            void8;      // Does this make sense? Don't care what type, but I need 8 bits

And perhaps other sized types. Note I have used names that should be as obvious as possible to uninformed readers. Hackers can always typedef uint64 as u8 or something (unsigned int where sizeof == 8).

The policy-determined types would be things like this:

typedef int       integer;   // If you need an unsized integer use this; also needs range info
typedef double    real;      // If you need an unsized floating point number use this, etc.
typedef char_utf8 character; // unsized textual character, etc.
typedef bool      boolean;
|
From: Crosbie F. <cr...@cy...> - 2003-11-19 11:38:04
|
Hmmn, maybe the chars should be like this:

typedef uint8  char8;  // 8 bit char (no format info)
typedef uint16 char16; // 16 bit char (no format info)

etc. And the Unicode stuff happens in the policy section, e.g.:

typedef char8        char_ascii;   // Unsized char able to contain 7bit ASCII
typedef char8        char_utf8;    // Unsized char able to contain characters in the UTF-8 format
typedef char16       char_ucs2;    // Unsized char able to contain characters in the UCS-2 format
typedef char_utf8    char_unicode; // Unsized char suitable for Unicode strings
typedef char_unicode character;    // Unsized char suitable for any text purpose
|
From: Garett B. <gt...@st...> - 2003-11-19 17:12:35
|
// Crosbie Fitch wrote:
// Hmmn maybe the chars should be like this:

You will notice that POSH doesn't provide a char typedef, presumably because sizeof(char) == 1 in ANSI C, as mentioned in another post. I imagine that defining your own integer character type will require an explicit cast any time you want to use a string manipulation function, which seems a little awkward. Of course, if you use C++ and the STL, then you can always create a std::basic_string<char_utf8>, or whatever.

// typedef char8 char_ascii; // Unsized char able to contain 7bit ASCII
// typedef char8 char_utf8; // Unsized char able to contain...
// typedef char16 char_ucs2; // Unsized char able to contain...

I'm not sure I understand what you mean by "Unsized" here. If you're defining char8 to be uint8, then its size is 8 bits.

// typedef char_utf8 char_unicode; // Unsized char suitable for Unicode
// typedef char_unicode character; // Unsized char suitable for any text

Not being too familiar with Unicode, I find this confusing. I thought that "Unicode" was a multibyte format with no set number of bytes per character, i.e. a single Asian character may be represented by four bytes while the subsequent character is represented by two.

Regards,
Garett
|
From: Paul R. <pa...@so...> - 2003-11-19 17:30:39
|
This is a pretty good overview of text encoding*:
http://www.joelonsoftware.com/articles/Unicode.html

I'd say everyone working on a shipping game should really evaluate whether raw char* strings are a good idea. If you've ever had to localize a 7-bit ASCII game, you'll know what I'm talking about. Other software industries have been embracing Unicode for quite some time.

* - For the record, I'm not a Joel Spolsky fanboy. I can usually take him or leave him. ;o)
|
From: Garett B. <gt...@st...> - 2003-11-19 17:57:37
|
Paul,

It was after reading Joel's article that I understood Unicode to use an indeterminate number of bytes per character. Specifically:

"In UTF-8, every code point from 0-127 is stored in a single byte. Only code points 128 and above are stored using 2, 3, in fact, up to 6 bytes."

Which leaves me wondering: how do you figure out where one character ends and the next begins?

Thanks in advance,
Garett
|
From: Nicolas R. <nic...@fr...> - 2003-11-19 18:58:47
|
Hmmmm, looks like you misunderstood something... There are three ways of storing strings:

- SBCS: "Single Byte Character Sets", using only 8-bit character encoding. That's the easiest one... Note that many kinds of SBCS are available and they are only compatible in the 0-127 range.
- DBCS: "Double Byte Character Sets", using 16-bit character encoding. UCS-2 Unicode is one of those...
- MBCS: "Multi Byte Character Sets", using a variable number of bytes per character depending on the first one. That's exactly the kind of thing that drives me nuts: inventing a stupid thing so that badly engineered older things continue working. But hey, that's life.

With MBCS you cannot tell the size of a character in advance; however, the system provides functions for that. Basically you are ALWAYS pointing to the first byte of the character (otherwise everything is broken). Given that byte, you can tell the size of the character (mblen in the standard C library), and incrementing the pointer by that size gives you the next character. The last char is 0. Note that it is IMPOSSIBLE to go backward unless you know the string's first character address. Note also that this is the way Windows does the UI/file system. So basically:

#include <stdlib.h> /* mblen, MB_CUR_MAX */

/* length (number of characters) of a multibyte string: */
unsigned int _strlen(const char* mbstr)
{
    unsigned int ret = 0;
    int t;
    while (*mbstr) {
        t = mblen(mbstr, MB_CUR_MAX);
        if (t <= 0) break; /* invalid sequence: stop */
        ++ret;
        mbstr += t;
    }
    return ret;
}

/* size (in bytes) of a multibyte string (not including the ending null char): */
unsigned int _strsize(const char* mbstr)
{
    unsigned int ret = 0;
    int t;
    while (*mbstr) {
        t = mblen(mbstr, MB_CUR_MAX);
        if (t <= 0) break; /* invalid sequence: stop */
        ret += (unsigned int)t;
        mbstr += t;
    }
    return ret;
}
|