Thread: [GD-General] (Really) XML for portable preferences library
From: Jeff <je...@sp...> - 2003-12-16 21:57:33
|
Me: >> I'm surprised that no-one has said "use XML format please" - its >> not that painful, given that you'll have a fairly limited schema >> to implement. Brian: > So here's a question -- I know that whenever a discussion about text > file formats comes up there is often a "use XML" grenade tossed into > it. And didn't it raise a firestorm this time, with an inordinate number of people who didn't seem to be able to understand that a file format for a PREFERENCES system need not have the same requirements as those for a 3D LEVEL loader... Brian: > Now, I can think of a lot of reasons NOT to use XML (i.e. it's > overkill, for one), but I'm curious what real, tangible reasons for > using XML exist, as opposed to a simple application specific structure > (like INI files). Me: (answering Brian's question) A little has been said about the viewability of preference files in apps like IE, and this is certainly an advantage. XML also has standard(?) rules about encoding that let you deal with how you'll store <> characters - your .INI needs similar rules for escape characters (or quotes, or whatever...) What I haven't seen in any of the discussion is anyone pointing out that XML handles STRUCTURE very nicely. At this point, your preferences library lets the user write out integers and strings - however, its fairly straight-forward to write out OBJECTS (or structs if you prefer). Inhouse we use expat and have a simple shim wrapped around it that knows how to write our favorite data structures, the most complex of which is a list which contain lists (ad infinitum) which can contain ints, reals, strings, points, lines, etc. XML handles the nesting very nicely - if we used .INI files we'd need to reinvent XML (or use some hideous key-naming scheme that appended digits every time we went down a level) Whilst you mightn't advocate points and lines for most peoples preferences, think "saved window positions". 
You might do:

<WINDOW name="toolpalette">
  <POS X="123" Y="456" />
  <SIZE H="100" W="200" />
</WINDOW>

As a bonus, if you use a more specific tag than POS, say WINDOWPOS, then your XML loader (as opposed to your application) can even be smart enough to clip its values based on the current screen size. It means that you've encoded more intelligence into the basic data types that you are storing, this is true, but then you don't need to use them if you don't want to.

My main objection to XML, or perhaps it's just expat's parser, is the processing of whitespace. Expat calls you back each time it processes a LINE, which means that any data that gets split over a line break (perhaps by someone's favorite XML editor) requires you to implement your own buffering to tie things back together. Since the four lines of my email signature are a preference, it's important to me that any preference system handle multi-line strings correctly.

And finally, <MARKETTING> If you use XML, you get to say "... and it uses industry-standard XML files" and you get to watch the industry beat a path to your door because your product is *obviously superior*. </MARKETTING> ;-)

Jeff Laing <je...@sp...>
-------------------------------------------------------------------------------
"Meetings. Don't we love meetings? Every day. Twice a day. We talk." He got on one elbow. "I bet if I blew the conch this minute, they'd come running. Then we'd be, you know, very solemn, and someone would say we ought to build a jet, or a submarine or a TV set." -- William Golding, "Lord of the Flies"
|
From: Alen L. <ale...@cr...> - 2003-12-17 07:47:46
|
From: "Jeff" <je...@sp...> > And didn't it raise a firestorm this time, with an inordinate number of > people who didn't seem to be able to understand that a file format for a > PREFERENCES system need not have the same requirements as those for a 3D > LEVEL loader... It was me who asked what people have to say on using XML for everything, so this is not going in the wrong direction, it merely forked. It is hardly noticeable, but in the mean time, the subthread got " but has nothing to do with them anymore" suffix in the subject. ;) Alen |
From: Nicolas R. <nic...@fr...> - 2003-12-17 08:53:56
|
Hi,

Just re-read the entire thread, but it's getting too long (at least for me) and I couldn't find any suitable answer on XML pros and cons.

As already stated, XML is just a way of storing things. Just like some kind of "dynamically extensible binary format" in text form. There is one word here which is (in my sense) the most important one (at least for preferences ! :) ): EXTENSIBLE. It means that you can add/remove any kind of data/command/tag without having to rewrite everything.

Let's take an example. You have the following data in memory:

struct C { ...Some data... };
struct B { ...Some more data... C m_c; };
struct A { ...Still data... B m_b; C m_c; };

Now you want to read/write several A entries from a persistent location:

- simple "raw" file:
  Pros: fastest possible.
  Cons: breaks as soon as the C or B structure has changed (you have to "patch" all files containing some A data as well, but how would one know? => you need some kind of "big-data-boss" knowing what needs to be "re-saved").

- version tagged file: you write some kind of "version" value for each structure, and each structure has its own loading/saving functions (usually virtual generic load/save in C++). Loading depends on the version read; saving always writes the latest version.
  Pros: fast. Backward compatibility.
  Cons: becomes a real mess when you have 6/7 versions around to support. No forward compatibility (a v2.1 file can't be read by a v2.0 program).

- property tagged file: you write each data entry with its own "tag" (basically a "property name", a "property type" and its value). Unknown tags are discarded.
  Pros: almost the same speed as the version tagged file. Full backward/upward compatibility.
  Cons: n times slower than the "raw" reading.

Here comes the XML: XML provides a completely "normalized" way of doing solution 3 in a portable, easily editable way (nice for debugging). We have completely dropped solution 2. Solution 1 is kept for large binary intrinsic formats (textures, mesh-data, ...).
For preferences this is a good choice since it must be fully upward compatible (I simply hate those programs that reinitialize preferences at update), fully backward compatible (I can send my preferences to a friend who is not having the exact same version), and editable by anyone (don't you just hate when you select a crashing config, reloaded each time you launch the game, making impossible to change it back, without reinstalling the game... Saw one such game earlier this month !). Preferences don't need to be that fast (for us, it appears that time spent in CreateFile is bigger than in any other parts of the code during preference loading). We did prefer another choice of script-form preferences (mainly for code reusage reasons) ... But hey, doing: Rendering-prefs { api = OpenGL; width = 1600; height = 1200; } Or <RENDERING> <PROP name="api">OpenGL</PROP> <PROP name="width">1600</PROP> <PROP name="height">1200</PROP> </RENDERING> Is quite the same anyway ! :) Nicolas. |
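The "property tagged" idea above (each entry carries its own name, and unknown names are skipped) can be sketched in C roughly as follows. This is only an illustration under stated assumptions: the `render_prefs` struct, the `apply_property` and `load_prefs` helpers, and the simplified `name = value;` syntax (without the surrounding braces) are hypothetical and are not taken from anyone's actual library.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical preference record; field names mirror the example in the post. */
struct render_prefs {
    char api[32];
    int  width;
    int  height;
};

/* Apply one "name = value" property.
   Returns 1 if the name was recognised, 0 if it was unknown and skipped. */
int apply_property(struct render_prefs *p, const char *name, const char *value)
{
    if (strcmp(name, "api") == 0) {
        strncpy(p->api, value, sizeof p->api - 1);
        p->api[sizeof p->api - 1] = '\0';
        return 1;
    }
    if (strcmp(name, "width") == 0)  { p->width  = atoi(value); return 1; }
    if (strcmp(name, "height") == 0) { p->height = atoi(value); return 1; }
    return 0; /* unknown tag: ignore it, so files from other versions still load */
}

/* Parse a sequence of "name = value;" entries.
   Returns the number of properties that were recognised and applied. */
int load_prefs(struct render_prefs *p, const char *text)
{
    char name[64], value[64];
    int applied = 0;

    for (;;) {
        int consumed = 0;
        /* %n records how many characters were read, but only if the ';' matched. */
        if (sscanf(text, " %63[^ \t\n=;] = %63[^;];%n", name, value, &consumed) != 2
            || consumed == 0)
            break;
        applied += apply_property(p, name, value);
        text += consumed;
    }
    return applied;
}
```

Fields that do not appear in the file keep whatever defaults the caller put in the struct beforehand, which is what lets an older file load into a newer program; skipping unknown names is what lets a newer file load into an older one.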
From: Brett B. <res...@ga...> - 2003-12-17 09:17:29
|
Perhaps the correct question is not to debate the merits of XML, but rather to ask "Who is _already_ using XML anyway?" If most of the developers already use XML and are comfortable with it, a portable prefs library that doesn't use XML will likely put people off simply because they could whip up their own in XML. If most people are _not_ using XML then it seems clear a simple INI style is the only logical choice.

We are already using XML and we would seriously consider the portable library if it used XML; we would likely not use it if it is INI based.

Who else is using XML already and who isn't?

Brett
|
From: Brian H. <ho...@py...> - 2003-12-17 12:07:33
|
> the developers already use XML and are comfortable with it, a > portable prefs library that doesn't use xml will likley put people > off simply because they could whip up their own in xml. Ironically enough, the thing that originally started all this (portable prefs library) was intended to use system preferences and thus make the whole concept of XML/file formats rather moot. Under Windows, it would use the registry, and under OS X, it would use Preferences (which, doubly ironically, are in XML), and under Linux, some simple text file format (not XML, because then it starts getting rather large). Things have been inverted for the library quite a bit. Specifically, the preferences aspect is now decoupled from the properties aspect, and the pref lib is simply there to serialize/deserialize properties into a property list. Brian |
From: Brian H. <ho...@py...> - 2003-12-22 00:18:29
|
Not looking to start a religious war here, but outside of "it doesn't work on other operating systems", what were the complaints (if any) about the Mac-style file/creator type IDs? Did any other operating systems use these (NeXTStep?). Brian |
From: Brian H. <ho...@py...> - 2003-12-23 17:22:15
|
I've been doing a high-level analysis of different scripting languages, and I was surprised that, on the surface, JavaScript seems to have a lot in common with Lua: - typeless/loosely typed - fundamental type is table - prototype based - C-like syntax - garbage collected - relatively easy to embed Does anyone know of the major differences between the languages other than performance (JavaScript is much slower) and minor language differences like JavaScript's exception handling? Brian |
From: Brian H. <ho...@py...> - 2003-12-26 18:04:17
|
About a year and a half ago there was a fairly major brouhaha on the algorithms list about this line of code:

int x = * ( int * ) &somefloat;

Now, let's push aside endianess and size issues, the concern was that since there was "type-aliasing" that Something Bad could happen. Something Bad, of course, being a rather ambiguous statement.

I'm aware of all the bad things that can happen if you have type-aliasing in conjunction with pointer aliasing, which is related, but that one line above doesn't seem like it should be bad _with a legal C compiler_.

The major concern are optimizations that the compiler may make that affect order. For example:

somefloat = 1.0f;
x = * ( int * ) &somefloat;

In theory, a heavily optimizing C compiler would see that the assignment to somefloat should not affect the assignment to x since they are incompatible types, which may allow it to decide to assign to somefloat _after_ the assignment to x.

But that would be illegal. The C specification states that the end of every expression is a sequence point, and thus the assignment to somefloat MUST be flushed before any subsequent statements are executed.

So, I can buy that the aliasing thing is a serious concern if there is concern about a C compiler aggressively optimizing and doing it incorrectly, but that doesn't seem to be the argument.

Of course, granted, using a union makes more sense and is a bit cleaner, I'm fine with that:

union
{
    int i;
    float somefloat;
} u;

u.somefloat = 1.0f;
x = u.i;

But according to the C standard, the above is undefined ("If the value being stored in an object is accessed from another object that overlaps in any way the storage of the first object, then the overlap shall be exact and the two objects shall have qualified or unqualified versions of a compatible type; otherwise, the behavior is undefined.").

Anyone have something more authoritative on this issue?

Brian
|
From: Eero P. <epa...@ko...> - 2003-12-26 19:43:12
|
Brian Hook wrote: > About a year and a half ago there was a fairly major brouhaha on the > algorithms list about this line of code: > > int x = * ( int * ) &somefloat; > > Now, let's push aside endianess and size issues, the concern was that > since there was "type-aliasing" that Something Bad could happen. > Something Bad, of course, being a rather ambiguous statement. > > I'm aware of all the bad things that can happen if you have > type-aliasing in conjunction with pointer aliasing, which is related, > but that one line above doesn't seem like it should be bad _with a > legal C compiler_. > > The major concern are optimizations that the compiler may make that > affect order. For example: > > somefloat = 1.0f; > x = * ( int * ) &somefloat; > > In theory, a heavily optimizing C compiler would see that the > assignment to somefloat should not affect the assignment to x since > they are incompatible types, which may allow it to decide to assign to > somefloat _after_ the assignment to x. > > But that would be illegal. The C specification states that the end of > every expression is a sequence point, and thus the assignment to > somefloat MUST be flushed before any subsequent statements are > executed. > I have understood sequence points quite differently... I see them as rules on what I should do, not as promises what the compiler will do. It is true that if I obey the rules the compiler will (hopefully) provide me the illusion that it is also doing the same. Still expecting that some value actually gets written to memory at a certain time based on the sequence points is IMHO incorrect. For actually getting values really written out you need a Voodoo priest, three (black) chickens and the volatile keyword. I think specifically the compiler according to the latest rules has the right to really fool with your code, because according to the rules the float and integer values cannot be related to each other anyways, so why would the compiler care about ordering. 
> Of course, granted, using a union makes more sense and is a bit > cleaner, I'm fine with that: > > union > { > int i; > float somefloat; > } u; > > u.somefloat = 1.0f; > x = u.i; > > But according to the C standard, the above is undefined ("If the value > being stored in an object is accessed from another object that > overlaps in any way the storage of the first object, then the overlap > shall be exact and the two objects shall have qualified or unqualified > versions of a compatible type; otherwise, the behavior is > undefined."). > I also pointed this at the algorithms list, apparently the union trick is something which is specifically ok with gcc, but I agree it is not a good general solution. I think there is a valid answer through the pointer manipulation though. I have heard that (someday I must purchase the standard instead of relying on hearsay) the new rules consider char * special so that the compiler will not do any aliasing optimisation around it. So converting your float variable address to char pointer, and constructing the integer through it should be the correct way. I just tried and gcc seems to stop the aliasing warning even if I only do this: float s=t; int x= *(int *)(char *)&s; return x; But I am certainly not sure if that is really enough. (Although both pointer conversions separately should be ok) > Anyone have something more authoritative on this issue? > Ooops, well you got my opinion at least... Eero |
From: Brian H. <ho...@py...> - 2003-12-26 20:00:53
|
> I think specifically the compiler according to the latest rules has
> the right to really fool with your code, because according to the
> rules the float and integer values cannot be related to each other
> anyways, so why would the compiler care about ordering.

That's my understanding as well. The confusion (and arguments) seem to stem from who believes the Final Authority is. Some argue that the standard is clear; others argue that compilers have more flexibility during optimization; etc. etc.

> I also pointed this at the algorithms list, apparently the union
> trick is something which is specifically ok with gcc, but I agree
> it is not a good general solution.

I think I read somewhere that if you access the data through the union (i.e. not indirectly through a pointer to a union member) there will not be an aliasing problem, i.e.:

union u
{
    int x;
    float f;
};

union u a;

a.f = 1.0f;
y = a.x;

The above should be fine (but again, the standard is not clear on this, at least that I could tell -- in fact, the phrase "type punning" only appears once in my copy of the draft C99 standard, and the substring "alias" only appears four times).

That said, from 6.5, paragraph 7:

"An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
.. .. ..
- an aggregate or union type that includes one of the aforementioned types [Ed: compatible types] among its members (including, recursively, a member of a subaggregate or contained union)"

so:

int y = a.x;

Is obviously fine. But it's not clear how that relates if you've just stored to a.f.

> I think there is a valid answer through the pointer manipulation
> though. I have heard that (someday I must purchase the standard
> instead of relying on hearsay)

$18 from ansi.org. The C++ standard is $183. I think that sums up the differences between the two languages very succinctly =)

> the new rules consider char *
> special so that the compiler will not do any aliasing optimisation
> around it.

This has been around since the 1989 standard I believe, and I _think_ it has been extended to void * as well. I _think_ that the following will obviate any aliasing concerns:

int x = * ( int * ) ( ( void * ) &f );

Because a pointer of type "char *" or "void *" may have no assumptions made about aliasing.

> So converting your float variable address to char pointer, and
> constructing the integer through it should be the correct way.

That is my belief as well. However I see people present the union trick more often, which concerns me.

Brian
|
From: Brian H. <ho...@py...> - 2003-12-26 20:51:36
|
On Fri, 26 Dec 2003 21:43:52 +0200, Eero Pajarre wrote:
> Brian Hook wrote:
>
>> About a year and a half ago there was a fairly major brouhaha on
>> the algorithms list about this line of code:
>>
>> int x = * ( int * ) &somefloat;
>>
>> Now, let's push aside endianess and size issues, the concern was
>> that since there was "type-aliasing" that Something Bad could
>> happen. Something Bad, of course, being a rather ambiguous
>> statement.
>>
>> I'm aware of all the bad things that can happen if you have type-
>> aliasing in conjunction with pointer aliasing, which is related,
>> but that one line above doesn't seem like it should be bad _with
>> a legal C compiler_.
>>
>> The major concern are optimizations that the compiler may make
>> that affect order. For example:
>>
>> somefloat = 1.0f;
>> x = * ( int * ) &somefloat;
>>
>> In theory, a heavily optimizing C compiler would see that the
>> assignment to somefloat should not affect the assignment to x
>> since they are incompatible types, which may allow it to decide
>> to assign to somefloat _after_ the assignment to x.
>>
>> But that would be illegal. The C specification states that the
>> end of every expression is a sequence point, and thus the
>> assignment to somefloat MUST be flushed before any subsequent
>> statements are executed.
>>
> I have understood sequence points quite differently...
>
> I see them as rules on what I should do, not as promises what the
> compiler will do.

From 5.1.2.3.2:

"Accessing a volatile object, modifying an object, modifying a file, or calling a function that does any of those operations are all side effects,10) which are changes in the state of the execution environment. Evaluation of an expression may produce side effects. At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have taken place."

That seems pretty clear to me, which is the basic fundamental rule as to why the whole aliasing thing has been somewhat puzzling to me. The language seems to state that:

f = 1.0f;
y = * ( int * ) &f;

The only snafu is Appendix C's list of sequence points, where it says that a "full" expression is a sequence point, but I'm not quite sure if an assignment is considered a full expression (known full expressions are expressions used in a return; selection statement; function calls; and "the expression statement in an expression").

Brian
|
From: Brian H. <ho...@py...> - 2003-12-26 21:01:20
|
> The only snafu is Appendix C's list of sequence points, where it
> says that a "full" expression is a sequence point, but I'm not
> quite sure if an assignment is considered a full expression (known
> full expressions are expressions used in a return; selection
> statement; function calls; and "the expression statement in an
> expression").

Turns out it is. An assignment expression or function call is treated as a "void expression" which is evaluated specifically for its side effects and thus should count as a full expression and, by extension, as a sequence point.

So if this chokes an optimizer:

x = * ( int * ) &f;

Due to an assignment that occurs on 'f' earlier not being done until after the assignment to 'x', then it is the optimizer's fault, and not the programmer's.

Not to say that a programmer shouldn't be aware of this stuff and deal with it properly, of course =)

Brian
|
From: Tom H. <to...@ve...> - 2003-12-26 22:10:50
|
At 01:01 PM 12/26/2003, you wrote: >Turns out it is. An assignment expression or function call is treated >as a "void expression" which is evaluated specifically for its side >effects and thus should count as a full expression and, by extension, >as a sequence point. > >So if this chokes an optimizer: > >x = * ( int * ) &f; > >Due to an assignment that occurs on 'f' earlier not being done until >after the assignment to 'x', then it is the optimizer's fault, and not >the programmers. Maybe someone else on the list can confirm what we've been talking about off-list. This problem only exists if the "Assume no Aliasing" compiler optimization is turned on. My understanding is that this optimization tells the compiler that it is ok to break the rules. If the programmer then uses aliasing after telling the compiler to assume he isn't, then I would call this a programmer error ;) The sticky part is, I believe this optimization is defaulted to on in most release build configurations, so it's one of those things that most programmers probably aren't aware of. In one of the many threads on this subject, it has been said that the C++ spec has something in it that allows for this optimization (specifically, the C++ spec says that the compiler can assume no aliasing). The end result would be that writing the above code in C++ is illegal with or without the optimization turned on. Anyone have a copy of the C++ spec that they can search and post relevant portions? >Not to say that a programmer shouldn't be aware of this stuff and deal >with it properly, of course =) Agreed. Tom |
From: Brian H. <ho...@py...> - 2003-12-26 23:05:14
|
> (specifically, the C++ spec says that the compiler can assume no > aliasing). The C spec implicitly states this as well, by itemizing the list of allowable aliases. Brian |
From: Brian H. <ho...@py...> - 2003-12-27 20:09:37
|
Oy, freakin' language lawyers. I posted to comp.std.c about this, and the followups basically focused on these two things:

- reading a float back as an int is ILLEGAL!!! UNCLEAN, WE CAST THEE OUT!!!

- I said "type alias" when the correct terminology is "type punning". I'm such a retard.

Of course, no one actually answered the question. I restated, I'll see what nits they pick with my new question. *sigh*

Brian
|
From: Colin F. <cp...@ea...> - 2003-12-28 06:14:11
|
>>> Oy, freakin' language lawyers. I posted to comp.std.c about >>> this, and they followups basically focused on these two things: >>> >>> - reading a float back as an int is ILLEGAL!!! UNCLEAN, WE CAST >>> THEE OUT!!! >>> >>> - I said "type alias" when the correct terminology is >>> "type punning". I'm such a retard. In the End Days they will be judged as they have judged, and there will be weeping and gnashing of teeth. Yea, even the floats will be read back as ints in the vortex of chaos, where the C specification will be engulfed in a lake of fire until time itself is vaporized. Or maybe not. In any case, Google seems to have a lot of relevant links. (keywords: "type punning") I won't pretend to understand the subject, but these links looked interesting: http://www.talkaboutprogramming.com/group/comp.lang.c.moderated/messages/26015.html http://www.ethereal.com/lists/ethereal-dev/200309/msg00342.html http://wwwold.dkuug.dk/JTC1/SC22/WG14/www/docs/dr_283.htm Hee, hee! "clean up a bunch of type punning warnings" in Quake 2: http://www.quakeforge.net/cgi-bin/viewcvs.cgi/quake2/src/cmd.c http://zuul.quakeforge.net/list-archives/quake2-cvs/2003-July/000243.html |
From: Neil S. <ne...@r0...> - 2003-12-28 14:26:24
|
> Of course, no one actually answered the question. I > restated, I'll see what nits they pick with my new question. *sigh* You won't get a proper answer because they have nothing that will help you. The C and C++ standards are retarded in the sense that they have, increasingly over the years, tried to improve platform-independence by tightening up binary compatibility issues, which is a pretty fruitless exercise. If you really want binary compatibility, then you have to accept that the basic types will have to be the same on all platforms, and that's something they are not prepared to accept, as it will hurt performance on non-conforming systems. So you end up with a set of rules based on the notion that you cannot assume very much about anything at all, such as the relative alignment issues of floats and ints, or even crazy things like a 32 bit int not necessarily being able to represent the data in a 32 bit float. For example, the three 'issues' that were mentioned on comp.std.c *do not actually happen* on any implementations they know of. In other words, the law is an ass. If the standards committee wanted to be actually helpful, they would have specified exceptions to these rules, such as: when the two types have identical size and alignment, type punning is well-defined. This might reduce theoretical binary compatibility, but will not harm actual binary compatibility one bit, and will at least provide some rules which are compatible with what people actually do. The real insult is that idioms like *(int *)&f are commonly used, and generally handled as expected (by users not the standard) by most compilers. This is a case where the real standard is the standard which actually exists in practice and not the standard that some academics have made up. Incidentally, when someone starts asking you to define 'works', you know nothing good will come out of the discussion. ;) - Neil. |
From: Brian H. <ho...@py...> - 2003-12-28 18:02:43
|
> a pretty fruitless exercise. If you really want binary
> compatibility, then you have to accept that the basic types will
> have to be the same on all platforms, and that's something they are
> not prepared to accept, as it will hurt performance on non-
> conforming systems.

Spot on. And I don't mind that a lot of operations are undefined or implementation defined, but what bothers me is the head-in-the-sand "we don't talk about these things" attitude. It is pretty much impossible to write a portable C program that does anything useful.

> If the standards committee wanted to be actually helpful, they
> would have specified exceptions to these rules, such as: when the
> two types have identical size and alignment, type punning is well-
> defined.

Actually, if they had simply made it "implementation defined" with some caveats, that would have helped, because in practice this is all implementation defined and not truly "undefined". "Undefined" has connotations of causing your computer to explode.

> The real insult is that idioms like *(int *)&f are commonly used,
> and generally handled as expected (by users not the standard) by
> most compilers.

SHHHHHHHHHHHH!

> This is a case where the real standard is the
> standard which actually exists in practice and not the standard
> that some academics have made up.

Well, I wouldn't go that far, because in reality there are too many areas where what works in practice WILL explode on other systems. The x86 has made this an unfortunate problem. A common example is:

struct foo
{
    char b; /* assume tight packing */
    int x;
};

char buffer[ 1024 ];
struct foo *f = ( struct foo * ) buffer;
int y;

fread( buffer, 1, sizeof( buffer ), fp );
y = f->x; /* misaligned access, fine on x86, crashes on, say, SPARC */

So I understand their rather dogmatic desire to make sure everyone follows the rules, but I don't like the attitude that "writing real software is unclean, even if you know what rules you're breaking".

But I digress...the one thing I can definitely say I've taken away from this is that when someone yells at me for type-alias violation, I'll just turn around and say that code is undefined and therefore THEY'RE WRONG TOO, HAH! =)

Anyway, near as I can tell, if you want the int bits, and you can ASSUME that sizeof( float ) == sizeof( int ) and you can ASSUME a certain endianness, then the following should work no matter what:

float f = 1.0f;
int i = 0;
unsigned char *c = ( unsigned char * ) &f;

if ( sizeof( f ) != sizeof( i ) )
    explode();

/* do some endianness verification as well */
.. .. ..

/* the top byte is shifted as unsigned so a set high bit never overflows a signed int */
i = ( int ) ( ( ( unsigned ) c[ 0 ] << 24 ) | ( c[ 1 ] << 16 ) | ( c[ 2 ] << 8 ) | ( c[ 3 ] ) );

Brian
|
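For comparison with the byte-by-byte read above, here is a minimal sketch of another commonly used way to get at the bits: copying the object representation with `memcpy`. This technique is not proposed anywhere in the thread; it is swapped in here as an illustration. The helper names `float_bits` and `bits_to_float` are hypothetical, and the sketch assumes a 32-bit `float` (the same size assumption the post makes).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the object representation of a float as a 32-bit unsigned integer.
   The bytes are copied in the machine's native order, so no endianness
   assumption is needed as long as float and uint32_t share one byte order. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    assert(sizeof f == sizeof bits);   /* assumption: float is 32 bits wide */
    memcpy(&bits, &f, sizeof bits);    /* copies bytes; no int lvalue ever reads a float object */
    return bits;
}

/* The reverse direction: reinterpret a 32-bit pattern as a float. */
float bits_to_float(uint32_t bits)
{
    float f;
    assert(sizeof f == sizeof bits);
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

On a platform with IEEE 754 single-precision floats, `float_bits(1.0f)` would be expected to yield `0x3F800000`; on other floating-point formats the pattern differs, which is the portability caveat the thread keeps returning to.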
From: Neil S. <ne...@r0...> - 2003-12-29 02:27:08
|
> Actually, if they had simply made it "implementation defined" > with some caveats, that would have helped, because in > practice this is all implementation defined and not truly > "undefined". "Undefined" has connotations of causing your > computer to explode. Thankfully, compiler vendors do generally define these things, although they might not always put them in the manual. Perhaps this has something to do with the fact that they have someone to answer to (their customers) if they don't provide useful functionality. > Well, I wouldn't go that far, because in reality there are > too many areas where what works in practice WILL explode on > other systems. The > x86 has made this an unfortunate problem. That's why I said "this is a case where", rather than making a blanket statement. ;) > So I understand their rather dogmatic desire to make sure > everyone follow the rules, but I don't like the attitude that > "writing real software is unclean, even if you know what > rules you're breaking". Thing is, they could make it clean if they had any interest at all in practical matters, but they don't, so I have no sympathy for them. > But I digress...the one thing I can definitely say I've taken > away from this is that when someone yells at me for > type-alias violation, I'll just turn around and say that code > is undefined and therefore THEY'RE WRONG TOO, HAH! =) Or you could ask them to come up with a way of doing the same thing cleanly, with no loss in performance, and see how far they get. ;) - Neil. |
From: Alen L. <ale...@cr...> - 2003-12-29 08:14:03
|
> Or you could ask them to come up with a way of doing the same thing cleanly, > with no loss in performance, and see how far they get. ;) No need to mention performance at all. Does anyone know of a way to test for infinity, nan and denormalized, or to recompress a float into different word size (float16 e.g.) - without reinterpreting the float as an int? Alen |
From: Pierre T. <pie...@no...> - 2003-12-29 08:39:34
|
> No need to mention performance at all. Does anyone know of a way to test for > infinity, nan and denormalized, or to recompress a float into different word > size (float16 e.g.) - without reinterpreting the float as an int? Yep: int _fpclass(double); BTW I tend to write this: int x = ( int & ) f; instead of: int x = * ( int * ) &f; Any known difference ? Pierre |
From: Alen L. <ale...@cr...> - 2003-12-29 11:30:58
|
> Yep: > int _fpclass(double); Yes of course. But, the purist argument was "this is not portable". Heck, I think checking out the bits of *(int*)&f is much more portable than _fpclass(). ;) > BTW I tend to write this: > > int x = ( int & ) f; > > instead of: > > int x = * ( int * ) &f; > > Any known difference ? AFAIK, references are just syntactic sugar, so I see no reason why there should be a semantic difference. But then again, recasting across a char* before casting to int* shouldn't make a difference either (logically), but according to the standard it does. :/ Anyway, using something like: inline int &FloatAsInt(float &f) { return (int&)f; } should allow you to fix it by putting a lot of #ifs in there if you have problems with particular compiler(s) later. -- Alen |
From: Brian H. <ho...@py...> - 2003-12-29 16:45:25
|
> int _fpclass(double); Not portable. The C99 standard introduced classification as a standard macro: int fpclassify(double x); Which returns one of the predefined classification macros (FP_INFINITE, etc.). These are expected to be in <math.h> I'm hoping/assuming that even "lightly compliant" C99 compilers will have these in them. It also specifies isfinite(), isnan(), etc. There is no portable way to recompress to a float16, since float16 is not portable. Brian |
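A short sketch of how the C99 classification macros mentioned above might be used to answer the original question (testing for infinity, NaN and denormals without looking at the bits). The wrapper names `is_infinite`, `is_nan_value` and `is_denormal` are hypothetical; `fpclassify`, `isinf`, `isnan` and `FP_SUBNORMAL` are the standard `<math.h>` names, and the sketch assumes a compiler that provides them.

```c
#include <math.h>

/* Each wrapper returns 1 or 0 so the results can be compared directly. */

int is_infinite(double x)
{
    return isinf(x) != 0;
}

int is_nan_value(double x)
{
    return isnan(x) != 0;
}

/* C99 calls denormalized numbers "subnormal". */
int is_denormal(double x)
{
    return fpclassify(x) == FP_SUBNORMAL;
}
```

These macros are type-generic in C99, so the same calls would also accept a `float` or `long double` argument; `double` is used here only to keep the sketch short.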
From: Alen L. <ale...@cr...> - 2003-12-30 07:04:47
|
>There is no portable way to recompress to a float16, since float16 is >not portable. Surely, most hardware cannot work with float16 natively, but it is still useable for lossy compression of some data. As long as I define what I want my float16 to look like, it is not less portable than any other user defined type. Generally, trying to look at a float as if it's some kind of a black box that stores real numbers, is kinda silly. You have to be aware of its internal representation, or you can't work with it (without getting yourself into various precision losses, etc). And once you are aware, you should be able to access it. But I believe we all agree on that, and it's the standard that has a wrong approach here. Alen |
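To make the "user-defined float16" point concrete, here is a rough sketch of one possible 16-bit layout (1 sign bit, 5 exponent bits, 10 mantissa bits) and a conversion to and from it. Everything here is an assumption made for illustration: the layout, the function names `float_to_half` and `half_to_float`, and the simplifications (the mantissa is truncated rather than rounded, and values too small for the format are flushed to zero). It also assumes the source `float` is a 32-bit IEEE 754 value, and it copies the bits with `memcpy` rather than a pointer cast.

```c
#include <stdint.h>
#include <string.h>

/* Compress a 32-bit float into a 16-bit value: 1 sign, 5 exponent, 10 mantissa bits. */
uint16_t float_to_half(float f)
{
    uint32_t bits;
    uint16_t sign;
    int32_t  exponent;
    uint32_t mantissa;

    memcpy(&bits, &f, sizeof bits);

    sign     = (uint16_t)((bits >> 16) & 0x8000u);
    exponent = (int32_t)((bits >> 23) & 0xFFu);      /* biased by 127 */
    mantissa = bits & 0x007FFFFFu;

    if (exponent == 0xFF)                            /* infinity or NaN */
        return (uint16_t)(sign | 0x7C00u | (mantissa ? 0x0200u : 0u));

    exponent = exponent - 127 + 15;                  /* rebias for the 5-bit exponent */
    if (exponent >= 0x1F)
        return (uint16_t)(sign | 0x7C00u);           /* too large: becomes infinity */
    if (exponent <= 0)
        return sign;                                 /* too small: flushed to signed zero */

    /* Keep the top 10 mantissa bits; the low 13 are simply dropped (lossy). */
    return (uint16_t)(sign | ((uint32_t)exponent << 10) | (mantissa >> 13));
}

/* Expand the 16-bit value back into a 32-bit float. */
float half_to_float(uint16_t h)
{
    uint32_t sign     = ((uint32_t)h & 0x8000u) << 16;
    uint32_t exponent = ((uint32_t)h >> 10) & 0x1Fu;
    uint32_t mantissa = (uint32_t)h & 0x03FFu;
    uint32_t bits;
    float f;

    if (exponent == 0x1F)
        bits = sign | 0x7F800000u | (mantissa << 13);   /* infinity or NaN */
    else if (exponent == 0)
        bits = sign;                                    /* zero (denormals are not produced above) */
    else
        bits = sign | ((exponent - 15 + 127) << 23) | (mantissa << 13);

    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Values that fit exactly in 10 mantissa bits, such as 0.5 or 1.0, survive a round trip unchanged; anything else loses its low 13 mantissa bits, which is the lossy compression being described.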
From: Eero P. <epa...@ko...> - 2003-12-30 11:18:58
|
Alen Ladavac wrote:
> Generally, trying to look at a float as if it's some kind of a black box that
> stores real numbers, is kinda silly. You have to be aware of its internal
> representation, or you can't work with it (without getting yourself into
> various precision losses, etc). And once you are aware, you should be able to
> access it. But I believe we all agree on that, and it's the standard that has
> a wrong approach here.
>

I think that the standard has the aliasing rules, not because the committee wanted to deny float->integer-bit-pattern conversions but because the committee wanted the compiler to produce better code in some other common situations.

For example, for some floating point vector library it is useful for the compiler to be sure that none of the limit/index integer values change when the floating point results are written out. I think the aliasing rules make it easier to compete with Fortran compilers here.

The problem with

int x=*(int *)&f;

is sort of a side effect here. I don't really know for sure why there must be a problem, because the aliasing of the values is sort of obvious, and could be deduced separately. Maybe the committee just wanted to keep the definitions simple?

Eero
|
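The vector-library situation described above can be illustrated with a small sketch. The function name `scale_in_place` is hypothetical, and whether a particular compiler actually performs the optimisation mentioned in the comment is an assumption; the code only shows the kind of loop where the rule gives the compiler that freedom.

```c
/* Scale the first *count elements of v in place. */
void scale_in_place(float *v, const int *count, float k)
{
    int i;

    /* Each iteration stores through a float lvalue. Under the aliasing rules
       such a store cannot legally modify the int object that count points to,
       so a compiler is allowed to load *count once before the loop instead of
       re-reading it after every store to v[i]. */
    for (i = 0; i < *count; ++i)
        v[i] *= k;
}
```

If the rules allowed a `float *` and an `int *` to refer to the same storage, the limit would have to be reloaded on every iteration just in case a store to `v[i]` had changed it.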