RE: [Algorithms] Network & byte order.
From: Aaron D. <ri...@ho...> - 2000-08-18 21:38:39
While it doesn't really matter which byte order you choose so long as both
sides use it, network standard byte order is big-endian. There are a number
of C functions (often implemented as macros) that convert short and long
values to and from this format: htons, htonl, ntohs and ntohl, from memory.

Floats, as mentioned by Pierre, can be a bit more of a headache. I generally
tend to avoid them if at all possible. When I've needed decimal values
they've usually been of fixed precision, so I've simply sent them as
integers, using a common divisor on both sides.

> -----Original Message-----
> From: gda...@li...
> [mailto:gda...@li...]On Behalf Of Kent Quirk
> Sent: Friday, August 18, 2000 11:50 PM
> To: gda...@li...
> Subject: Re: [Algorithms] Network & byte order.
>
> Lionel Fumery wrote:
> > We are designing our network libraries for our next games. We would
> > like to produce cross-platform games, with miscellaneous processor
> > targets.
> >
> > In the case of a multi-platform network game, we wonder if we have to
> > consider the byte ordering of the platforms... Intel is little-endian,
> > whereas Apple (Motorola) is big-endian.
> > Can anybody tell us which platforms are little-endian and which are
> > big-endian?
> >
> > If all our target platforms are little-endian, we could avoid this
> > byte-swapping and keep some CPU time for something else...
>
> Compared to the time spent on the network, the amount of time you'll
> spend byte-swapping is so microscopic as to be invisible.
>
> General rule of thumb:
> Don't expect to write a multibyte value as a stream of binary bytes on
> one platform and expect to read it in on another and have it work.
> Define your formats in a way that's independent of byte order. Either
> use a text value (XML, for example) or if you need to keep the data at
> minimal size (and in modem-based networking you usually do) then define
> your data formats at the byte level.
>
> Don't say:
>
> "The header consists of a 4-byte unsigned int packet ID."
>
> say:
>
> "The first 4 bytes of the header are a packet ID, sent as a four byte
> integer, least-significant byte first."
>
> Then it's unambiguous what you're doing.
>
> With that said, I just found some comments in the header of one of the
> files on our file format (called CHUFF) in MindRover. They were written
> by Nat Goodspeed, who works here:
>
> ------------------------------------
> WARNING! For efficiency reasons, the read/write implementations for
> types such as 'bin4' are implemented by directly examining the storage
> used for the native-type variable. This is fast, but is inherently
> platform-sensitive. CHUFF data types are little-endian by definition (so
> that we can have some hope of exchanging files between different
> platforms). Therefore, when you port this implementation to a big-endian
> machine, make SURE you define 'HIBYTE1ST' as one of the compiler's
> command-line switches!
>
> Our byte-swapping big-endian implementations assume that it's still
> cheaper to make a single I/O method call for the full size of the value,
> exchanging bytes using temporary variables in memory, than it is to break
> out separate I/O operations for each byte. That may not be entirely true.
> But one advantage of this scheme is that on input, we can still test for
> EOF on a single call, rather than having to test separately for each
> byte.
>
> There are two different philosophical approaches to implementing a
> cross-platform binary format, that is, one such as ours, in which (for
> instance) bin4 must be read and written as little-endian, regardless of
> the byte order in which the platform on which we're running normally
> stores its binary integers.
>
> Convert on Use
> --------------
> One approach is to implement a family of classes that literally define
> the way the storage will be used. For instance, bin4 could be defined as
> a class which always contains a little-endian binary integer value.
> We would then define conversions to and from ordinary binary integers,
> arithmetic and logical operations, etc., so that any operation on a bin4
> object results in a little-endian value in memory.
>
> The advantage of this strategy is that such fields can apparently be
> composed into structs that describe the actual byte stream. In theory you
> can then instantiate such a struct, populate some or all of its fields
> and just write it out -- or, conversely, read the struct in its entirety
> (or even just map the struct onto part of a previously-read buffer) and
> then just reference some of its fields.
>
> In practice, this is complicated considerably by the need to worry about
> platform-dependent struct alignment requirements. But you can still build
> it, even though you sometimes end up having to define the actual data as
> an array of bytes to bypass automatic compiler alignment.
>
> With this approach, you need to spend considerable development time on
> each individual field type; it must support the full suite of arithmetic
> and logical operations you intend to use. Those operations are, of
> course, somewhat more expensive than operations on the corresponding
> native type. But this can still be a win if:
>
> (a) there are very many more cross-platform structs than there are field
> types. The whole rationale for this approach is that you do NOT need to
> implement read/write methods for each different struct; composing such
> fields into structs should then permit the structs to be transparently
> used on a byte stream.
>
> (b) there are very many more fields in a typical cross-platform struct
> than you actually use. (In such a case, you might consider redesigning
> your protocol, since it appears to be wasteful of space!)
> But if you have to live with a protocol definition like that, the
> tradeoff might work in your favor: with these fields, you pay for the
> conversion each time you use them, but you don't have to pay for
> converting fields that you don't use at all.
>
> (c) for some reason, you need random access to parts of a buffer. For
> instance, you are filling a transmission buffer with such structs, but
> the protocol requires a header struct that describes how many other
> structs follow it, and it would be expensive or impossible to determine
> that number in advance. If you have a pointer to the header struct in the
> buffer, you can simply patch the count field on the fly.
>
> Convert on I/O
> --------------
> The other approach is to define fields that store values much like native
> C++ types, so that it's reasonably easy and cheap to perform arithmetic
> and logical operations on them, but each field knows how to serialize and
> deserialize itself to a data stream.
>
> Since the conversion of each field to and from a byte stream is explicit,
> you have explicit control over such things as alignment, rather than
> worrying about what the compiler might be doing behind your back. This
> approach also allows you to use C++ classes with virtual functions, which
> you can't do with a convert-on-use mechanism since the VFT pointer is
> part of the storage occupied by each class object.
>
> The drawback is that for each struct or class you intend to write to, or
> receive from, a cross-platform data stream, you must implement specific
> read/write methods that enumerate all the (persistent) fields in that
> struct or class. These methods must be maintained every time you change
> the set of fields in the struct/class.
>
> This can be a win if:
>
> (a) there are relatively few predefined structs in the protocol.
> Implementing a small set of read/write methods can be easier than
> implementing all the support methods for each convert-on-use field type.
>
> (b) you access the fields in your structs much more often than you
> de/serialize them from/to the data stream. You only pay for conversion at
> the time you actually read or write the fields, rather than every time
> you touch one of them.
>
> (c) your protocol allows you to write header information and proceed,
> rather than needing to go back and revisit the header to fix up one or
> more of its fields. That is, either protocol headers don't need to make
> assertions about the data that follows, or it's relatively easy to derive
> that forward-looking information.
>
> I was going to say something about dynamic composition -- the case when
> you want to read or write individual fields in an order determined at
> runtime rather than at compile time -- but actually, I think that would
> probably work out equally well either way.
>
> In any case, we use the convert-on-I/O approach. bin4 and friends store
> data very much like C long int, etc., but they know how to read and write
> themselves from/to a data stream.
>
> However, for internal purposes, we find it useful to borrow a
> convert-on-use notion: within this file, we implement a LittleEndian type
> that always maintains data in little-endian form.
>
> --------------------------
>
> Hope this helps.
>
> Kent
>
> --
> -----------------------------------------------------------------------
> Kent Quirk                   | CogniToy: Intelligent toys...
> Game Architect               | for intelligent minds.
> ken...@co...                 | http://www.cognitoy.com/
> _____________________________|_________________________________________
>
> _______________________________________________
> GDAlgorithms-list mailing list
> GDA...@li...
> http://lists.sourceforge.net/mailman/listinfo/gdalgorithms-list
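
For anyone who wants something concrete, here is a minimal sketch of the byte-level rule quoted above ("sent as a four byte integer, least-significant byte first"), together with the fixed-precision trick of sending decimals as integers with a common divisor. The names put_u32_le, get_u32_le, to_fixed, from_fixed, encode_fixed and decode_fixed, and the divisor of 1000, are assumptions made up for illustration; none of this is taken from CHUFF or from anyone's actual code. It uses shifts rather than looking at the variable's storage, so it should behave the same on little-endian and big-endian hosts.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: write a 32-bit value least-significant byte first.
// Shifts operate on the value, not on memory, so host byte order is irrelevant.
inline void put_u32_le(std::uint8_t* out, std::uint32_t v) {
    out[0] = static_cast<std::uint8_t>(v & 0xFFu);
    out[1] = static_cast<std::uint8_t>((v >> 8) & 0xFFu);
    out[2] = static_cast<std::uint8_t>((v >> 16) & 0xFFu);
    out[3] = static_cast<std::uint8_t>((v >> 24) & 0xFFu);
}

// Hypothetical helper: read back a value written by put_u32_le.
inline std::uint32_t get_u32_le(const std::uint8_t* in) {
    return static_cast<std::uint32_t>(in[0])
         | (static_cast<std::uint32_t>(in[1]) << 8)
         | (static_cast<std::uint32_t>(in[2]) << 16)
         | (static_cast<std::uint32_t>(in[3]) << 24);
}

// Assumed common divisor: both sides must agree on it (three decimal places here).
constexpr std::int32_t kScale = 1000;

// Round to the nearest multiple of 1/kScale and keep it as an integer.
inline std::int32_t to_fixed(double x) {
    return static_cast<std::int32_t>(std::lround(x * kScale));
}

inline double from_fixed(std::int32_t f) {
    return static_cast<double>(f) / kScale;
}

// Send a fixed-precision value as four bytes. The signed/unsigned casts
// assume a two's complement representation, which is an assumption before C++20.
inline void encode_fixed(std::uint8_t* out, double x) {
    put_u32_le(out, static_cast<std::uint32_t>(to_fixed(x)));
}

inline double decode_fixed(const std::uint8_t* in) {
    return from_fixed(static_cast<std::int32_t>(get_u32_le(in)));
}

// Small self-check: the wire layout is fixed regardless of the host CPU.
inline void self_check() {
    std::uint8_t wire[4];
    put_u32_le(wire, 0x11223344u);
    assert(wire[0] == 0x44 && wire[1] == 0x33 && wire[2] == 0x22 && wire[3] == 0x11);
    assert(get_u32_le(wire) == 0x11223344u);
}
```

The same pattern with the shift order reversed would give network (big-endian) order, which is what htonl and friends produce. Doing it by hand just avoids depending on a platform header, at the cost of a few more lines.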
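
The convert-on-use idea described in the quoted comments can be sketched roughly as follows. This is only an illustration under stated assumptions: LittleEndian32 and PacketHeader are hypothetical names, the real CHUFF LittleEndian type is not shown anywhere in this thread, and only one arithmetic operator is implemented where a real field type would need the full set. Storing the value as an array of bytes is the "array of bytes to bypass automatic compiler alignment" trick mentioned in the quote.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical convert-on-use field: the storage is always four bytes,
// least-significant byte first, whatever the host byte order is.
class LittleEndian32 {
public:
    LittleEndian32() : b_{0, 0, 0, 0} {}
    LittleEndian32(std::uint32_t v) { *this = v; }

    // Convert from a native integer on every store.
    LittleEndian32& operator=(std::uint32_t v) {
        b_[0] = static_cast<std::uint8_t>(v & 0xFFu);
        b_[1] = static_cast<std::uint8_t>((v >> 8) & 0xFFu);
        b_[2] = static_cast<std::uint8_t>((v >> 16) & 0xFFu);
        b_[3] = static_cast<std::uint8_t>((v >> 24) & 0xFFu);
        return *this;
    }

    // Convert back to a native integer on every load.
    operator std::uint32_t() const {
        return static_cast<std::uint32_t>(b_[0])
             | (static_cast<std::uint32_t>(b_[1]) << 8)
             | (static_cast<std::uint32_t>(b_[2]) << 16)
             | (static_cast<std::uint32_t>(b_[3]) << 24);
    }

    // One example operator; each use pays for a load and a store conversion.
    LittleEndian32& operator+=(std::uint32_t d) {
        return *this = static_cast<std::uint32_t>(*this) + d;
    }

    // The bytes exactly as they would appear in the stream.
    const std::uint8_t* bytes() const { return b_; }

private:
    std::uint8_t b_[4];  // byte array, so no alignment padding is expected
};

// Hypothetical header composed of such fields. Because every member is a
// byte array, the struct is expected to match the stream layout directly.
struct PacketHeader {
    LittleEndian32 packet_id;
    LittleEndian32 record_count;
};

// These hold on common compilers but are assumptions, so check them.
static_assert(sizeof(LittleEndian32) == 4, "unexpected padding in field");
static_assert(sizeof(PacketHeader) == 8, "unexpected padding in header");
```

With something like this, the count-patching case (c) amounts to doing `header.record_count += 1` as each record is appended. The cost is visible in operator+=, which converts in both directions on every use, and that is the tradeoff against convert-on-I/O discussed in the quote.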