[Alephmodular-devel] Copy-on-write in practice
From: Woody Z. I. <woo...@sb...> - 2003-01-14 04:36:31
I'd like to take one idea from my monstrous idea dump, copy-on-write (COW) game objects, and talk a little more about it. Remember, the goal is to be able to split off into a fake (predicted) game-state for one or more ticks, then later return to the original (real) game-state, with the game-update logic being essentially the same for predictive updates and real updates (and as similar to the current code as possible, for practical reasons). (Sounds could be sticky, but I'll put that problem off till later.)

I see (at least) three basic options.

OPTION 1. Put all the COW functionality into the lookup routines

An object a refers to another object b by its index (and, implicitly, its type), as it does currently. When a wants to read b's data-structure fields, it calls get_btype_by_index(bindex) and reads the returned structure. When a wants to write b's fields, it calls get_writable_btype_by_index(bindex) and may read or write the returned structure. The lookup routines go something like:

btype* get_writable_btype_by_index(int bindex)
{
    if(game_state_mode == predictive_mode)
    {
        // First write in predictive mode: lazily make a private copy.
        if(predictive_btypes[bindex] == NULL)
            predictive_btypes[bindex] = shallow_copy_btype(real_btypes[bindex]);
        return predictive_btypes[bindex];
    }
    else
        return real_btypes[bindex];
}

const btype* get_btype_by_index(int bindex)
{
    // Reads see the predictive copy only if one has been made.
    if(game_state_mode == predictive_mode && predictive_btypes[bindex] != NULL)
        return predictive_btypes[bindex];
    else
        return real_btypes[bindex];
}

void discard_predictive_btypes()
{
    for(int i = 0; i < predictive_btype_count; i++)
    {
        if(predictive_btypes[i] != NULL)
        {
            dispose_btype(predictive_btypes[i]);
            predictive_btypes[i] = NULL;
        }
    }
    predictive_btype_count = real_btype_count;
}

This only does the testing and lookup when asking for another object, so there's probably not a terribly great performance hit. OTOH there's great potential for disaster: e.g., if we're in predictive mode, one routine acquires a pointer via get_foo_by_index(k), then calls another routine which acquires a pointer via get_writable_foo_by_index(k) and makes some changes, and then the first routine reads fields of its foo through its old pointer, it's working with stale data.

OPTION 2. Put COW functionality in objects and field accessors

template <typename tEmulatedType>
class COWObject
{
public:
    void discardPredictiveVersion()
    {
        delete predictive_self;
        predictive_self = NULL;
    }

protected:
    const tEmulatedType* self()
    {
        return (game_state_mode == predictive_mode && predictive_self != NULL)
            ? predictive_self : real_self;
    }

    tEmulatedType* writable_self();

    tEmulatedType* real_self;
    tEmulatedType* predictive_self;
};

template <typename tEmulatedType>
tEmulatedType* COWObject<tEmulatedType>::writable_self()
{
    if(game_state_mode == real_mode)
        return real_self;
    else
    {
        if(predictive_self == NULL)
        {
            predictive_self = real_self->shallow_copy();
            CentralAuthority()->registerPredictiveObject(this);
        }
        return predictive_self;
    }
}

class Player : public COWObject<PlayerState>
{
public:
    int  getValue()            { return self()->value; }
    void setValue(int inValue) { writable_self()->value = inValue; }
};

(The CentralAuthority holds a list of all the objects with predictive state, and tells them all to discardPredictiveVersion() when the game logic asks it to.)

This way, code that uses game objects essentially just uses them, reading or writing individual fields in a natural way (one that also allows for additional 'hooking'). Each such use incurs some overhead for locating the appropriate object, but there's little to no chance of the kind of inconsistency shown in the example above.
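To make the discard path in OPTION 2 a bit more concrete, here's a minimal sketch of what the CentralAuthority might look like. The names here (COWObjectBase, PredictionAuthority, discardAllPredictiveObjects) are placeholders I made up for illustration, not anything that exists in the code; the non-templated base class is assumed so the authority can hold objects of different emulated types in one list.

#include <vector>
#include <cstddef>

// Assumed interface implemented by every COWObject<> instantiation.
class COWObjectBase
{
public:
    virtual ~COWObjectBase() {}
    virtual void discardPredictiveVersion() = 0;
};

class PredictionAuthority
{
public:
    // Called by COWObject<>::writable_self() the first time an object
    // creates its predictive copy.
    void registerPredictiveObject(COWObjectBase* inObject)
    {
        mPredictiveObjects.push_back(inObject);
    }

    // Called by the game logic when it wants to throw away the
    // predicted state and return to the real state.
    void discardAllPredictiveObjects()
    {
        for(std::size_t i = 0; i < mPredictiveObjects.size(); i++)
            mPredictiveObjects[i]->discardPredictiveVersion();
        mPredictiveObjects.clear();
    }

private:
    std::vector<COWObjectBase*> mPredictiveObjects;
};

// CentralAuthority() in the sketch above would then just return a
// pointer to a single shared PredictionAuthority instance:
PredictionAuthority* CentralAuthority()
{
    static PredictionAuthority sAuthority;
    return &sAuthority;
}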
OPTION 3. Arrange game-state structures explicitly in memory and bulk-copy

In this scheme, all the game-state data is grouped together in a fairly tightly-packed, well-defined region of memory, and all access to game-state objects is indexed off a base address. On EnterPredictiveMode(), the whole region is copied in bulk (with memcpy, some kind of hardware blitter, operating-system virtual-memory mark-page-as-COW support, etc.) to another chunk of memory, and the base address for game-state-object indexing is changed to point at the new block. Bulk copying *should* be (significantly?) quicker than the too-slow copy-the-whole-game-state approach I took before, which effectively produced a saved game (without packing the fields, but essentially using the same code paths nonetheless).

Since the state change won't happen in the middle of a game-state update, there's no chance for inconsistency. All the overhead is taken up in the bulk copy; actual use of the objects is essentially identical to the current scheme, so there's no additional overhead in reading/writing fields or asking for objects. On ExitPredictiveMode(), of course, the copy is simply deallocated and the base pointer is set back to the "real_mode" chunk of memory. Cheap cheap, fun fun.

Note that, in general, I think netgames would probably change mode upwards (from real to predictive) once per rendered frame, whereas single-player games and films would not need to. OTOH basically the same mechanism would probably be used for between-frame interpolation (smoothing out the animation to enable frame rates higher than 30fps), which would need an upwards mode change once per rendered frame (potentially in addition to the predictive mode change in a netgame) in all circumstances (unless the machine is not even keeping up with 30fps, in which case it's nearly pointless to interpolate between the 30 ticks per second we're already computing).

I think this summarizes the characteristics of these alternatives:

Approach | Runtime overhead (time) | Runtime overhead (memory) | Changes required to existing code | Chances for (insidious) bugs
(1)      | Low                     | Low                       | Low                               | High
(2)      | Medium                  | Low                       | High                              | Low
(3)      | High (potentially)      | Medium                    | Low                               | Low

Thoughts? Maybe I'll try to estimate the game-state size, and measure how long it would take to bulk-copy it on my machines, to get *some* idea of the feasibility.

Woody
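For what it's worth, here's a rough sketch of what the mode switch in OPTION 3 might look like, assuming all game-state structures really do live in one contiguous block and all cross-references are indices/offsets from the base pointer rather than raw pointers. The names (kGameStateSize, real_state_block, game_state_base, EnterPredictiveMode, ExitPredictiveMode) and the fixed block size are assumptions of mine, not existing code.

#include <cstring>
#include <cstdlib>

// Made-up figure; the real size would come from measuring the actual
// game-state structures.
static const std::size_t kGameStateSize = 512 * 1024;

static char* real_state_block       = NULL;  // allocated at startup
static char* predictive_state_block = NULL;  // allocated on demand
static char* game_state_base        = NULL;  // what all object lookups index from

void EnterPredictiveMode()
{
    if(predictive_state_block == NULL)
        predictive_state_block = (char*) std::malloc(kGameStateSize);

    // The whole game state is duplicated in one bulk copy...
    std::memcpy(predictive_state_block, real_state_block, kGameStateSize);

    // ...and all subsequent object lookups are redirected to the copy.
    game_state_base = predictive_state_block;
}

void ExitPredictiveMode()
{
    // Discard the predictions: lookups go back to the real state and
    // the copy is released.
    game_state_base = real_state_block;
    std::free(predictive_state_block);
    predictive_state_block = NULL;
}

(In practice, if the switch really happens once per rendered frame in a netgame, it might make more sense to keep the predictive block allocated and reuse it each frame instead of freeing and reallocating it; that's a detail, though.)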