From: Greg H. <humper@Graphics.Stanford.EDU> - 2001-12-14 19:17:17
> I think the solution will involve the tilesort SPU scanning the
> display list instructions for state changes and applying them
> locally after the list executes on the servers so that the GL
> state stays consistent.

This is (IMHO) clearly the best solution to this problem. My basic
suggestion is that each display list gets packed into TWO places: one
that will be sent to the servers, and a "mirror" on the side that
contains only commands that might affect the state. When the list is
called, the call is broadcast (or sorted using the BBOX hint), and the
state-changing list of commands is decoded through the tilesort API.
Some of the commands could be morphed, such as changing glBitmap()
calls to use NULL bitmaps so only the state side-effects are recorded
(Homer: "Stupid side-effects"). There's a rough sketch of this at the
end of this mail.

This would, I think, solve the most generic display list usage, where
every individual OpenGL command an application is ever going to make is
encapsulated in its own separate display list. I realize this is
pathological, but I think my proposal would support display lists
completely generally.

It's also possible that while packing these display list buffers we
could determine that a list contains nothing but geometry, compute an
object-space bounding box to associate with it, and use that box to
bucket calls to the list properly. More pie-in-the-sky ideas.

It would also be easy to make obvious peephole optimizations over the
state-changing commands, like eliminating self-contained
glPushMatrix()...glPopMatrix() sequences, collapsing sequential calls
to glColor(), etc. (there's a sketch of that below too).

This has been on the back-burner for quite some time; I think it would
be a worthwhile little project for people to investigate.

Speaking of worthwhile projects, I'd really appreciate it if someone
could take a look at exactly what would be involved in extending the
state tracker beyond 32 clients. Currently we use a single 32-bit word
as a bitvector to identify which clients/contexts are dirty in the
state tracker, in order to make context differencing as efficient as
possible. It will be crucial in the future to relax this constraint. We
could either do it with a more general bitvector implementation or with
some other representation entirely (I vote for the former; sketch
below). This is a pretty simple change, but it touches a lot of code.
Will someone volunteer to take a look at that? I'd really like to be
able to do runs on 64-128 nodes in the future.

-Greg
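
P.S. Here is a rough sketch, in C, of the two-buffer packing idea. None
of these names exist in the tree; packBitmap(), CRPackBuffer and
TilesortList are stand-ins just to show the shape of it:

    #include <stddef.h>
    #include <GL/gl.h>

    /* opaque stand-in for a real pack buffer */
    typedef struct CRPackBuffer CRPackBuffer;

    typedef struct {
        CRPackBuffer *geometry;  /* full list, broadcast/sorted to the servers      */
        CRPackBuffer *state;     /* state-affecting commands only, replayed locally */
    } TilesortList;

    /* assumed helper: packs one glBitmap() call into the given buffer */
    extern void packBitmap( CRPackBuffer *buf, GLsizei w, GLsizei h,
                            GLfloat xorig, GLfloat yorig,
                            GLfloat xmove, GLfloat ymove,
                            const GLubyte *bitmap );

    void tilesortListBitmap( TilesortList *list, GLsizei w, GLsizei h,
                             GLfloat xorig, GLfloat yorig,
                             GLfloat xmove, GLfloat ymove,
                             const GLubyte *bitmap )
    {
        /* the servers get the real call, pixels and all */
        packBitmap( list->geometry, w, h, xorig, yorig, xmove, ymove, bitmap );

        /* the mirror only needs the raster-position side effect, so the
           bitmap data itself is morphed away */
        packBitmap( list->state, w, h, xorig, yorig, xmove, ymove, NULL );
    }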
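
The peephole pass over the state mirror could start out as dumb as
this; the node layout and opcodes are made up, and it only catches the
degenerate back-to-back cases:

    #include <stdlib.h>

    typedef struct OpNode {
        int opcode;              /* OP_PUSHMATRIX, OP_POPMATRIX, OP_COLOR, ... */
        struct OpNode *next;
    } OpNode;

    enum { OP_PUSHMATRIX, OP_POPMATRIX, OP_COLOR, OP_OTHER };

    /* Drop empty glPushMatrix()/glPopMatrix() pairs and keep only the
       last of a run of consecutive glColor() calls. */
    OpNode *peephole( OpNode *head )
    {
        OpNode **p = &head;
        while (*p) {
            OpNode *cur = *p;
            OpNode *next = cur->next;
            if (cur->opcode == OP_PUSHMATRIX && next &&
                next->opcode == OP_POPMATRIX) {
                *p = next->next;        /* unlink the useless pair ...      */
                free( cur );
                free( next );
                continue;               /* ... and re-examine this position */
            }
            if (cur->opcode == OP_COLOR && next &&
                next->opcode == OP_COLOR) {
                *p = next;              /* the earlier color is dead        */
                free( cur );
                continue;
            }
            p = &cur->next;
        }
        return head;
    }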
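
And the more general dirty bitvector for the state tracker really is
just an array of the words we already use; something along these lines
(sizes and names are placeholders, not what we'd necessarily ship):

    #define CR_MAX_CONTEXTS   128
    #define CR_BITS_PER_WORD  32
    #define CR_DIRTY_WORDS    ((CR_MAX_CONTEXTS + CR_BITS_PER_WORD - 1) / CR_BITS_PER_WORD)

    typedef unsigned int CRbitvalue;
    typedef CRbitvalue   CRDirtyVector[CR_DIRTY_WORDS];

    static void crDirtySet( CRDirtyVector v, int ctx )
    {
        v[ctx / CR_BITS_PER_WORD] |= (CRbitvalue)1 << (ctx % CR_BITS_PER_WORD);
    }

    static void crDirtyClear( CRDirtyVector v, int ctx )
    {
        v[ctx / CR_BITS_PER_WORD] &= ~((CRbitvalue)1 << (ctx % CR_BITS_PER_WORD));
    }

    static int crDirtyTest( const CRDirtyVector v, int ctx )
    {
        return (int)((v[ctx / CR_BITS_PER_WORD] >> (ctx % CR_BITS_PER_WORD)) & 1);
    }

Everywhere we currently test the single dirty word against (1 << ctx)
would become a crDirtyTest()-style call, which is why it touches a lot
of code even though each individual change is mechanical.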