From: Greg H. <humper@Graphics.Stanford.EDU> - 2001-12-14 19:17:17
> I think the solution will involve the tilesort SPU scanning the
> display list instructions for state changes and applying them
> locally after the list executes on the servers so that the GL
> state stays consistent.

This is (IMHO) clearly the best solution to this problem. My basic
suggestion is that each display list gets packed into TWO places: one
that will be sent to the servers, and a "mirror" on the side that
contains only commands that might affect the state. When the list is
called, the call is broadcast (or sorted using the BBOX hint), and the
state-changing list of commands is decoded through the tilesort API.
Some of the commands could be morphed, such as changing glBitmap()
calls to use NULL bitmaps so only the state side-effects are recorded
(Homer: "Stupid side-effects"). There's a rough sketch of this at the
end of this mail.

This would, I think, solve the most generic display list usage, where
every individual OpenGL command an application is ever going to make is
encapsulated in its own separate display list. I realize this is
pathological, but I think my proposal would support display lists
completely generally.

It's also possible that while packing these display list buffers we
could determine that a list contains nothing but geometry, compute an
object-space bounding box to associate with it, and use that box to
bucket calls to the list properly. More pie-in-the-sky ideas.

It would also be easy to make obvious peephole optimizations over the
state-changing commands, like eliminating self-contained
glPushMatrix()...glPopMatrix() sequences, collapsing sequential calls
to glColor(), etc. (there's a sketch of that below too).

This has been on the back-burner for quite some time; I think it would
be a worthwhile little project for people to investigate.

Speaking of worthwhile projects, I'd really appreciate it if someone
could take a look at exactly what would be involved in extending the
state tracker beyond 32 clients. Currently we use a single 32-bit word
as a bitvector to identify which clients/contexts are dirty in the
state tracker, in order to make context differencing as efficient as
possible. It will be crucial in the future to relax this constraint. We
could either do it with a more general bitvector implementation or with
some other representation entirely (I vote for the former; sketch
below). This is a pretty simple change, but it touches a lot of code.
Will someone volunteer to take a look at that? I'd really like to be
able to do runs on 64-128 nodes in the future.

-Greg
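
P.S. Here is a rough sketch, in C, of the two-buffer packing idea. None
of these names exist in the tree; packBitmap(), CRPackBuffer and
TilesortList are stand-ins just to show the shape of it:

    #include <stddef.h>
    #include <GL/gl.h>

    /* opaque stand-in for a real pack buffer */
    typedef struct CRPackBuffer CRPackBuffer;

    typedef struct {
        CRPackBuffer *geometry;  /* full list, broadcast/sorted to the servers      */
        CRPackBuffer *state;     /* state-affecting commands only, replayed locally */
    } TilesortList;

    /* assumed helper: packs one glBitmap() call into the given buffer */
    extern void packBitmap( CRPackBuffer *buf, GLsizei w, GLsizei h,
                            GLfloat xorig, GLfloat yorig,
                            GLfloat xmove, GLfloat ymove,
                            const GLubyte *bitmap );

    void tilesortListBitmap( TilesortList *list, GLsizei w, GLsizei h,
                             GLfloat xorig, GLfloat yorig,
                             GLfloat xmove, GLfloat ymove,
                             const GLubyte *bitmap )
    {
        /* the servers get the real call, pixels and all */
        packBitmap( list->geometry, w, h, xorig, yorig, xmove, ymove, bitmap );

        /* the mirror only needs the raster-position side effect, so the
           bitmap data itself is morphed away */
        packBitmap( list->state, w, h, xorig, yorig, xmove, ymove, NULL );
    }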
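
The peephole pass over the state mirror could start out as dumb as
this; the node layout and opcodes are made up, and it only catches the
degenerate back-to-back cases:

    #include <stdlib.h>

    typedef struct OpNode {
        int opcode;              /* OP_PUSHMATRIX, OP_POPMATRIX, OP_COLOR, ... */
        struct OpNode *next;
    } OpNode;

    enum { OP_PUSHMATRIX, OP_POPMATRIX, OP_COLOR, OP_OTHER };

    /* Drop empty glPushMatrix()/glPopMatrix() pairs and keep only the
       last of a run of consecutive glColor() calls. */
    OpNode *peephole( OpNode *head )
    {
        OpNode **p = &head;
        while (*p) {
            OpNode *cur = *p;
            OpNode *next = cur->next;
            if (cur->opcode == OP_PUSHMATRIX && next &&
                next->opcode == OP_POPMATRIX) {
                *p = next->next;        /* unlink the useless pair ...      */
                free( cur );
                free( next );
                continue;               /* ... and re-examine this position */
            }
            if (cur->opcode == OP_COLOR && next &&
                next->opcode == OP_COLOR) {
                *p = next;              /* the earlier color is dead        */
                free( cur );
                continue;
            }
            p = &cur->next;
        }
        return head;
    }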
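
And the more general dirty bitvector for the state tracker really is
just an array of the words we already use; something along these lines
(sizes and names are placeholders, not what we'd necessarily ship):

    #define CR_MAX_CONTEXTS   128
    #define CR_BITS_PER_WORD  32
    #define CR_DIRTY_WORDS    ((CR_MAX_CONTEXTS + CR_BITS_PER_WORD - 1) / CR_BITS_PER_WORD)

    typedef unsigned int CRbitvalue;
    typedef CRbitvalue   CRDirtyVector[CR_DIRTY_WORDS];

    static void crDirtySet( CRDirtyVector v, int ctx )
    {
        v[ctx / CR_BITS_PER_WORD] |= (CRbitvalue)1 << (ctx % CR_BITS_PER_WORD);
    }

    static void crDirtyClear( CRDirtyVector v, int ctx )
    {
        v[ctx / CR_BITS_PER_WORD] &= ~((CRbitvalue)1 << (ctx % CR_BITS_PER_WORD));
    }

    static int crDirtyTest( const CRDirtyVector v, int ctx )
    {
        return (int)((v[ctx / CR_BITS_PER_WORD] >> (ctx % CR_BITS_PER_WORD)) & 1);
    }

Everywhere we currently test the single dirty word against (1 << ctx)
would become a crDirtyTest()-style call, which is why it touches a lot
of code even though each individual change is mechanical.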