From: James S. <arr...@gm...> - 2006-08-07 11:17:05
|
Tilesort generates unneeded network traffic?

I have four Render SPUs and one Tilesort SPU. I have noticed that some applications run slowly when they are rendering only to the local Render SPU (i.e. the application, Tilesort SPU, and Render SPU are all on one host) while at the same time using little CPU time. I replaced the three Render SPUs on remote hosts with NOP SPUs on the localhost and found that CPU consumption went up to 100% and applications ran faster. Suspecting that something about the remote Render SPUs might be slowing down local rendering, I inserted a Print SPU on one of the remote nodes (note that the application's window is only on the local X server; none of the remote X servers/Render SPUs should be used).

I ran the atlantis test program and got this output:

LoadIdentity( )
MatrixMode( GL_PROJECTION )
LoadMatrixf( [ 1.37 0.00 0.00 0.00
    0.00 2.75 0.00 0.00
    0.00 0.00 -1.05 -1.00
    0.00 0.00 -20512.82 0.00 ] )
MatrixMode( GL_MODELVIEW )
MatrixMode( GL_MODELVIEW )
PushMatrix( )
LoadIdentity( )
PopMatrix( )
MatrixMode( GL_MODELVIEW )
Clear( 0x4100 )
LoadIdentity( )
SwapBuffers( 1, 0 )
MakeCurrent( 1, 0, 1 )
LoadIdentity( )
MatrixMode( GL_PROJECTION )
LoadMatrixf( [ 1.37 0.00 0.00 0.00
    0.00 2.75 0.00 0.00
    0.00 0.00 -1.05 -1.00
    0.00 0.00 -20512.82 0.00 ] )
MatrixMode( GL_MODELVIEW )
MatrixMode( GL_MODELVIEW )
PushMatrix( )
LoadIdentity( )
PopMatrix( )
MatrixMode( GL_MODELVIEW )
Clear( 0x4100 )
LoadIdentity( )
SwapBuffers( 1, 0 )
MakeCurrent( 1, 0, 1 )
LoadIdentity( )
MatrixMode( GL_PROJECTION )
LoadMatrixf( [ 1.37 0.00 0.00 0.00
    0.00 2.75 0.00 0.00
    0.00 0.00 -1.05 -1.00
    0.00 0.00 -20512.82 0.00 ] )
MatrixMode( GL_MODELVIEW )
MatrixMode( GL_MODELVIEW )
PushMatrix( )
LoadIdentity( )
PopMatrix( )
MatrixMode( GL_MODELVIEW )
Clear( 0x4100 )
LoadIdentity( )
SwapBuffers( 1, 0 )
MakeCurrent( 1, 0, 1 )
LoadIdentity( )
MatrixMode( GL_PROJECTION )
LoadMatrixf( [ 1.37 0.00 0.00 0.00
    0.00 2.75 0.00 0.00
    0.00 0.00 -1.05 -1.00
    0.00 0.00 -20512.82 0.00 ] )
MatrixMode( GL_MODELVIEW )
MatrixMode( GL_MODELVIEW )
PushMatrix( )
LoadIdentity( )
PopMatrix( )

The Tilesort SPU is clearly sending data to the remote Render SPUs. Why? I am using the Test All Tiles bucket mode, along with auto_dlist_bbox, lazy_send_dlists, and dlist_state_tracking. Why is this data being sent? How can I eliminate it?

Thank you for your time,
James Steven Supancic III |
From: Brian P. <bri...@tu...> - 2006-08-07 14:36:33
|
James Supancic wrote:
> Tilesort generates unneeded network traffic?
>
> I have four Render SPUs and one Tilesort SPU. I have noticed that some
> applications run slowly when they are rendering only to the local Render
> SPU (i.e. application, tilesort and render SPU are all on one host) while
> at the same time using little CPU time. I replaced the 3 Render SPUs on
> remote hosts with NOP SPUs on the localhost and found that CPU
> consumption went up to 100% and applications ran faster. Suspecting that
> something about the remote Render SPUs might be slowing down local
> rendering I inserted a Print SPU on one of the remote nodes (note that
> the application's window is only on the local X server, none of the
> remote X server/Render SPUs should be used).
>
> I ran the atlantis test program and got this output:
> LoadIdentity( )
> MatrixMode( GL_PROJECTION )
> LoadMatrixf( [ 1.37 0.00 0.00 0.00
> 0.00 2.75 0.00 0.00
> 0.00 0.00 -1.05 -1.00
> 0.00 0.00 -20512.82 0.00 ] )
> MatrixMode( GL_MODELVIEW )
> MatrixMode( GL_MODELVIEW )
> PushMatrix( )
> LoadIdentity( )
> PopMatrix( )
> MatrixMode( GL_MODELVIEW )
> Clear( 0x4100 )
> LoadIdentity( )
> SwapBuffers( 1, 0 )
> MakeCurrent( 1, 0, 1 )
> LoadIdentity( )
> MatrixMode( GL_PROJECTION )
> LoadMatrixf( [ 1.37 0.00 0.00 0.00
> 0.00 2.75 0.00 0.00
> 0.00 0.00 -1.05 -1.00
> 0.00 0.00 -20512.82 0.00 ] )
> MatrixMode( GL_MODELVIEW )
> MatrixMode( GL_MODELVIEW )
> PushMatrix( )
> LoadIdentity( )
> PopMatrix( )
> MatrixMode( GL_MODELVIEW )
> Clear( 0x4100 )
> LoadIdentity( )
> SwapBuffers( 1, 0 )
> MakeCurrent( 1, 0, 1 )
> LoadIdentity( )
> MatrixMode( GL_PROJECTION )
> LoadMatrixf( [ 1.37 0.00 0.00 0.00
> 0.00 2.75 0.00 0.00
> 0.00 0.00 -1.05 -1.00
> 0.00 0.00 -20512.82 0.00 ] )
> MatrixMode( GL_MODELVIEW )
> MatrixMode( GL_MODELVIEW )
> PushMatrix( )
> LoadIdentity( )
> PopMatrix( )
> MatrixMode( GL_MODELVIEW )
> Clear( 0x4100 )
> LoadIdentity( )
> SwapBuffers( 1, 0 )
> MakeCurrent( 1, 0, 1 )
> LoadIdentity( )
> MatrixMode( GL_PROJECTION )
> LoadMatrixf( [ 1.37 0.00 0.00 0.00
> 0.00 2.75 0.00 0.00
> 0.00 0.00 -1.05 -1.00
> 0.00 0.00 -20512.82 0.00 ] )
> MatrixMode( GL_MODELVIEW )
> MatrixMode( GL_MODELVIEW )
> PushMatrix( )
> LoadIdentity( )
> PopMatrix( )
>
> The Tilesort SPU is clearly sending data to the remote Render SPUs. Why?
> I am using the Test All Tiles bucket mode, I am using auto_dlist_bbox,
> lazy_send_dlists, and dlist_state_tracking. Why is this data being sent?
> How can I eliminate it?

I think the state tracker's matrix-related code needs to be reviewed. I've noticed this too, but it doesn't seem to affect all applications.

The atlantis demo does call glMatrixMode() and sets the modelview and projection matrices on every frame, but the state tracker should filter out those redundant calls.

I won't have time to look into this anytime soon, unfortunately.

-Brian |
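To make the idea of filtering redundant calls concrete, here is a minimal C sketch of a per-server "shadow" of the last matrix mode and matrix that were sent. It is not Chromium's state tracker: `ServerShadow`, `shadow_matrix_mode`, and `shadow_load_matrix` are hypothetical names invented for this illustration, and the real code tracks far more state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical shadow of what one server is known to hold. */
typedef struct {
    bool  have_mode;     /* false until a matrix mode has been sent */
    int   mode;          /* last matrix mode sent to the server */
    bool  have_matrix;   /* false until a matrix has been sent */
    float matrix[16];    /* last matrix sent to the server */
    int   commands_sent; /* commands that actually went on the wire */
} ServerShadow;

/* Returns true if the mode change had to be sent, false if it was
 * redundant and could be dropped. */
bool shadow_matrix_mode(ServerShadow *s, int mode)
{
    assert(s != NULL);
    if (s->have_mode && s->mode == mode)
        return false;
    s->have_mode = true;
    s->mode = mode;
    s->commands_sent++;
    return true;
}

/* Same idea for a matrix load. The comparison is bitwise, so 0.0 and
 * -0.0 count as different; that only costs an extra send. */
bool shadow_load_matrix(ServerShadow *s, const float m[16])
{
    assert(s != NULL && m != NULL);
    if (s->have_matrix && memcmp(s->matrix, m, sizeof s->matrix) == 0)
        return false;
    memcpy(s->matrix, m, sizeof s->matrix);
    s->have_matrix = true;
    s->commands_sent++;
    return true;
}
```

With a filter of this kind, an application that sets the same matrix mode and the same matrix on every frame would cause traffic only on the first frame.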
From: James S. <arr...@gm...> - 2006-08-08 12:13:47
|
I have been working on this problem from the bottom up. I have noticed that all the data being sent to the remote hosts is coming from the call to tilesortspuShipBuffers in tilesortspu_SwapBuffers. I put
    tilesortspuShipBuffers();
    return;
at the beginning of tilesortspu_SwapBuffers and I still get the same data spammed to all Render SPUs.
    SwapBuffers( 1, 0 )
is one line repeatedly printed by the Print SPU I inserted between the Tilesort and Render SPUs.

Now I know that tilesortspu_SwapBuffers isn't inserting the SwapBuffers( 1, 0 ) command. What else could be inserting the tilesortspu_SwapBuffers command into the CRPackBuffer?

What is the best method to determine which function(s) is/are inserting a command into a CRPackBuffer?

Thank you for your time,
James Steven Supancic III |
From: Brian P. <bri...@tu...> - 2006-08-08 14:32:07
|
James Supancic wrote:
> I have been working on this problem from the bottom up. I have noticed
> that all the data being sent to the remote hosts is coming from the
> call to tilesortspuShipBuffers in tilesortspu_SwapBuffers. I put
> tilesortspuShipBuffers();
> return;
> at the beginning of tilesortspu_SwapBuffers and I still get the same data
> spammed to all Render SPUs.
> SwapBuffers( 1, 0 )
> is one line repeatedly printed by the Print SPU I inserted between the
> tilesort and Render SPUs.

Sure, because SwapBuffers is called after each frame is rendered.

> Now I know that tilesortspu_SwapBuffers isn't inserting the SwapBuffers(
> 1, 0 ) command.

Why do you say that? Line 87 of tilesortspu_swap.c is a call to crPackSwapBuffers().

> What else could be inserting the tilesortspu_SwapBuffers command into
> the CRPackBuffer?
>
> What is the best method to determine which function(s) is/are inserting
> a command into a CRPackBuffer?

Does gdb take wildcards for breakpoints? If so, you could use 'break crPack*'.

-Brian |
From: James S. <arr...@gm...> - 2006-08-08 14:45:16
|
>Why do you say that? Line 87 of tilesortspu_swap.c is a call to
>crPackSwapBuffers().

Oops, I was misreading my debugger output.

I am also a bit confused about this. tilesortspu_SwapBuffers calls this function:

void PACK_APIENTRY crPackSwapBuffers( GLint window, GLint flags )
{
    GET_PACKER_CONTEXT(pc);
    unsigned char *data_ptr;
    (void) pc;
    GET_BUFFERED_POINTER( pc, 16 );
    WRITE_DATA( 0, GLint, 16 );
    WRITE_DATA( 4, GLenum, CR_SWAPBUFFERS_EXTEND_OPCODE );
    WRITE_DATA( 8, GLint, window );
    WRITE_DATA( 12, GLint, flags );
    WRITE_OPCODE( pc, CR_EXTEND_OPCODE );
}

to put the SwapBuffers command into the buffer, and it eventually invokes tilesortspuSendServerBufferThread once per server to send the buffers?

In the tilesortspuSendServerBufferThread function it looks as if each server has its own buffer, but crPackSwapBuffers looks like it is just writing to one buffer? Am I missing something? Does crPackSwapBuffers get called more than once? Is it somehow packing multiple buffers? Is there a global buffer that is somehow combined with server-specific buffers at the time of send?

Thank you for your time,
James Steven Supancic III |
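The record that crPackSwapBuffers writes can be read off the WRITE_DATA offsets above: a 16-byte payload holding the length, the extended opcode, the window, and the flags. The following C sketch packs the same layout into an explicit buffer. It is only an illustration under stated assumptions: `SketchBuffer`, `sketch_pack_swap`, and the opcode value are hypothetical, the buffer is passed as an argument (unlike the real packer, which goes through the current packer context), and the final WRITE_OPCODE step is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Placeholder value for illustration only; not Chromium's opcode. */
#define SKETCH_SWAPBUFFERS_OPCODE 0x1234u

/* Hypothetical fixed-size pack buffer. */
typedef struct {
    unsigned char data[256];
    size_t        used;
} SketchBuffer;

/* Appends one 16-byte SwapBuffers record laid out as in the quoted
 * function: length at 0, opcode at 4, window at 8, flags at 12.
 * Returns 0 on success, -1 if the buffer has no room left. */
int sketch_pack_swap(SketchBuffer *buf, int32_t window, int32_t flags)
{
    const int32_t  length = 16;
    const uint32_t opcode = SKETCH_SWAPBUFFERS_OPCODE;
    unsigned char *p;

    assert(buf != NULL);
    if (sizeof buf->data - buf->used < 16)
        return -1;

    p = buf->data + buf->used;
    memcpy(p + 0,  &length, sizeof length);
    memcpy(p + 4,  &opcode, sizeof opcode);
    memcpy(p + 8,  &window, sizeof window);
    memcpy(p + 12, &flags,  sizeof flags);
    buf->used += 16;
    return 0;
}
```

Note that the function writes into exactly one buffer; nothing in it decides which servers later receive that buffer.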
From: Brian P. <bri...@tu...> - 2006-08-08 14:55:24
|
James Supancic wrote:
> >Why do you say that? Line 87 of tilesortspu_swap.c is a call to
> >crPackSwapBuffers().
>
> Oops, I was misreading my debugger output.
>
> I am also a bit confused about this. tilesortspu_SwapBuffers calls this
> function:
> void PACK_APIENTRY crPackSwapBuffers( GLint window, GLint flags )
> {
>     GET_PACKER_CONTEXT(pc);
>     unsigned char *data_ptr;
>     (void) pc;
>     GET_BUFFERED_POINTER( pc, 16 );
>     WRITE_DATA( 0, GLint, 16 );
>     WRITE_DATA( 4, GLenum, CR_SWAPBUFFERS_EXTEND_OPCODE );
>     WRITE_DATA( 8, GLint, window );
>     WRITE_DATA( 12, GLint, flags );
>     WRITE_OPCODE( pc, CR_EXTEND_OPCODE );
> }
>
> to put the SwapBuffers command into the buffer, and it eventually
> invokes tilesortspuSendServerBufferThread once per server to send the
> buffers?

Right.

> In the tilesortspuSendServerBufferThread function it looks as if each
> server has its own buffer, but crPackSwapBuffers looks like it is just
> writing to one buffer? Am I missing something? Does crPackSwapBuffers
> get called more than once? Is it somehow packing multiple buffers? Is
> there a global buffer that is somehow combined with server specific
> buffers at time of send?

In the case of the loop over tilesortspuSendServerBuffer(), we're sending the contents of the same buffer to all servers.

There are a number of different packing buffers in the Tilesort SPU. Some hold geometry commands and are sent to one or more servers. Some buffers contain state-change commands and are sent to individual servers.

The code is pretty complicated since it's evolved a lot over the years. It would be nice to overhaul it someday.

-Brian |
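The two kinds of buffers described above (one shared buffer that goes to every server, plus per-server buffers) can be sketched in C as follows. This is a simplified model, not the Tilesort SPU's code: all names are hypothetical, a server's "inbox" stands in for the network connection, and sending each server's own state buffer before the shared buffer is an assumption made for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_BUF_SIZE 64

/* Hypothetical pack buffer. */
typedef struct {
    unsigned char bytes[SKETCH_BUF_SIZE];
    size_t        len;
} SketchBuf;

/* Hypothetical server; the inbox stands in for a network connection. */
typedef struct {
    SketchBuf inbox;
} SketchServer;

/* Appends src to the server's inbox. Returns 0, or -1 if it won't fit. */
int sketch_send(SketchServer *srv, const SketchBuf *src)
{
    assert(srv != NULL && src != NULL);
    if (src->len > SKETCH_BUF_SIZE - srv->inbox.len)
        return -1;
    memcpy(srv->inbox.bytes + srv->inbox.len, src->bytes, src->len);
    srv->inbox.len += src->len;
    return 0;
}

/* Each server gets its own state buffer (assumed to go first), then
 * the contents of the one shared buffer. The shared buffer is packed
 * once and only read here, so every server sees identical bytes. */
int sketch_ship(SketchServer *servers, const SketchBuf *per_server_state,
                size_t count, const SketchBuf *shared)
{
    size_t i;
    for (i = 0; i < count; i++) {
        if (sketch_send(&servers[i], &per_server_state[i]) != 0)
            return -1;
        if (sketch_send(&servers[i], shared) != 0)
            return -1;
    }
    return 0;
}
```

In this model a command such as SwapBuffers would be packed once into `shared`, which is why a single pack call can still reach every server.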
From: James S. <arr...@gm...> - 2006-08-09 11:14:13
|
I am trying to figure out how this all fits together. Right now I know so little about the interactions between the various components that I can't begin to figure out how to fix things.

I have been looking at the packer source code, and I can't figure out how the buffer to write to is set. I see macros that write data, and I see a macro to grab the packer context. From the packer context the functions are able to get a pointer to a buffer. Normally write functions take some kind of stream identifier, but Chromium appears to use a global variable rather than a function argument. This makes tracking down exactly how the buffer to write to gets set a bit tricky. Do you have any idea how I can figure out where it is getting set?

I am not sure how much of the problem I am experiencing has to do with the state tracker. I don't think we should have to send anything at all to a Render SPU if the window being rendered to is not on that SPU's server. I can't be sure, as I am very confused about how the buffer to be written gets selected, but it looks as if the SwapBuffers command is being sent to all Render SPUs unconditionally in tilesortspu_SwapBuffers rather than to just the Render SPUs that have the Render window on them? Is there a technical reason for this, or has no one got around to optimizing it yet?

How does the State Tracker interact with the rest of Chromium? To me it appears as if it is used to track the difference between the OpenGL state on the Tilesort SPU and the OpenGL state on the Render SPU. Does it do anything else? To me it appears that the State Tracker is simply something that other code uses to keep track of the state differences; it looks as if something else is actually making the choice to not send something to some Render SPUs and then using the State Tracker to keep track of the results of this choice? If this is correct, what code is actually making this choice?

Thank you for your time,
James Steven Supancic III |
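The pattern described in this message (pack functions that take no buffer argument and instead write through a current context) can be sketched in C like this. Every name here is hypothetical and the sketch uses a plain file-scope variable; it makes no claim about the signatures of Chromium's packer functions, and a thread-safe build would need per-thread storage instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pack buffer. */
typedef struct {
    unsigned char data[32];
    size_t        used;
} SketchBuffer;

/* Hypothetical packer context: it only remembers the current buffer. */
typedef struct {
    SketchBuffer *current;
} SketchPackContext;

/* The "global variable" the pack functions consult. */
static SketchPackContext sketch_context;

/* Selects the buffer that later pack calls will write into. */
void sketch_set_buffer(SketchBuffer *buf)
{
    sketch_context.current = buf;
}

/* Packs one int. Note there is no buffer argument: the destination
 * is whatever sketch_set_buffer() selected last.
 * Returns 0 on success, -1 if no buffer is set or it is full. */
int sketch_pack_int(int value)
{
    SketchBuffer *b = sketch_context.current;
    if (b == NULL || sizeof b->data - b->used < sizeof value)
        return -1;
    memcpy(b->data + b->used, &value, sizeof value);
    b->used += sizeof value;
    return 0;
}
```

In a design like this, finding out which buffer a pack call writes to means finding the most recent call to the set-buffer function, for example with a debugger breakpoint on it.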
From: Brian P. <bri...@tu...> - 2006-08-09 15:24:44
|
James Supancic wrote:
> I am trying to figure out how this all fits together. Right now I know
> so little about the interactions between the various components that I
> can't begin to figure out how to fix things.
>
> I have been looking at the packer source code, I can't figure out how
> the buffer to write to is set?

crPackSetBuffer.

> I see macros that write data, and I see
> a macro to grab the packer context. From the packer context the
> functions are able to get a pointer to a buffer. Normally write functions
> take some kind of stream identifier, but Chromium appears to use a
> global variable rather than a function argument. This makes tracking
> down exactly how the buffer to write to gets set a bit tricky. Do you
> have any idea how I can figure out where it is getting set?

You're right that a global var for the current packer context is used. When built thread safe, it's a per-thread pointer.

> I am not sure how much of the problem I am experiencing has to do with
> the state tracker. I don't think we should have to send anything at
> all to a Render SPU if the window being rendered to is not on that
> SPU's server.

You don't have to send anything if the primitive you're rendering doesn't intersect the Render SPU's window (in terms of projected window coordinates).

> I can't be sure, as I am very confused about how the
> buffer to be written gets selected, but it looks as if the SwapBuffers
> command is being sent to all Render SPUs unconditionally in
> tilesortspu_SwapBuffers rather than to just the Render SPUs that have
> the Render window on them? Is there a technical reason for this, or
> has no one got around to optimizing it yet?

That's intentional. Recall that you can make a whole chain of SPUs. Some SPUs depend on getting SwapBuffer calls. If we omitted some SwapBuffer calls, we'd cause trouble for those other SPUs. The VNC SPU is an example.

> How does the State Tracker interact with the rest of Chromium? To me
> it appears as if it is used to track the difference between the OpenGL
> state on the tilesort SPU and the OpenGL state on the Render SPU. Does
> it do anything else? To me it appears that the State Tracker is simply
> something that other code uses to keep track of the state differences,
> it looks as if something else is actually making the choice to not send
> something to some Render SPUs and then using the State Tracker to keep
> track of the results of this choice? If this is correct, what code is
> actually making this choice?

That's basically correct. The tilesort bucketing code determines which crservers need to receive any given primitive. Before sending the primitive, the state differencer is invoked to update each crserver with whatever state it needs to become up-to-date.

Some good background material is the original WireGL white paper and the Chromium paper from SIGGRAPH '02 (I think). You should definitely read those. I think there are also some Stanford papers detailing the state differencer and packer/unpacker.

-Brian |
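The division of labour described above (bucketing picks the servers, then state is brought up to date only for those servers) can be sketched in C as follows. This is a toy model under stated assumptions: tiles and primitives are reduced to axis-aligned rectangles in window coordinates, OpenGL state is reduced to a single integer, and every name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Half-open rectangle in window coordinates: [x0,x1) x [y0,y1). */
typedef struct { int x0, y0, x1, y1; } SketchRect;

/* Hypothetical server record. */
typedef struct {
    SketchRect tile;     /* region of the window this server draws */
    int  state;          /* state value the server currently holds */
    int  state_updates;  /* state commands sent to this server */
    int  primitives;     /* primitives sent to this server */
} SketchServer;

bool sketch_overlaps(SketchRect a, SketchRect b)
{
    return a.x0 < b.x1 && b.x0 < a.x1 && a.y0 < b.y1 && b.y0 < a.y1;
}

/* Sends one primitive, given its projected bounding box and the state
 * the application currently has. Returns how many servers got it. */
size_t sketch_draw(SketchServer *servers, size_t count,
                   SketchRect bbox, int app_state)
{
    size_t i, hit = 0;
    assert(servers != NULL);
    for (i = 0; i < count; i++) {
        /* Bucketing: skip servers whose tile the primitive misses.
         * Their state is deliberately left stale. */
        if (!sketch_overlaps(servers[i].tile, bbox))
            continue;
        /* State differencing: send state only if it is out of date. */
        if (servers[i].state != app_state) {
            servers[i].state = app_state;
            servers[i].state_updates++;
        }
        servers[i].primitives++;
        hit++;
    }
    return hit;
}
```

A server that is never hit by a primitive receives no state updates in this model; it catches up lazily the first time a primitive lands on its tile.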
From: James S. <arr...@gm...> - 2006-08-09 17:02:55
|
> That's intentional. Recall that you can make a whole chain of SPUs.
> Some SPUs depend on getting SwapBuffer calls. If we omitted some
> SwapBuffer calls, we'd cause trouble for those other SPUs. The VNC
> SPU is an example.

I don't understand: if nothing is to be rendered to a tile, why do the SPUs for that tile need to get a SwapBuffer call? Why do they need to receive SwapBuffer calls for frames that they have nothing to do with? For example, if I am running the city demo on a two-display setup, and the city demo's window is on one tile, why do I need to be sending anything to the other tile?

Are you saying that without the SwapBuffer call, other data that is sent to the tile while it is not part of the current frame will "pollute" the first frame that it should render after becoming part of the frame (if a window move or something causes a re-tiling)?

I can see how that is possible, but ideally, I would filter out everything being sent to the SPUs for a tile that is not part of the scene....

Or are other SPUs using it as some kind of periodic "interrupt" mechanism?

> Some good background material is the original WireGL white paper and
> the Chromium paper from SIGGRAPH '02 (I think). You should definitely
> read those. I think there's also some Stanford papers detailing the
> state differencer and packer/unpacker.

I read one Stanford paper on the state tracker. I will take a look at the SIGGRAPH papers on Chromium.

Thank you for your time,
James Steven Supancic III |
From: Brian P. <bri...@tu...> - 2006-08-09 23:41:15
|
James Supancic wrote:
>> That's intentional. Recall that you can make a whole chain of SPUs.
>> Some SPUs depend on getting SwapBuffer calls. If we omitted some
>> SwapBuffer calls, we'd cause trouble for those other SPUs. The VNC
>> SPU is an example.
>
>
> I don't understand, if nothing is to be rendered to a tile, why do
> the SPUs for that tile need to get a SwapBuffer call? Why do they need
> to receive SwapBuffer calls for frames that they have nothing to do
> with? For example, If I am running the city demo on a two display
> setup, and the city demo's window is on one tile, why do I need to be
> sending anything to the other tile?

In the case of the VNC SPU, we need to see all SwapBuffers in order to do frame synchronization.

In another example, if you have a number of NVIDIA Quadro cards and are using the framesync feature, it may cause trouble to do SwapBuffers on some cards, but not others.

> Are you saying that without the SwapBuffer call other data that is
> sent to the tile not part of the current frame will "pollute" the
> first frame that it should render after becoming part of the frame (if
> a window move or something causes a re tiling)?

Not sure I understand that. But if all one does is call glClearColor(); glClear(), you'll need to do the buffer swap.

> I can see how that is possible, but ideally, I would filter out
> everything being sent to the SPUs for a tile that is not part of the
> scene....
>
> Or are other SPUs using it as some kind of periodic "interrupt" mechanism?

Kind of. The FPS SPU uses SwapBuffers to measure Frames/Second.

In any case, sending the SwapBuffers to all crservers is pretty cheap.

-Brian |
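To illustrate why an SPU that measures frames per second depends on seeing every SwapBuffers call, here is a minimal C sketch of a counter driven purely by swap calls. It is not the FPS SPU's code: `SketchFps` and `sketch_fps_on_swap` are hypothetical names, and the caller is assumed to supply the current time in seconds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical frame counter state. */
typedef struct {
    double window_start; /* time the current measurement began, seconds */
    int    frames;       /* swaps seen since window_start */
    double last_fps;     /* most recent completed measurement */
} SketchFps;

/* Call once per SwapBuffers with the current time in seconds.
 * Returns 1 when a new frames-per-second value was stored in
 * last_fps (at least one second has elapsed), otherwise 0. */
int sketch_fps_on_swap(SketchFps *f, double now)
{
    double elapsed;

    assert(f != NULL);
    f->frames++;
    elapsed = now - f->window_start;
    if (elapsed < 1.0)
        return 0;

    f->last_fps = f->frames / elapsed;
    f->frames = 0;
    f->window_start = now;
    return 1;
}
```

If the swaps for some frames were filtered out upstream, a counter like this would simply report a lower frame rate than the application is actually producing.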