From: Gareth H. <ga...@va...> - 2001-03-12 15:07:47
Stephen J Baker wrote:
>
> On 9 Mar 2001, Josh Vanderhoof wrote:
>
> > Of course, you do have to take a start up penalty with run-time
> > compiled code.  Considering that the average system is around 700MHz
> > (just a guess) and getting faster every day, I think people may be
> > overestimating how expensive using a real compiler would be.
>
> The "average" system with a 700MHz CPU also has a kick-ass graphics
> card that makes this discussion largely irrelevant.  If software-only
> rendering has any kind of a future at all, it's in PDA's, phones and
> internet-capable toasters...where the overhead of having a compiler
> on board at all (let alone actually running it on anything) tends to
> be unacceptable.

Which is why I've been focusing on code generation for hardware
drivers, particularly the begin/end functions used in immediate mode
rendering.  With hardware T&L, you basically want the non-glVertex*
functions to write directly to the "current" hardware-format vertex,
with glVertex flushing this to a DMA buffer.  There isn't really much a
compiler can do with this:

    struct foo_vertex_o3n3tc2 {
       GLfloat obj[3];
       GLfloat normal[3];
       GLfloat tc[2];
    };

    void foo_Normal3fv( const GLfloat *v )
    {
       GET_FOO_CONTEXT(fmesa);
       COPY_3V( fmesa->current.o3n3tc2.normal, v );
    }

    void foo_Vertex3fv( const GLfloat *v )
    {
       GET_FOO_CONTEXT(fmesa);
       COPY_3V( fmesa->current.o3n3tc2.obj, v );
       if ( fmesa->dma.space >= 8 ) {
          /* Room for one more 8-dword vertex: copy it out and advance. */
          COPY_DWORDS( fmesa->dma.head, &fmesa->current.o3n3tc2, 8 );
          fmesa->dma.head += 8;
          fmesa->dma.space -= 8;
       } else {
          /* Buffer full: fetch a fresh DMA buffer and emit the vertex there. */
          fmesa->get_dma( fmesa, &fmesa->current.o3n3tc2, 8 );
       }
    }

(The above is based on code by Keith Whitwell)

You can, however, substitute most of that with hard-coded addresses for
the current context and make it as streamlined as possible.  If you
want to call these functions 10, 30, 100 million times a second, you
want them to be *fast*...

-- Gareth
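For anyone who wants to experiment with the pattern, here is a minimal self-contained sketch of it in plain C. It is not the driver code: the `foo_context` layout, the 32-dword buffer, `foo_init`, `foo_get_dma` and the `flush_count` counter are all hypothetical stand-ins, `memcpy` replaces the `COPY_3V`/`COPY_DWORDS` macros, and "flushing" just rewinds an in-memory array instead of handing a buffer to hardware. A single global context stands in, loosely, for the hard-coded context address that generated code could use.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef float GLfloat;              /* stand-in for the real GL typedef */

#define VERTEX_DWORDS 8             /* 3 obj + 3 normal + 2 texcoord */
#define DMA_DWORDS    32            /* hypothetical buffer size: 4 vertices */

/* Hardware-format vertex, as in the message above. */
struct foo_vertex_o3n3tc2 {
    GLfloat obj[3];
    GLfloat normal[3];
    GLfloat tc[2];
};

/* Hypothetical context; a real driver context holds far more state. */
struct foo_context {
    struct foo_vertex_o3n3tc2 current;  /* the "current" vertex */
    uint32_t  dma_buf[DMA_DWORDS];      /* pretend DMA buffer */
    uint32_t *dma_head;                 /* next free dword */
    unsigned  dma_space;                /* free dwords remaining */
    unsigned  flush_count;              /* times the buffer was handed off */
};

/* One global context, so its address is a link-time constant. */
static struct foo_context foo_ctx;

void foo_init(void)
{
    /* The copy in foo_Vertex3fv relies on the struct being exactly 8 dwords. */
    assert(sizeof(struct foo_vertex_o3n3tc2) == VERTEX_DWORDS * sizeof(uint32_t));
    memset(&foo_ctx, 0, sizeof foo_ctx);
    foo_ctx.dma_head  = foo_ctx.dma_buf;
    foo_ctx.dma_space = DMA_DWORDS;
}

/* Slow path: "submit" the full buffer, start over, then emit the vertex. */
static void foo_get_dma(struct foo_context *fmesa, const void *vertex, unsigned dwords)
{
    fmesa->flush_count++;               /* a real driver would kick the hardware here */
    fmesa->dma_head  = fmesa->dma_buf;
    fmesa->dma_space = DMA_DWORDS;
    memcpy(fmesa->dma_head, vertex, dwords * sizeof(uint32_t));
    fmesa->dma_head  += dwords;
    fmesa->dma_space -= dwords;
}

/* Non-vertex attributes only update the current vertex. */
void foo_Normal3fv(const GLfloat *v)
{
    memcpy(foo_ctx.current.normal, v, 3 * sizeof(GLfloat));
}

void foo_TexCoord2fv(const GLfloat *v)
{
    memcpy(foo_ctx.current.tc, v, 2 * sizeof(GLfloat));
}

/* The vertex call completes the current vertex and emits it. */
void foo_Vertex3fv(const GLfloat *v)
{
    struct foo_context *fmesa = &foo_ctx;

    memcpy(fmesa->current.obj, v, 3 * sizeof(GLfloat));
    if (fmesa->dma_space >= VERTEX_DWORDS) {
        memcpy(fmesa->dma_head, &fmesa->current, VERTEX_DWORDS * sizeof(uint32_t));
        fmesa->dma_head  += VERTEX_DWORDS;
        fmesa->dma_space -= VERTEX_DWORDS;
    } else {
        foo_get_dma(fmesa, &fmesa->current, VERTEX_DWORDS);
    }
}
```

To try it, call foo_init() once, then foo_Normal3fv() and foo_TexCoord2fv() as needed followed by foo_Vertex3fv() per vertex. With the sizes assumed here, the fifth vertex takes the slow path because the buffer only holds four.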