From: F. <j_r...@ya...> - 2003-03-10 20:16:59
|
I'm just giving a quick update on the Mesa C++ wrappers, in case anybody
wants to discuss them a little in today's meeting.

C wrapper functions for all the GL context callbacks are already in
place at http://jrfonseca.dyndns.org/projects/dri/cpp/mesa.cxx . (These
were first generated by a little Python script and then edited by hand
for a few special cases.)

Wrapper classes for GL visuals and framebuffers are done. I'm moving
all (Compressed)?Tex(Sub)?Image[1-3]D functions to texture methods,
rather than context methods. See
http://jrfonseca.dyndns.org/projects/dri/cpp/mesa.hxx .

There is a Makefile, and the Mesa wrapper code already compiled without
errors, although ATM I have some incomplete changes, so there are a few
errors.

Once textures are finished, the trickiest part will be the software
rasterizer and the TnL module. For these my idea is to make the driver
able to switch rasterizers and/or TnL modules in real time, with its
own hardware-accelerated versions or the software versions.

José Fonseca

__________________________________________________
Do You Yahoo!?
Everything you'll ever need on one web page
from News and Sport to Email and Music Charts
http://uk.my.yahoo.com
From: Keith W. <ke...@tu...> - 2003-03-10 20:32:27
|
José Fonseca wrote:
> I'm just giving a quick update on the Mesa C++ wrappers, in case anybody
> wants to discuss them a little in today's meeting.
>
> C wrapper functions for all the GL context callbacks are already in
> place at http://jrfonseca.dyndns.org/projects/dri/cpp/mesa.cxx . (These
> were first generated by a little Python script and then edited by hand
> for a few special cases.)
>
> Wrapper classes for GL visuals and framebuffers are done. I'm moving
> all (Compressed)?Tex(Sub)?Image[1-3]D functions to texture methods,
> rather than context methods. See
> http://jrfonseca.dyndns.org/projects/dri/cpp/mesa.hxx .
>
> There is a Makefile, and the Mesa wrapper code already compiled without
> errors, although ATM I have some incomplete changes, so there are a few
> errors.
>
> Once textures are finished, the trickiest part will be the software
> rasterizer and the TnL module. For these my idea is to make the driver
> able to switch rasterizers and/or TnL modules in real time, with its
> own hardware-accelerated versions or the software versions.

The driver already does this -- the tnl module is swapped in/out by the
code in radeon_vtxfmt.c, the rasterizer is swapped by RADEON_FALLBACK().

Actually there's probably too much mechanism propping up the tnl module
swapping at the moment. I think a better approach would be just to swap
in a whole new dispatch table when the vtxfmt code is viable.

We could do OUTSIDE_BEGIN_END testing the same way for free.

Keith
From: F. <jrf...@tu...> - 2003-03-10 21:03:43
|
On Mon, Mar 10, 2003 at 08:32:18PM +0000, Keith Whitwell wrote:
> José Fonseca wrote:
>
> >Once textures are finished, the trickiest part will be the software
> >rasterizer and the TnL module. For these my idea is to make the
> >driver able to switch rasterizers and/or TnL modules in real time,
> >with its own hardware-accelerated versions or the software versions.
>
> The driver already does this -- the tnl module is swapped in/out
> by the code in radeon_vtxfmt.c, the rasterizer is swapped by
> RADEON_FALLBACK().

Thanks for pointing that out, as I didn't know this - I thought it just
changed a few entry points in a callback table, as done with swrast. The
tnl module thing is still unknown territory for me, as the embedded
radeon driver overrides the glapi dispatch table and emits DMA vertex
buffers directly.

> Actually there's probably too much mechanism propping up the tnl
> module swapping at the moment. I think a better approach would be
> just to swap in a whole new dispatch table when the vtxfmt code is
> viable.

Which dispatch table would you be referring to, glapi or the TnL one? The
problem with dispatch tables is that they completely break the OOP
concept, as they work with regular functions instead of object methods.

What are the specific problems of the module swapping?

> We could do OUTSIDE_BEGIN_END testing the same way for free.

You lost me here...

José Fonseca
From: Keith W. <ke...@tu...> - 2003-03-10 21:11:08
|
José Fonseca wrote:
> On Mon, Mar 10, 2003 at 08:32:18PM +0000, Keith Whitwell wrote:
>
>>José Fonseca wrote:
>>
>>>Once textures are finished, the trickiest part will be the software
>>>rasterizer and the TnL module. For these my idea is to make the
>>>driver able to switch rasterizers and/or TnL modules in real time,
>>>with its own hardware-accelerated versions or the software versions.
>>
>>The driver already does this -- the tnl module is swapped in/out
>>by the code in radeon_vtxfmt.c, the rasterizer is swapped by
>>RADEON_FALLBACK().
>
> Thanks for pointing that out, as I didn't know this - I thought it just
> changed a few entry points in a callback table, as done with swrast. The
> tnl module thing is still unknown territory for me, as the embedded
> radeon driver overrides the glapi dispatch table and emits DMA vertex
> buffers directly.
>
>>Actually there's probably too much mechanism propping up the tnl
>>module swapping at the moment. I think a better approach would be
>>just to swap in a whole new dispatch table when the vtxfmt code is
>>viable.
>
> Which dispatch table would you be referring to, glapi or the TnL one? The
> problem with dispatch tables is that they completely break the OOP
> concept, as they work with regular functions instead of object methods.

That's a problem with the OOP concept, then. Techniques based around
switching and updating dispatch tables are *the* way to do fast GL
drivers.

> What are the specific problems of the module swapping?

The current mechanism tries to make it fast to swap *portions* of a
dispatch table. Better just to maintain two different tables & switch
between them.

Of course, that doesn't mean that the tables are static -- incremental
updates of a dispatch table are a key mechanism for managing GL state
changes (which is underutilized in the current drivers).

>>We could do OUTSIDE_BEGIN_END testing the same way for free.
>
> You lost me here...

In the simplest example you'd have two dispatch tables -- one for inside
begin/end, one for outside. Switch between them in Begin and End. The
inside one has 'Error' stubs for all state functions, and the tnl driver
plugged in. The outside one has the state functions (which no longer
need to check for OUTSIDE_BEGIN_END), and some special-case handlers for
Color, Vertex, etc.

Keith
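
A minimal sketch of the two-table scheme Keith describes, under stated assumptions: the four-slot `DispatchTable`, the `api*` entry points and the counters are all hypothetical names invented for illustration, not Mesa or glapi code. It only shows the mechanism: Begin and End swap the current table, state functions in the inside table are error stubs, and the state function in the outside table carries no begin/end check.

```cpp
#include <cassert>

// Hypothetical, heavily reduced dispatch table. Real GL tables have
// hundreds of slots; every name below is illustrative, not Mesa's.
struct DispatchTable {
    void (*Begin)(int mode);
    void (*End)();
    void (*Vertex3f)(float x, float y, float z);
    void (*BlendFunc)(unsigned sfactor, unsigned dfactor);
};

extern const DispatchTable inside_table;   // active between Begin and End
extern const DispatchTable outside_table;  // active everywhere else

static const DispatchTable *g_current = &outside_table;

static int g_vertices = 0;       // vertices accepted by the tnl path
static int g_state_changes = 0;  // state calls accepted
static int g_errors = 0;         // stands in for GL_INVALID_OPERATION

// 'Error' stubs, one per signature, for slots illegal in the current mode.
static void err_begin(int) { ++g_errors; }
static void err_end() { ++g_errors; }
static void err_blend(unsigned, unsigned) { ++g_errors; }

// Begin and End are the only handlers that switch tables.
static void do_begin(int) { g_current = &inside_table; }
static void do_end() { g_current = &outside_table; }

// Inside Begin/End the tnl module receives the vertices.
static void tnl_vertex(float, float, float) { ++g_vertices; }
// Outside, Vertex gets a special-case handler (here it drops the call).
static void outside_vertex(float, float, float) {}
// State function: no begin/end check, the table selection implies it.
static void state_blend(unsigned, unsigned) { ++g_state_changes; }

const DispatchTable inside_table  = { err_begin, do_end,  tnl_vertex,     err_blend };
const DispatchTable outside_table = { do_begin,  err_end, outside_vertex, state_blend };

// API entry points: one indirect jump through whichever table is current.
void apiBegin(int mode) { g_current->Begin(mode); }
void apiEnd() { g_current->End(); }
void apiVertex3f(float x, float y, float z) { g_current->Vertex3f(x, y, z); }
void apiBlendFunc(unsigned s, unsigned d) { g_current->BlendFunc(s, d); }
```

The cost of the inside/outside test is paid once per Begin or End (a pointer store) rather than once per state call.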
From: F. <jrf...@tu...> - 2003-03-10 22:22:31
|
On Mon, Mar 10, 2003 at 09:11:06PM +0000, Keith Whitwell wrote:
> José Fonseca wrote:
> >Which dispatch table would you be referring to, glapi or the TnL
> >one? The problem with dispatch tables is that they completely break
> >the OOP concept, as they work with regular functions instead of
> >object methods.
>
> That's a problem with the OOP concept, then. Techniques based
> around switching and updating dispatch tables are *the* way to do
> fast GL drivers.

My initial worry was that it's not safe (someone *please* correct me if
I'm wrong) to put a C++ method in a C function callback, i.e., if you
have:

  struct function_table {
    ...
    void (*BlendFunc)(GLcontext *ctx, GLenum sfactor, GLenum dfactor);
    ...
  } driver;

and

  class Context {
    ...
    void BlendFunc(GLenum sfactor, GLenum dfactor);
    ...
  } ;

You can't simply do

  driver.BlendFunc = Context::BlendFunc;

or can you? Anyway, after I fully understood what you're proposing I
realized this can be easily overcome and even made easier with
OOP + templates.

> >What are the specific problems of the module swapping?
>
> The current mechanism tries to make it fast to swap *portions* of
> a dispatch table. Better just to maintain two different tables &
> switch between them.

I see.

> Of course, that doesn't mean that the tables are static --
> incremental updates of a dispatch table are a key mechanism for
> managing GL state changes (which is underutilized in the current
> drivers).
>
> >>We could do OUTSIDE_BEGIN_END testing the same way for free.
> >
> >You lost me here...
>
> In the simplest example you'd have two dispatch tables -- one for
> inside begin/end, one for outside. Switch between them in Begin
> and End. The inside one has 'Error' stubs for all state
> functions, and the tnl driver plugged in. The outside one has the
> state functions (which no longer need to check for
> OUTSIDE_BEGIN_END), and some special-case handlers for
> Color, Vertex, etc.

This is great.

As I said above this can be done in C++, and without damage to
efficiency.

Imagine you have a TnL abstract class:

  class TNL {
    // An OpenGL function
    virtual void Coord3f(GLfloat x, GLfloat y, GLfloat z) = 0;

    // Activate
    virtual void activate() = 0;

    protected:
      struct dispatch_table *my_dispatch_table;
  } ;

and then you have two inherited classes for software and hardware
rendering:

  class SoftwareTNL : public TNL {
    // The software version. Note the _inline_
    inline void Coord3f(x, y, z) {
      _mesa_swrast_deal_with_this_vertex(x, y, z);
    }
  };

  class HardwareTNL : public TNL {
    // The hardware version. Note the _inline_
    inline void Coord3f(x, y, z) {
      _add_vertex_to_DMA_buffer(x, y, z);
    }
  };

and then the C-callable versions for the glapi dispatch table:

  void softwareCoord3f(GLcontext *ctx, GLfloat x, GLfloat y, GLfloat z) {
    Driver::Context *context = ctx;
    Driver::SoftwareTNL &tnl = ctx->tnl;

    // There will be no call as the function will be expanded inline
    tnl.Coord3f(x, y, z);
  }

and the same for the hardware version...

In the activate method we can either swap the dispatch table entirely,
or update part of it:

  void SoftwareTNL::activate() {
    ...
    _glapi_set_table(softwareCoord3f); // or something similar
  }

Note that we can even overload operator= on TNL to automatically call
"activate()".

Templates will have to be used for the automatic generation of the
C-callable versions. Then this would be mainly implementation details,
which a driver wouldn't need to care about.

Of course, I won't go into such depth now. These are efficiency details,
but it is good to know that we can address them later, without messing
around with the design.

José Fonseca
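
A compilable sketch of the idea in this message, under stated assumptions: `DispatchTable`, `g_table` and the counters are hypothetical stand-ins (nothing here is real Mesa or glapi API), the context lookup is left out, and the per-vertex method is deliberately non-virtual so the plain wrapper function can inline it. Only `activate()` is virtual, since it runs once per module switch rather than once per vertex.

```cpp
#include <cassert>

// Hypothetical stand-ins; none of these are real Mesa/glapi names.
struct DispatchTable {
    void (*Coord3f)(float x, float y, float z);
};
static DispatchTable g_table;   // the table the GL entry points jump through

static int g_sw_vertices = 0;   // vertices taken by the software path
static int g_hw_vertices = 0;   // vertices queued for the hardware path

class TNL {
public:
    virtual ~TNL() = default;
    virtual void activate() = 0;   // install this module's table entries
};

class SoftwareTNL : public TNL {
public:
    // Non-virtual and defined in-class, so the wrapper below can inline it.
    void Coord3f(float, float, float) { ++g_sw_vertices; }
    void activate() override;
};

class HardwareTNL : public TNL {
public:
    void Coord3f(float, float, float) { ++g_hw_vertices; }
    void activate() override;
};

static SoftwareTNL g_sw;
static HardwareTNL g_hw;

// Ordinary functions with the table's signature. The concrete type is
// known here, so no virtual call is involved.
static void softwareCoord3f(float x, float y, float z) { g_sw.Coord3f(x, y, z); }
static void hardwareCoord3f(float x, float y, float z) { g_hw.Coord3f(x, y, z); }

// Swapping modules means rewriting the slot (or, with more slots, the
// whole table).
void SoftwareTNL::activate() { g_table.Coord3f = softwareCoord3f; }
void HardwareTNL::activate() { g_table.Coord3f = hardwareCoord3f; }
```

Whether the compiler really inlines the method into the wrapper is implementation-dependent and would need to be checked on the target compiler.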
From: Felix <fx...@gm...> - 2003-03-10 23:34:05
|
On Mon, 10 Mar 2003 22:23:07 +0000
José Fonseca <jrf...@tu...> wrote:

[snip]
> As I said above this can be done in C++, and without damage to
> efficiency.
>
> Imagine you have a TnL abstract class:
>
> class TNL {
>   // An OpenGL function
>   virtual void Coord3f(GLfloat x, GLfloat y, GLfloat z) = 0;
>
>   // Activate
>   virtual void activate() = 0;
>
>   protected:
>     struct dispatch_table *my_dispatch_table;
> } ;
>
> and then you have two inherited classes for software and hardware
> rendering:
>
> class SoftwareTNL : public TNL {
>   // The software version. Note the _inline_
>   inline void Coord3f(x, y, z) {
>     _mesa_swrast_deal_with_this_vertex(x, y, z);
>   }
> };
>
> class HardwareTNL : public TNL {
>   // The hardware version. Note the _inline_
>   inline void Coord3f(x, y, z) {
>     _add_vertex_to_DMA_buffer(x, y, z);
>   }
> };
>
> and then the C-callable versions for the glapi dispatch table:
>
> void softwareCoord3f(GLcontext *ctx, GLfloat x, GLfloat y, GLfloat z) {
>   Driver::Context *context = ctx;
>   Driver::SoftwareTNL &tnl = ctx->tnl;
>
>   // There will be no call as the function will be expanded inline
>   tnl.Coord3f(x, y, z);
> }

Here you're converting a GLcontext * to a Driver::Context *. Can you do
that because Mesa::Context has GLcontext as its first member? Anyway, if
that didn't work you could always do some fancy pointer arithmetic with
the offset of the GLcontext in Driver::Context.

But then, if the GLcontext is a member (not a pointer) in Mesa::Context,
it would have to be created by the constructor of Mesa::Context.
I just checked radeonCreateContext: the GLcontext really is created
there by calling _mesa_create_context. But _mesa_create_context returns
a pointer to the GLcontext. You would need a _mesa_create_context which
takes a GLcontext pointer as a parameter.

Maybe I'm getting lost in details, but IME it's better to stumble sooner
than later.

Disclaimer: I support the idea of a C++ framework for DRI driver
development. ;-)

[snip]
> José Fonseca

Regards,
  Felix

--
Felix Kühling <fx...@gm...>
You can do anything, just not everything at the same time.
From: Keith W. <ke...@tu...> - 2003-03-10 23:50:26
|
Felix Kühling wrote:
> On Mon, 10 Mar 2003 22:23:07 +0000
> José Fonseca <jrf...@tu...> wrote:
> [snip]
>
>>As I said above this can be done in C++, and without damage to
>>efficiency.
>>
>>Imagine you have a TnL abstract class:
>>
>>class TNL {
>>  // An OpenGL function
>>  virtual void Coord3f(GLfloat x, GLfloat y, GLfloat z) = 0;
>>
>>  // Activate
>>  virtual void activate() = 0;
>>
>>  protected:
>>    struct dispatch_table *my_dispatch_table;
>>} ;
>>
>>and then you have two inherited classes for software and hardware
>>rendering:
>>
>>class SoftwareTNL : public TNL {
>>  // The software version. Note the _inline_
>>  inline void Coord3f(x, y, z) {
>>    _mesa_swrast_deal_with_this_vertex(x, y, z);
>>  }
>>};
>>
>>class HardwareTNL : public TNL {
>>  // The hardware version. Note the _inline_
>>  inline void Coord3f(x, y, z) {
>>    _add_vertex_to_DMA_buffer(x, y, z);
>>  }
>>};
>>
>>and then the C-callable versions for the glapi dispatch table:
>>
>>void softwareCoord3f(GLcontext *ctx, GLfloat x, GLfloat y, GLfloat z) {
>>  Driver::Context *context = ctx;
>>  Driver::SoftwareTNL &tnl = ctx->tnl;
>>
>>  // There will be no call as the function will be expanded inline
>>  tnl.Coord3f(x, y, z);
>>}
>
> Here you're converting a GLcontext * to a Driver::Context *. Can you do
> that because Mesa::Context has GLcontext as its first member? Anyway, if
> that didn't work you could always do some fancy pointer arithmetic with
> the offset of the GLcontext in Driver::Context.

There's a slight misconception happening here -- the 'ctx' argument
doesn't exist. The function should read something like:

  void swv3f( GLfloat x, GLfloat y, GLfloat z )
  {
     GET_CONTEXT_FROM_THREAD_LOCAL_STORE( ctx )
     // get tnl somehow

     tnl->v3f( x, y, z );
  }

But why bother? tnl->v3f isn't virtual, so what's the point of having
the implementation somewhere else? Why not just do it here?

Keith
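
A small sketch of the entry-point shape Keith describes, with no context parameter and the current context fetched from thread-local storage. The `Context`, `TnlModule`, `t_current` and `make_current` names are hypothetical, and C++11 `thread_local` stands in for whatever the GET_CONTEXT_FROM_THREAD_LOCAL_STORE macro would expand to; this is an assumption for illustration, not how Mesa implements it.

```cpp
#include <cassert>

// Hypothetical tnl module with a non-virtual, inlinable vertex method.
struct TnlModule {
    int vertices = 0;
    void v3f(float, float, float) { ++vertices; }
};

// Hypothetical per-context driver state.
struct Context {
    TnlModule tnl;
};

// Stand-in for the thread-local "current context" slot.
static thread_local Context *t_current = nullptr;

static void make_current(Context *ctx) { t_current = ctx; }

// Same shape as a GL entry point: coordinates only, no context argument.
static void swv3f(float x, float y, float z) {
    Context *ctx = t_current;     // fetched, not passed in
    if (!ctx) return;             // no current context: drop the call
    ctx->tnl.v3f(x, y, z);        // direct call, no virtual dispatch
}
```

Because `v3f` is not virtual, the body could equally be written straight into `swv3f`, which is the point Keith raises.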
From: F. <jrf...@tu...> - 2003-03-11 01:06:42
|
On Mon, Mar 10, 2003 at 11:50:23PM +0000, Keith Whitwell wrote:
> There's a slight misconception happening here -- the 'ctx' argument doesn't
> exist. The function should read something like:
>
> void swv3f( GLfloat x, GLfloat y, GLfloat z )
> {
>    GET_CONTEXT_FROM_THREAD_LOCAL_STORE( ctx )
>    // get tnl somehow
>
>    tnl->v3f( x, y, z );
> }
>
> But why bother? tnl->v3f isn't virtual, so what's the point of having the
> implementation somewhere else? Why not just do it here?

That's right. But you can make it a static method though:

  class RadeonTNL {
    ...
    static void v3f( GLfloat x, GLfloat y, GLfloat z )
    {
       GET_CONTEXT_FROM_THREAD_LOCAL_STORE( ctx )
       // get tnl somehow

       // do your thing here...
    }
  };

If the "*this" pointer isn't passed by any of the arguments, then using
a static method is mostly for convenience and aesthetics. This way you
can inherit an unmodified version from a parent class, access the tnl
private data, and the collection of callbacks can be managed as a whole
by the TNL class or its children.

José Fonseca
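
A short sketch of the static-method variant, with hypothetical names (`V3fProc`, `s_vertices`, `entry()`) invented for illustration. A static member function has no `this` parameter, so its address is an ordinary function pointer that fits a C-style table slot, while the function still sees the class's protected data and is inherited unmodified by child classes.

```cpp
#include <cassert>

// C-style table slot type (hypothetical).
typedef void (*V3fProc)(float, float, float);

class TNL {
public:
    static int count() { return s_vertices; }
protected:
    static int s_vertices;   // class-private state, visible to children
    // Static member: no `this`, so its address is a plain function pointer.
    static void v3f(float, float, float) { ++s_vertices; }
};
int TNL::s_vertices = 0;

class RadeonTNL : public TNL {
public:
    // Inherits v3f unmodified and hands it out as a callback.
    static V3fProc entry() { return v3f; }
};
```

One caveat worth stating: the C++ standard distinguishes C and C++ language linkage for function types, so storing such a pointer in a slot declared in a C header is accepted by the common compilers but is not strictly guaranteed by the language.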
From: F. <jrf...@tu...> - 2003-03-11 00:02:50
|
On Tue, Mar 11, 2003 at 12:33:53AM +0100, Felix Kühling wrote:
> On Mon, 10 Mar 2003 22:23:07 +0000
> José Fonseca <jrf...@tu...> wrote:
> [snip]
> > As I said above this can be done in C++, and without damage to
> > efficiency.
> >
> > Imagine you have a TnL abstract class:
> >
> > class TNL {
> >   // An OpenGL function
> >   virtual void Coord3f(GLfloat x, GLfloat y, GLfloat z) = 0;
> >
> >   // Activate
> >   virtual void activate() = 0;
> >
> >   protected:
> >     struct dispatch_table *my_dispatch_table;
> > } ;
> >
> > and then you have two inherited classes for software and hardware
> > rendering:
> >
> > class SoftwareTNL : public TNL {
> >   // The software version. Note the _inline_
> >   inline void Coord3f(x, y, z) {
> >     _mesa_swrast_deal_with_this_vertex(x, y, z);
> >   }
> > };
> >
> > class HardwareTNL : public TNL {
> >   // The hardware version. Note the _inline_
> >   inline void Coord3f(x, y, z) {
> >     _add_vertex_to_DMA_buffer(x, y, z);
> >   }
> > };
> >
> > and then the C-callable versions for the glapi dispatch table:
> >
> > void softwareCoord3f(GLcontext *ctx, GLfloat x, GLfloat y, GLfloat z) {
> >   Driver::Context *context = ctx;
> >   Driver::SoftwareTNL &tnl = ctx->tnl;
> >
> >   // There will be no call as the function will be expanded inline
> >   tnl.Coord3f(x, y, z);
> > }
>
> Here you're converting a GLcontext * to a Driver::Context *. Can you do
> that because Mesa::Context has GLcontext as its first member? Anyway, if
> that didn't work you could always do some fancy pointer arithmetic with
> the offset of the GLcontext in Driver::Context.

That one was a simplification, and not an exact representation of what
should be done, which is:

  Driver::Context *context = (Driver::Context *)ctx->DriverCtx;

That is, the driver private context pointer in GLcontext points to the
C++ class for the driver context.

But I could have chosen to derive Mesa::Context from GLcontext. See for
example the Framebuffer class in
http://jrfonseca.dyndns.org/projects/dri/cpp/mesa.hxx . I may eventually
choose to do the same thing for GLcontext, but it's not very important
right now.

> But then, if the GLcontext is a member (not a pointer) in Mesa::Context,
> it would have to be created by the constructor of Mesa::Context.
> I just checked radeonCreateContext: the GLcontext really is created
> there by calling _mesa_create_context. But _mesa_create_context returns
> a pointer to the GLcontext. You would need a _mesa_create_context which
> takes a GLcontext pointer as a parameter.

This is something quite nice about Mesa: for some of its structures you
have the choice to dynamically allocate them, or simply initialize them,
i.e., you can choose _mesa_create_context or simply
_mesa_initialize_context, which does what you describe.

> Maybe I'm getting lost in details, but IME it's better to stumble sooner
> than later.

No, you aren't lost, and the way we do it is not as irrelevant as it
may seem, as the driver will definitely need to access Mesa's GLcontext
data structure. Composition and/or inheritance is preferable to pointers
since they avoid the extra redirection.

> Disclaimer: I support the idea of a C++ framework for DRI
> driver development. ;-)

:-)

All feedback is welcome, even from "non C++ believers", as long as the
arguments focus on specific problems with the C++ approach, and not the
language alone.

José Fonseca
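
The two layouts discussed in this message can be sketched side by side. Everything here is a hypothetical reduction: `GLcontextLike` stands in for Mesa's GLcontext and keeps only a `DriverCtx` back-pointer and one field, and the class names are invented. The first option recovers the C++ object through the opaque pointer; the second derives from the C struct so the conversion is a plain downcast.

```cpp
#include <cassert>

// Hypothetical, reduced stand-in for Mesa's GLcontext.
struct GLcontextLike {
    void *DriverCtx;   // opaque pointer back to the driver's object
    int   state;
};

// Option 1: the C++ object holds the C struct by value (composition) and
// the C struct points back at it through DriverCtx.
class DriverContext {
public:
    DriverContext() : gl{this, 0}, id(42) {}
    DriverContext(const DriverContext &) = delete;  // back-pointer would go stale
    GLcontextLike gl;   // initialized in place, not separately allocated
    int id;
};

static DriverContext *from_driver_ctx(GLcontextLike *ctx) {
    return static_cast<DriverContext *>(ctx->DriverCtx);
}

// Option 2: derive from the C struct; the base pointer converts back
// with a static_cast and no stored pointer is needed.
class DerivedContext : public GLcontextLike {
public:
    DerivedContext() : GLcontextLike{nullptr, 0}, id(7) {}
    int id;
};

static DerivedContext *from_base(GLcontextLike *ctx) {
    return static_cast<DerivedContext *>(ctx);
}
```

Both options rely on the C structure being initialized inside storage the C++ object owns, which is the role the message assigns to _mesa_initialize_context as opposed to _mesa_create_context.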
From: Marcelo E. M. <mma...@de...> - 2003-03-10 23:43:27
|
On Mon, Mar 10, 2003 at 10:23:07PM +0000, José Fonseca wrote:

 > struct function_table {
 >   ...
 >   void (*BlendFunc)(GLcontext *ctx, GLenum sfactor, GLenum dfactor);
 >   ...
 > } driver;
 >
 > and
 >
 > class Context {
 >   ...
 >   void BlendFunc(GLenum sfactor, GLenum dfactor);
 >   ...
 > } ;
 >
 > You can't simply do
 >
 >   driver.BlendFunc = Context::BlendFunc;

 No, you can't do that. The problem is not the rhs as you seem to think
 but the lhs. The rhs needs to be &Context::BlendFunc. What type does
 that have? void (Context::*BlendFunc)(GLcontext *ctx, ...)

 > As I said above this can be done in C++, and without damage to
 > efficiency.

 I doubt that. (JFTR, if I'm given the choice of programming in C++ or
 C, I'll pick C++. I'm really not being a C zealot.)

 > class TNL {
 >   // An OpenGL function
 >   virtual void Coord3f(GLfloat x, GLfloat y, GLfloat z) = 0;
     ^^^^^^^
 Pure virtual method. Depending on what you want to do with this,
 this will incur function call overhead, inline or not.

 Marcelo
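
The type mismatch can be checked at compile time. In this sketch `GLenumLike`, `BlendProc`, `BlendMethod` and `blend_thunk` are hypothetical names, and `<type_traits>` (C++11) is used only to make the claim verifiable: the address of a non-static member function is a pointer-to-member, which does not convert to an ordinary function pointer, so a plain forwarding function has to sit in the table instead.

```cpp
#include <cassert>
#include <type_traits>

typedef unsigned GLenumLike;   // stand-in for GLenum

class Context {
public:
    void BlendFunc(GLenumLike s, GLenumLike d) { sfactor = s; dfactor = d; }
    GLenumLike sfactor = 0, dfactor = 0;
};

// What the C-style table slot expects, and what &Context::BlendFunc is.
typedef void (*BlendProc)(Context *ctx, GLenumLike, GLenumLike);
typedef void (Context::*BlendMethod)(GLenumLike, GLenumLike);

static_assert(std::is_same<decltype(&Context::BlendFunc), BlendMethod>::value,
              "address of a non-static method is a pointer-to-member");
static_assert(!std::is_convertible<BlendMethod, BlendProc>::value,
              "a pointer-to-member does not convert to a function pointer");

// An ordinary function with the slot's signature forwards to the method.
static void blend_thunk(Context *ctx, GLenumLike s, GLenumLike d) {
    ctx->BlendFunc(s, d);
}

struct FunctionTable { BlendProc BlendFunc; };
```

A pointer-to-member can still be called, but only with an object supplied at the call site, as in `(c.*m)(s, d)`.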
From: Keith W. <ke...@tu...> - 2003-03-10 23:59:13
|
Marcelo E. Magallon wrote:
> On Mon, Mar 10, 2003 at 10:23:07PM +0000, José Fonseca wrote:
>
>  > struct function_table {
>  >   ...
>  >   void (*BlendFunc)(GLcontext *ctx, GLenum sfactor, GLenum dfactor);
>  >   ...
>  > } driver;
>  >
>  > and
>  >
>  > class Context {
>  >   ...
>  >   void BlendFunc(GLenum sfactor, GLenum dfactor);
>  >   ...
>  > } ;
>  >
>  > You can't simply do
>  >
>  >   driver.BlendFunc = Context::BlendFunc;
>
>  No, you can't do that. The problem is not the rhs as you seem to think
>  but the lhs. The rhs needs to be &Context::BlendFunc. What type does
>  that have? void (Context::*BlendFunc)(GLcontext *ctx, ...)
>
>  > As I said above this can be done in C++, and without damage to
>  > efficiency.
>
>  I doubt that. (JFTR, if I'm given the choice of programming in C++ or
>  C, I'll pick C++. I'm really not being a C zealot.)
>
>  > class TNL {
>  >   // An OpenGL function
>  >   virtual void Coord3f(GLfloat x, GLfloat y, GLfloat z) = 0;
>      ^^^^^^^
>  Pure virtual method. Depending on what you want to do with this,
>  this will incur function call overhead, inline or not.

Unless C++ can figure out at compile time *exactly* which class is being
invoked, I think... It's hard to know.

Anyway, I don't think José really wanted a virtual member here, as he
definitely seems to want the only usage of these functions to be expanded
inline, right? If so, the virtual keyword is unnecessary -- isn't it?

Keith
From: Marcelo E. M. <mma...@de...> - 2003-03-11 00:18:48
|
On Mon, Mar 10, 2003 at 11:59:05PM +0000, Keith Whitwell wrote:

 > > > class TNL {
 > > >   // An OpenGL function
 > > >   virtual void Coord3f(GLfloat x, GLfloat y, GLfloat z) = 0;
 > >       ^^^^^^^
 > >   Pure virtual method. Depending on what you want to do with this,
 > >   this will incur function call overhead, inline or not.
 >
 > Unless C++ can figure out at compile time *exactly* which class is being
 > invoked, I think... It's hard to know.

 It can know it if it knows the type of the object, e.g.

   TNL tnl;
   tnl.Coord3f(x, y, z);

 or:

   HardwareTNL *tnl; /* or whatever you called this before */
   tnl->Coord3f(x, y, z);

 but in:

   foo(TNL *tnl)
   {
       tnl->Coord3f(x, y, z);
   }

 the call definitely can't be inlined because the compiler doesn't know
 the "real" type of tnl. Here is where virtual enters the game. Do you
 really want to call TNL's Coord3f in foo, or do you want to call
 whatever is appropriate for the type of the parameter? If the latter,
 the member has to be virtual.

 Marcelo
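
A sketch of the distinction drawn here, under stated assumptions: the return values replace the real vertex work so the behaviour can be checked, the two helper functions are hypothetical, and the C++11 `final` keyword (not available in 2003) is used to make the "exact type is known" case explicit. Whether a given compiler actually devirtualizes or inlines the first call is implementation-dependent; the sketch only shows where it is permitted to.

```cpp
#include <cassert>

struct TNL {
    virtual ~TNL() = default;
    virtual int Coord3f(float x, float y, float z) = 0;
};

struct SoftwareTNL final : TNL {
    int Coord3f(float, float, float) override { return 1; }
};

struct HardwareTNL final : TNL {
    int Coord3f(float, float, float) override { return 2; }
};

// Static type is a final class: the target is known at compile time, so
// the compiler may bind the call directly and inline it.
static int call_direct(HardwareTNL &tnl) { return tnl.Coord3f(0.f, 0.f, 0.f); }

// Static type is the abstract base: in general the call has to go
// through the vtable, because any subclass could be behind the pointer.
static int call_virtual(TNL *tnl) { return tnl->Coord3f(0.f, 0.f, 0.f); }
```

Both calls produce the right result for the dynamic type; the difference is only in what the compiler can prove, and therefore optimize, at each call site.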
From: F. <jrf...@tu...> - 2003-03-11 00:17:46
|
On Tue, Mar 11, 2003 at 12:42:26AM +0100, Marcelo E. Magallon wrote:
> On Mon, Mar 10, 2003 at 10:23:07PM +0000, José Fonseca wrote:
>
>  > struct function_table {
>  >   ...
>  >   void (*BlendFunc)(GLcontext *ctx, GLenum sfactor, GLenum dfactor);
>  >   ...
>  > } driver;
>  >
>  > and
>  >
>  > class Context {
>  >   ...
>  >   void BlendFunc(GLenum sfactor, GLenum dfactor);
>  >   ...
>  > } ;
>  >
>  > You can't simply do
>  >
>  >   driver.BlendFunc = Context::BlendFunc;
>
>  No, you can't do that. The problem is not the rhs as you seem to think
>  but the lhs. The rhs needs to be &Context::BlendFunc. What type does
>  that have? void (Context::*BlendFunc)(GLcontext *ctx, ...)

I know that syntactically that isn't allowed, but with a cast, and an
appropriate ABI, maybe it could work; still, it would be very bad
practice and we'd regret it later.

>  > As I said above this can be done in C++, and without damage to
>  > efficiency.
>
>  I doubt that. (JFTR, if I'm given the choice of programming in C++ or
>  C, I'll pick C++. I'm really not being a C zealot.)
>
>  > class TNL {
>  >   // An OpenGL function
>  >   virtual void Coord3f(GLfloat x, GLfloat y, GLfloat z) = 0;
>      ^^^^^^^
>  Pure virtual method. Depending on what you want to do with this,
>  this will incur function call overhead, inline or not.

See
http://www.parashift.com/c++-faq-lite/value-vs-ref-semantics.html#faq-31.6 .
It will incur function call overhead only if it's called through a
pointer/reference to an abstract class. If a template is used for the
callback function then that won't happen. Of course, this behavior is
compiler dependent and has to be checked in gcc.

José Fonseca
From: Keith W. <ke...@tu...> - 2003-03-10 22:36:24
|
José Fonseca wrote:
> On Mon, Mar 10, 2003 at 09:11:06PM +0000, Keith Whitwell wrote:
>
>>José Fonseca wrote:
>>
>>>Which dispatch table would you be referring to, glapi or the TnL
>>>one? The problem with dispatch tables is that they completely break
>>>the OOP concept, as they work with regular functions instead of
>>>object methods.
>>
>>That's a problem with the OOP concept, then. Techniques based
>>around switching and updating dispatch tables are *the* way to do
>>fast GL drivers.
>
> My initial worry was that it's not safe (someone *please* correct me if
> I'm wrong) to put a C++ method in a C function callback, i.e., if you
> have:
>
>   struct function_table {
>     ...
>     void (*BlendFunc)(GLcontext *ctx, GLenum sfactor, GLenum dfactor);
>     ...
>   } driver;
>
> and
>
>   class Context {
>     ...
>     void BlendFunc(GLenum sfactor, GLenum dfactor);
>     ...
>   } ;
>
> You can't simply do
>
>   driver.BlendFunc = Context::BlendFunc;
>
> or can you?

No, because one of the things C++ does is pass around an extra
parameter -- namely the 'self' pointer. The 'real' prototype looks
something like:

  void Context::BlendFunc( Context *self, GLenum sfactor, GLenum dfactor )

-- but there's no guarantee that this is actually what is happening, or
that it won't change. Yes, I know there is an ABI now -- but I've no
idea what it actually specifies. Does it allow the compiler to try &
figure out if self is needed? If BlendFunc doesn't reference it, does it
go away, or is it always passed even if it's not needed?

> Anyway, after I fully understood what you're proposing I
> realized this can be easily overcome and even made easier with
> OOP + templates.

[snip]

> This is great.
>
> As I said above this can be done in C++, and without damage to
> efficiency.

Except that the functions plugged into the dispatch table cannot be C++
methods, as the prototypes are defined, and don't include the magic C++
self pointer.

I've deleted the rest of your explanation as unfortunately I don't think
it can be made to work.

The principles of abstraction, inheritance etc are great, and can be
used in efficient GL driver development, but at this level the calling
convention is fixed and so the language might as well be C. You can do
all the stuff you're talking about by selectively updating dispatch
tables, either the top-level one in libGL.so, or task-specific internal
ones inside the driver.

This is something that C++ does internally, but for GL driver
development you are probably better off doing it explicitly, as you have
to at the libGL.so layer anyway.

And really, when you're doing the *real* critical code, such as the tnl
modules, you definitely want to be doing runtime codegen anyway.
Abstracting this between the drivers is certainly possible, as big
chunks of what we do there are machine independent.

Keith
From: F. <jrf...@tu...> - 2003-03-10 23:07:49
|
On Mon, Mar 10, 2003 at 10:36:21PM +0000, Keith Whitwell wrote:
> No, because one of the things C++ does is pass around an extra parameter --
> namely the 'self' pointer. The 'real' prototype looks something like:
>
>   void Context::BlendFunc( Context *self, GLenum sfactor, GLenum
>   dfactor )

I know this, that's why I thought there could be a remote chance to make
it work, as the arguments mostly match, i.e., ctx = self.

> -- but there's no guarantee that this is actually what is happening, or
> that it won't change. Yes, I know there is an ABI now -- but I've no idea
> what it actually specifies.

That's my doubt too.

> >As I said above this can be done in C++, and without damage to
> >efficiency.
>
> Except that the functions plugged into the dispatch table cannot be C++
> methods, as the prototypes are defined, and don't include the magic C++
> self pointer.

But the function I put in the table _was_ an ordinary function, and not
a C++ method, and no redirection call would take place with inlining.
That was the point of the explanation...

> I've deleted the rest of your explanation as unfortunately I don't think it
> can be made to work.

I've deleted the rest of your comment as unfortunately I don't agree
with any of it. But I've read it.

The principles of doing everything explicitly to achieve better
performance are great etc, but at this point we don't have the manpower
on DRI to do this kind of development. This is something where it is
better to rely on C++ to do it for us, as _much_ as possible, even
knowing that you could achieve better with explicit C or even assembly
coding. This is a burden of higher-level languages.

My main priority with the C++ framework is first to _speed_ development.
Most of the performance enhancements will be a side-effect of having
more free time to optimize, and being able to do it in a
driver-independent fashion.

And yes, I do believe that, with care, C++ won't create a noticeable
overhead, but there's no point in discussing it, as we can simply
benchmark it in the end.

José Fonseca
From: Keith W. <ke...@tu...> - 2003-03-10 23:13:14
|
José Fonseca wrote:
> On Mon, Mar 10, 2003 at 10:36:21PM +0000, Keith Whitwell wrote:
>
>>No, because one of the things C++ does is pass around an extra parameter --
>>namely the 'self' pointer. The 'real' prototype looks something like:
>>
>>  void Context::BlendFunc( Context *self, GLenum sfactor, GLenum
>>  dfactor )
>
> I know this, that's why I thought there could be a remote chance to make
> it work, as the arguments mostly match, i.e., ctx = self.
>
>>-- but there's no guarantee that this is actually what is happening, or
>>that it won't change. Yes, I know there is an ABI now -- but I've no idea
>>what it actually specifies.
>
> That's my doubt too.
>
>>>As I said above this can be done in C++, and without damage to
>>>efficiency.
>>
>>Except that the functions plugged into the dispatch table cannot be C++
>>methods, as the prototypes are defined, and don't include the magic C++
>>self pointer.
>
> But the function I put in the table _was_ an ordinary function, and not
> a C++ method, and no redirection call would take place with inlining.
> That was the point of the explanation...
>
>>I've deleted the rest of your explanation as unfortunately I don't think it
>>can be made to work.
>
> I've deleted the rest of your comment as unfortunately I don't agree
> with any of it. But I've read it.

Fair enough. I missed your point.

Keith
From: F. <jrf...@tu...> - 2003-03-10 23:43:04
|
Jon Smirl,

I took the liberty of CC'ing the lists again as it is a very valid point you make here.

On Mon, Mar 10, 2003 at 03:17:56PM -0800, Jon Smirl wrote:
> --- José Fonseca <jrf...@tu...> wrote:
> > On Mon, Mar 10, 2003 at 10:36:21PM +0000, Keith Whitwell wrote:
> > > No, because one of the things C++ does is pass around an extra parameter --
> > > namely the 'self' pointer. The 'real' prototype looks something like:
> > >
> > > void Context::BlendFunc( Context *self, GLenum sfactor, GLenum dfactor )
> >
> > I know this, that's why I thought there could be a remote chance to make
> > it work, as the arguments mostly match, i.e., ctx = self.
>
> I wasn't really paying attention to this, but don't
> you just want to do this, compiled with c++....
>
> _cdecl handler(Context* self, sfactor, dfactor)
> {
> self->(sfactor, dfactor);
> }
>
> _cdecl will export the function as an non-name mangled
> C entry point. _cdecl is from windows, I don't know
> what the g++ equivalent is. You can probably inline
> this and make it completely go away when compiled.

Yes, this works as you say _if_ the method isn't virtual, or at least the exact type of the class is known at compile time, i.e., it's not an abstract Context *, but actually a non-abstract RadeonContext *.

Unfortunately, in most interesting cases, you want that method to be virtual. This means that if you really want to eliminate the call you need to write a different "_cdecl handler" for each of the inherited classes and update the function table at run-time. Or at least use a template to write all of them for you.

I had never thought of this [and e.g. all mesa wrappers so far have that extra redirection call], until Keith said that this [updating the function table at run-time] was something he wanted to use more extensively. Nevertheless, this solution will require extensive use of templates and it won't be pretty, so I'll leave it until the framework stabilizes.
José Fonseca
|
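A minimal sketch of the template idea described in the message above, assuming a function table whose entries take the context as an explicit first argument. All names here (Context, RadeonContext, FunctionTable, blend_func_trampoline, install) are hypothetical and are not Mesa's real types. The template generates one ordinary function per concrete class; the qualified call names the exact method, so no vtable lookup is needed and the compiler is free to inline it.

```cpp
// Hypothetical stand-ins for the wrapper classes discussed in the thread.
struct Context {
    unsigned sfactor, dfactor;
    Context() : sfactor(0), dfactor(0) {}
    virtual ~Context() {}
    virtual void BlendFunc(unsigned s, unsigned d) = 0;
};

struct RadeonContext : Context {
    virtual void BlendFunc(unsigned s, unsigned d) { sfactor = s; dfactor = d; }
};

// C-style table entry: the context is an explicit first argument.
typedef void (*BlendFuncEntry)(Context *ctx, unsigned s, unsigned d);

struct FunctionTable {
    BlendFuncEntry BlendFunc;
};

// One ordinary (non-member) function is generated per concrete class.
// The qualified call Derived::BlendFunc bypasses virtual dispatch.
template <class Derived>
void blend_func_trampoline(Context *ctx, unsigned s, unsigned d)
{
    static_cast<Derived *>(ctx)->Derived::BlendFunc(s, d);
}

// Fill the table at run-time for a given concrete driver class.
template <class Derived>
void install(FunctionTable &table)
{
    table.BlendFunc = &blend_func_trampoline<Derived>;
}
```

The caller must guarantee that the context passed through the table really is of the class the table was installed for; the static_cast does not check this.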
From: Jon S. <jon...@ya...> - 2003-03-10 23:56:19
|
--- José Fonseca <jrf...@tu...> wrote:
> Yes, this works as you say _if_ the method isn't
> virtual, or at least
> the exact type of the class is known at compile
> time, i.e., it's not an
> abstract Context *, but actually a non-abstract
> RadeonContext *.

It works for virtual methods and abstract classes. Check out a description of how C++ VTables work. This is a very integral part of how COM/XPCOM work too.

=====
Jon Smirl
jon...@ya...
|
From: Keith W. <ke...@tu...> - 2003-03-11 00:01:19
|
Jon Smirl wrote:
> --- José Fonseca <jrf...@tu...> wrote:
>
>>Yes, this works as you say _if_ the method isn't
>>virtual, or at least
>>the exact type of the class is known at compile
>>time, i.e., it's not an
>>abstract Context *, but actually a non-abstract
>>RadeonContext *.
>
> It works for virtual methods and abstract classes.
> Check out a description of how C++ VTables work. This
> is a very integral part of how COM/XPCOM work too.

I didn't really understand where this was coming from. I think it might relate to the earlier misunderstanding that the GL dispatch tables provide a 'ctx' argument (which might be a C++ object). This isn't true -- dispatch doesn't add any new arguments, so this trick probably doesn't help.

Keith
|
From: F. <jrf...@tu...> - 2003-03-11 00:51:16
|
On Tue, Mar 11, 2003 at 12:01:16AM +0000, Keith Whitwell wrote:
> Jon Smirl wrote:
> >--- José Fonseca <jrf...@tu...> wrote:
> >
> >>Yes, this works as you say _if_ the method isn't
> >>virtual, or at least
> >>the exact type of the class is known at compile
> >>time, i.e., it's not an
> >>abstract Context *, but actually a non-abstract
> >>RadeonContext *.
> >
> >It works for virtual methods and abstract classes.
> >Check out a description of how C++ VTables work. This
> >is a very integral part of how COM/XPCOM work too.
>
> I didn't really understand where this was coming from. I think it might
> relate to the earlier misunderstanding that the GL dispatch tables provide
> a 'ctx' argument (which might be a C++ object). This isn't true --
> dispatch doesn't add any new arguments, so this trick probably doesn't help.

Sorry, my fault... the Mesa dd_function_table has them, and somehow I thought the glapi did the same (by picking the current ctx before calling).

This completely eliminates any doubts regarding the possibility of passing methods.

On the other hand, for most cases in _glapi we can simply pass static method pointers and get the current ctx via the glapi, which eliminates the need for templates there.

Anyway, the solution I presented here may be useful for other function tables in mesa, such as for the software tnl module.

But now moving the discussion forward, concerning giving more control over the glapi table to the drivers (instead of mesa). If I understood correctly, the idea is having one (or more) function tables per context. But then how do things such as multi-threading and context switches work? Is there any provision in glapi for this?

José Fonseca
|
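A minimal sketch of the static-method alternative mentioned above, assuming the dispatch entries carry no context argument. A static member function has no hidden 'this' pointer, so its type matches a plain C function pointer; it looks up the current context itself and then calls the real method. The names (Context, makeCurrent, BlendFuncEntry, DispatchTable) are hypothetical, and the per-thread "current context" is modelled here with C++11 thread_local, which is an assumption of this sketch rather than how glapi tracks it.

```cpp
class Context {
public:
    Context() : sfactor_(0), dfactor_(0) {}

    // Bind a context to the calling thread (hypothetical helper).
    static void makeCurrent(Context *ctx) { current_ = ctx; }

    // Static member: callable through a plain function pointer.
    // It fetches the current context and forwards to the real method.
    static void BlendFuncEntry(unsigned s, unsigned d)
    {
        current_->BlendFunc(s, d);
    }

    void BlendFunc(unsigned s, unsigned d) { sfactor_ = s; dfactor_ = d; }

    unsigned sfactor() const { return sfactor_; }
    unsigned dfactor() const { return dfactor_; }

private:
    static thread_local Context *current_;
    unsigned sfactor_, dfactor_;
};

thread_local Context *Context::current_ = nullptr;

// Dispatch entry type without any context argument.
typedef void (*BlendFuncProc)(unsigned s, unsigned d);

struct DispatchTable {
    BlendFuncProc BlendFunc;
};
```

The entry point dereferences the current context unconditionally, so a context must be made current on the thread before any call goes through the table.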
From: Keith W. <ke...@tu...> - 2003-03-11 01:04:01
|
José Fonseca wrote:
> On Tue, Mar 11, 2003 at 12:01:16AM +0000, Keith Whitwell wrote:
>
>>Jon Smirl wrote:
>>
>>>--- José Fonseca <jrf...@tu...> wrote:
>>>
>>>>Yes, this works as you say _if_ the method isn't
>>>>virtual, or at least
>>>>the exact type of the class is known at compile
>>>>time, i.e., it's not an
>>>>abstract Context *, but actually a non-abstract
>>>>RadeonContext *.
>>>
>>>It works for virtual methods and abstract classes.
>>>Check out a description of how C++ VTables work. This
>>>is a very integral part of how COM/XPCOM work too.
>>
>>I didn't really understand where this was coming from. I think it might
>>relate to the earlier misunderstanding that the GL dispatch tables provide
>>a 'ctx' argument (which might be a C++ object). This isn't true --
>>dispatch doesn't add any new arguments, so this trick probably doesn't help.
>
> Sorry, my fault... the Mesa dd_function_table has them, and somehow I thought
> the glapi did the same (by picking the current ctx before calling).
>
> This completely eliminates any doubts regarding the possibility of
> passing methods.
>
> On the other hand, for most cases in _glapi we can simply pass them
> static method pointers and get the current ctx via the glapi.
> Eliminating the need of templates for that.
>
> Anyway, the solution I presented here may be useful for other function
> tables in mesa, such as for the software tnl module.
>
> But now moving the discussion forward, concerning giving more control
> over the glapi table to the drivers (instead of mesa). If I understood
> correctly, the idea is having one (or more) function tables per context.
> But then how things such as multi-threading and context switches work?
> Are there any provision in glapi for this?

Display list compilation is currently the only time we switch tables -- look at _mesa_NewList() in dlist.c.
It's a straightforward operation, complicated only slightly by threading concerns.

The work I've been doing in the vtx-0-1-branch will make greater use of dispatch table switching.

Keith
|
From: Felix <fx...@gm...> - 2003-03-11 00:13:24
|
On Mon, 10 Mar 2003 23:43:37 +0000 José Fonseca <jrf...@tu...> wrote:
> Jon Smirl,
>
> I took the liberty of CC'ing the lists again as it is a very valid point
> you make here.
>
> On Mon, Mar 10, 2003 at 03:17:56PM -0800, Jon Smirl wrote:
[snip]
> > _cdecl handler(Context* self, sfactor, dfactor)
> > {
> > self->(sfactor, dfactor);
> > }
> >
> > _cdecl will export the function as an non-name mangled
> > C entry point. _cdecl is from windows, I don't know
> > what the g++ equivalent is. You can probably inline
> > this and make it completely go away when compiled.

I don't know about _cdecl, but what you're describing sounds to me like declaring:

extern "C" void handler (Context* self, sfactor, dfactor);

------------
Felix Kühling    fx...@gm...
You can do anything, just not everything at the same time.
|
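A minimal sketch of the extern "C" handler suggested above, with the parameter types and the method name filled in as assumptions (the quoted pseudocode omits both). The Context class here is a hypothetical stand-in with a non-virtual BlendFunc. The extern "C" linkage specification gives the function an unmangled C name and C linkage, while the body is still compiled as C++ and can call the method directly.

```cpp
// Hypothetical context class; unsigned stands in for GLenum.
struct Context {
    unsigned sfactor, dfactor;
    Context() : sfactor(0), dfactor(0) {}
    void BlendFunc(unsigned s, unsigned d) { sfactor = s; dfactor = d; }
};

// C linkage: no name mangling, so the symbol is visible to C code
// as plain "handler". The context is an explicit first argument.
extern "C" void handler(Context *self, unsigned sfactor, unsigned dfactor)
{
    self->BlendFunc(sfactor, dfactor);
}
```

C code calling this function would only see Context as an opaque pointer type, since the class definition itself is C++.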
From: Marcelo E. M. <mar...@bi...> - 2003-03-11 00:03:14
|
On Mon, Mar 10, 2003 at 11:08:23PM +0000, José Fonseca wrote:

 > But the function I put in the table _was_ an ordinary function, and
 > not a C++ method, and no redirection call would take place with
 > inlining. That was the point of the explanation...

Uhm... how do you inline a function call that's going over a function pointer? Specifically, ask yourself what's involved in inlining a function.

 > And yes, I do believe that, with care, C++ won't create a noticeable
 > overhead, but there's no point to discuss it, as we can simply
 > benchmark it at the end.

There's probably a way to make this fly, but the way you have shown until now isn't it. You could for example have a dispatch _object_, that is, you end up calling:

 dispatch->BlendFunc(...);

and you can switch the dispatch object. You are still calling the method through a pointer. You could probably fix that with some clever use of function templates but I don't have a clear picture of how software fallbacks could (efficiently) work in that case.

Marcelo
|
From: Keith W. <ke...@tu...> - 2003-03-11 00:27:42
|
Marcelo E. Magallon wrote:
> On Mon, Mar 10, 2003 at 11:08:23PM +0000, José Fonseca wrote:
>
> > But the function I put in the table _was_ an ordinary function, and
> > not a C++ method, and no redirection call would take place with
> > inlining. That was the point of the explanation...
>
> Uhm... how do you inline a function call that's going over a function
> pointer? Specifically, ask yourself what's involved in inlining a
> function.
>
> > And yes, I do believe that, with care, C++ won't create a noticeable
> > overhead, but there's no point to discuss it, as we can simply
> > benchmark it at the end.
>
> There's probably a way to make this fly, but the way you have shown
> until now isn't it. You could for example have a dispatch _object_,
> that is, you end up calling:
>
> dispatch->BlendFunc(...);
>
> and you can switch the dispatch object. You are still calling the
> method through a pointer. You could probably fix that with some clever
> use of function templates but I don't have a clear picture of how software
> fallbacks could (efficiently) work in that case.

libGL.so provides a dispatch table that can be efficiently switched. The real 'gl' entrypoints basically just look up an offset in this table and jump to it. No new arguments, no new stack frame, nada -- just an extremely efficient jump. Note that this is the library entrypoint, so we can't ask the caller to use a function pointer instead:

    00041930 <glBlendFunc>:
       41930: a1 00 00 00 00       mov    0x0,%eax
       41935: ff a0 c4 03 00 00    jmp    *0x3c4(%eax)

Unfortunately, that version isn't threadsafe, but Gareth is relentlessly pursuing an efficient threadsafe equivalent.

Given that this mechanism is in place already, it makes sense to use it. Which also means that there isn't much need for virtual methods in deciding which version of BlendFunc (or more relevantly, Vertex3f) to use -- you've already got a virtualization mechanism right here.
At a lower level, you might need to virtualize again. Currently mesa and the drivers handle that with C function pointers, but classes with virtual methods could be substituted.

Keith
|
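A minimal C++ sketch of the table-switching idea described in the message above, assuming a single global "current table" pointer. Every name here (DispatchTable, exec_table, save_table, entry_Vertex3f, set_dispatch) is hypothetical; this is not the libGL implementation, which does the jump in assembly. Like the non-threadsafe version shown above, this sketch uses one process-wide pointer and is not safe across threads.

```cpp
// Entry type: note there is no context argument, as in the GL API.
typedef void (*Vertex3fProc)(float x, float y, float z);

struct DispatchTable {
    Vertex3fProc Vertex3f;
};

// Counters used only to make the two implementations observable.
int executed = 0;
int compiled = 0;

// Two implementations of the same entry: immediate execution and
// display-list style recording (both reduced to counters here).
void exec_Vertex3f(float, float, float) { ++executed; }
void save_Vertex3f(float, float, float) { ++compiled; }

const DispatchTable exec_table = { exec_Vertex3f };
const DispatchTable save_table = { save_Vertex3f };

// The one pointer that is swapped to change behaviour wholesale.
const DispatchTable *current_dispatch = &exec_table;

// Public entry point: a single indirect call through the current table.
void entry_Vertex3f(float x, float y, float z)
{
    current_dispatch->Vertex3f(x, y, z);
}

// Switching behaviour is one pointer assignment; callers never change.
void set_dispatch(const DispatchTable *table)
{
    current_dispatch = table;
}
```

Because the caller always goes through entry_Vertex3f, swapping the table changes which implementation runs without any virtual method on the calling side.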
From: Marcelo E. M. <mma...@de...> - 2003-03-11 00:47:18
|
On Tue, Mar 11, 2003 at 12:27:40AM +0000, Keith Whitwell wrote:

 > libGL.so provides a dispatch table that can be efficiently switched.

Yep, I've read how that's implemented in Mesa. It's really nice once you figure out what's going on (the code of course looks a lot more terrifying than your explanation). I was confused about José's intention (probably because I read the files he mentioned). In fact, after reading the rest of your mail, I'm still confused... I think...

 > The real 'gl' entrypoints basically just look up an offset in this
 > table and jump to it. No new arguments, no new stack frame, nada --
 > just an extremely efficient jump. Note that this is the library
 > entrypoint, so we can't ask the caller to use a function pointer
 > instead:

yes, that's nice and I agree that it should be reused wherever possible.

 > unfortunately, that version isn't threadsafe, but Gareth is
 > relentlessly pursuing an efficient threadsafe equivalent.

because there's nothing preventing one thread from writing over the same memory that another thread is accessing? I fail to see how that relates to the way the functions are called.

 > Given that this mechanism is in place already, it makes sense to use
 > it. Which also means that there isn't much need for virtual methods
 > in deciding which version of BlendFunc (or more relevantly, Vertex3f)
 > to use -- you've already got a virtualization mechanism right here.

Sure. Haha! Now I really understand what you meant before by updating the dispatch tables. (Previously I just thought I had understood it, and was left wondering why it was so special.)

Marcelo
|