From: Vladimir D. <vo...@mi...> - 2005-05-19 04:58:11
|
Hi Aapo, Ben, Jerome, Nicolai: I recently checked out fresh code from CVS and was pleasantly surprised to see that all the Quake3 levels that were broken are now perfect - in fact I cannot find anything that is amiss! Do you think it would be a good idea to tag the current code and make a snapshot? Thank you! Vladimir Dergachev |
From: Keith W. <ke...@tu...> - 2005-05-19 07:20:51
|
Vladimir Dergachev wrote: > > Hi Aapo, Ben, Jerome, Nicolai: > > I recently checked fresh code from CVS and was pleasantly surprised > to see that all Quake3 levels that were broken are now perfect - in fact > I cannot find anything that is amiss ! > > Do you think it would be a good idea to tag the current code and make > a snapshot ? So have you guys given any consideration to moving the r300 driver into mesa proper? CVS access shouldn't be a problem, fwiw... Keith |
From: Jerome G. <j.g...@gm...> - 2005-05-19 08:00:17
|
On 5/19/05, Keith Whitwell <ke...@tu...> wrote: > Vladimir Dergachev wrote: > > > > Hi Aapo, Ben, Jerome, Nicolai: > > > > I recently checked fresh code from CVS and was pleasantly surprised > > to see that all Quake3 levels that were broken are now perfect - in fact > > I cannot find anything that is amiss ! > > > > Do you think it would be a good idea to tag the current code and make > > a snapshot ? Why not :) > So have you guys given any consideration to moving the r300 driver into > mesa proper? CVS access shouldn't be a problem, fwiw... I think too few of us have access to Mesa CVS (at least I don't have it); anyway, I could ask for it. But there are still missing parts. I would like to know if anyone knows what is still not working, so that we can make a to-do list... What I see missing (I may not see everything :-): -deeper testing of tcl programs generated with Mesa -tex env -fragment program Does ... work? -z offset -stencil Right now I am working on the pixel shader. After doing some tests, I don't think we can use something similar to the i915 emit arith; i915 hardware is far easier to program than r300. I am coding another approach and hope to have it done by the end of this week. Moreover, I see that 9800 cards are reported to crash with the driver? Is this still true? Jerome Glisse |
From: Ben S. <dar...@ii...> - 2005-05-19 13:53:44
Attachments:
fragprog.diff
|
Jerome Glisse wrote: >On 5/19/05, Keith Whitwell <ke...@tu...> wrote: > > >>Vladimir Dergachev wrote: >> >> >>>Hi Aapo, Ben, Jerome, Nicolai: >>> >>> I recently checked fresh code from CVS and was pleasantly surprised >>>to see that all Quake3 levels that were broken are now perfect - in fact >>>I cannot find anything that is amiss ! >>> >>> Do you think it would be a good idea to tag the current code and make >>>a snapshot ? >>> >>> > > > Sounds like a good idea. >Why not :) > > > >>So have you guys given any consideration to moving the r300 driver into >>mesa proper? CVS access shouldn't be a problem, fwiw... >> >> > >I think to few of us have an access to mesa cvs (at leat i didn't have one), >anyway i could ask one. But there is still missing parts. I would like to >know if anyone know what is still not working and thus do a to do list >of that... > >What i see missing is : (i may not see everythings :-) > >-deeper testing of tcl program generated with mesa >-tex env >-fragment program > > Also, I think there's still some weirdness with a couple of texture formats, namely GL_ALPHA and GL_LUMINANCE_ALPHA. This is clearly seen in Mesa/progs/demos/texenv.c. Someone, I believe it was Aapo, said that they see white lines across the screen when the framerate is fairly high. I didn't see this up until yesterday when I had to change from my 9600pro to a 9600XT (I killed the card moving it between machines somehow). >Does ... work ? >-z offset >-stencil > >Right now i am on pixel shader after doing some test i don't think we can >use a similar stuff like i915 emit arithm, i915 hardware are far more easier >to program than r300. I am coding another approach an hope to have it >done by the end of this week. > > I've also been working on some fragment program stuff. I have attached what I've done so far, which works quite well with Keith's texenv program generation that's in Mesa cvs. 
Not all arb_f_p opcodes are implemented, but I think everything's there that the texenv stuff needs. I was planning on committing this soon, but you may have a better approach than I took so I'll wait a bit. Ben Skeggs. >Moreover i see that 9800 are reported to crash with the driver ? Is this >still true ? > >Jerome Glisse > > >------------------------------------------------------- >This SF.Net email is sponsored by Oracle Space Sweepstakes >Want to be the first software developer in space? >Enter now for the Oracle Space Sweepstakes! >http://ads.osdn.com/?ad_idt12&alloc_id344&op=click >-- >_______________________________________________ >Dri-devel mailing list >Dri...@li... >https://lists.sourceforge.net/lists/listinfo/dri-devel > > > > |
From: Jerome G. <j.g...@gm...> - 2005-05-19 15:26:26
|
> I've also been working on some fragment program stuff. I have attached what > I've done so far, which works quite well with Keith's texenv program > generation > that's in Mesa cvs. Not all arb_f_p opcodes are implemented, but I > think everything's > there that the texenv stuff needs. > > I was planning on commiting this soon, but you may have a better > approach than I > took so I'll wait a bit. I tried a similar approach at first, but once swizzles appear it looks awful. I haven't taken a deeper look at the way you handle it, but you have a bunch of if tests like I had in my first version. You have certainly done a cleaner version than my first attempt, though. Anyway, what I have now is the stupid way: a table for all swizzle cases, just for the x, y, z components, as w is in a separate and easier register. So I have two tables: one for the xyz to FPI0_SRC conversion, which has a special value for non-native formats; this special value is an index into the second table, where I have the code for all the different cases (64 cases less the 7 native formats). This consumes few bytes, about 1 KB... So we may either use this table or use your swizzle function. Table lookups are faster, but we don't have to translate code often, no? Anyway, the other part of the code is basically the same, and yours is in a more advanced state than mine, so we should use yours... Does anyone have strong feelings on this swizzle thing? Or even a magic idea with a magic bitwise formula :) One other interesting feature is the reuse of temp registers. At some point I was thinking of adding a free_temp function and having a stack of free registers. Then get_temp first takes from the stack if it is not empty, or uses a new temp if it is empty. This get/free approach is useful with swizzling, where you know that you will use a temp register for only one instruction. But there may be cases where you reuse the same swizzled source. 
I am wondering whether your emit_const_swizzle might consume all the constants if, say, you have a const which is used in all of the 64 possible swizzles... I may have missed something on that, though. But if we know that we will only use the const in a swizzled form, then your solution is the best. I have other ideas on optimizing this swizzling in program translation (basically tracking operands). But these are maybe useless optimizations for the time being... Jerome Glisse |
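The get/free scheme Jerome describes above (take from a stack of freed temps first, otherwise hand out a fresh one) could be sketched roughly as follows. This is only an illustration: `struct temp_stack`, `stack_get_temp`, `stack_free_temp` and the limit of 32 temps are hypothetical names and values, not code from either patch.

```c
#include <assert.h>

#define MAX_TEMPS 32 /* assumed register count, for illustration only */

/* Hypothetical allocator state; not taken from the r300 driver. */
struct temp_stack {
    int free_list[MAX_TEMPS]; /* indices of temps that were handed back */
    int free_count;           /* number of valid entries in free_list */
    int next_new;             /* lowest temp index never handed out */
};

/* Reuse the most recently freed temp if there is one, else take a new one.
 * Returns -1 when every temp is in use. */
static int stack_get_temp(struct temp_stack *s)
{
    if (s->free_count > 0)
        return s->free_list[--s->free_count];
    if (s->next_new < MAX_TEMPS)
        return s->next_new++;
    return -1;
}

/* Hand a temp back so that a later stack_get_temp() can reuse it. */
static void stack_free_temp(struct temp_stack *s, int idx)
{
    assert(idx >= 0 && idx < s->next_new);
    assert(s->free_count < MAX_TEMPS);
    s->free_list[s->free_count++] = idx;
}
```

Finding a free temp is a single test of `free_count`, which is the advantage claimed for the stack; the cost is that nothing here detects a temp being freed twice.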
From: Ben S. <dar...@ii...> - 2005-05-19 15:49:18
|
Jerome Glisse wrote: >>I've also been working on some fragment program stuff. I have attached what >>I've done so far, which works quite well with Keith's texenv program >>generation >>that's in Mesa cvs. Not all arb_f_p opcodes are implemented, but I >>think everything's >>there that the texenv stuff needs. >> >>I was planning on commiting this soon, but you may have a better >>approach than I >>took so I'll wait a bit. >> >> > >I have tried a similar approach at first but as swizzle appear it >looks awful, i haven't take a deeper look at the way you handle it >but you have a bunch of if test like i got in my first version. You >certainly have done cleaner version than my first attempt thought. > >Anyway now what i have is the stupid way of having a table >for all swizzle case, just for the x,y,z component as w is in >sep an easier reg. Thus i got two tab one for xyz to FPI0_SRC >conv which have a special value for not native format, this >special value in an index in the second tab where i got all the >code for all differents cases (64 cases less 7 of native format). >This consumes few bytes about 1ko... > > I use a similar approach. v_swiz contains all the native r300 swizzle values, as well as a couple of cases where we have to handle them specially. The non-native cases have v_swiz->native set to GL_FALSE. What I do is first loop through, checking for matches of XYZ. If we match all three components, the code will return a pfs_reg_t with v_swz/s_swz set to a native type. Otherwise, we continue looping through the different outmasks doing multiple emits to another temp until we have the desired value. >Thus what we may do is use this table or use your swizzle >function. Table lookup are faster but we doesn't have to translate >code often, no ? > > The code is only translated once. In the case of the texenv stuff, whenever it needs to be regenerated Mesa will call r300ProgramStringNotify to tell us that the program has changed. 
>Anyway other part of the code is basicly the same, yours is >in a more advanced state than mine thus we should use yours... > >Anyone have strong feeling on this swizzle thing ? >Or even an magic idea with a magic bitwise formula :) > > > >One other interesting feature is reuse of temp register at some >point i was thinking of adding a free_temp function and having a >stack of free register. >Then get_temp first take from the stack if not empty, or use a >new temp if empty. > >This get/free approach is usefull with swizzling where you know >that you will use a temps register only for 1 instruction. But there >may be case where you reuse the same swizzled src. > > I once had this in the code, but ended up stripping it down to iron out some bugs. Now that I know that it mostly works correctly, I should be able to add this back in easily enough. Rather than using a stack, I'm using a bitfield to track register allocations. Freeing a register is as easy as setting the bit to 0. Care must be taken not to cause texture indirections by reusing an already used temp as the destination for a TEX instruction (that's what rp->used_this_node is/was for) >I am wondering if you emit_const_swizzle may not consume all >const if say you have an const which is used in all of the 64 >possibles swizzle... I may have miss something on that thought. > >But if we know that we will only use the const in a swizzled format >then your solution is the best. > > No, you haven't missed anything. This was another case where I quickly hacked something up. Constant swizzles should/could be handled exactly the same as temp swizzles. In the swizzling code you just have to be careful that reg->type is set correctly depending on whether or not it's a native swizzle. >I got other idea on optimizing this swizzling in program translation >(basicly tracking operand). But this are maybe useless optimization for >time being... > > I've thought of similar things, but haven't attempted anything yet. 
I also need to take another look at the generated code, to make sure that combining the xyz and w instruction streams is working as I intended. When I first integrated my test code into r300_dri it was, but I've changed much since then. Cheers, Ben Skeggs. >Jerome Glisse |
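Ben's bitfield alternative can be sketched in a similar way. The second mask below is one possible reading of what `rp->used_this_node` is for: a TEX destination avoids any temp already touched in the current node. All names (`struct temp_bits`, `bits_get_temp`, `bits_free_temp`, `bits_begin_node`) and the count of 32 temps are assumptions made for this example, not the patch's code.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_TEMPS 32 /* assumed register count, for illustration only */

/* Hypothetical allocator state; not taken from the r300 driver. */
struct temp_bits {
    uint32_t in_use;         /* bit i set: temp i is currently allocated */
    uint32_t used_this_node; /* bit i set: temp i was handed out in this node */
};

/* Return the lowest usable temp, or -1 if none is left.
 * With for_tex set, temps already used in this node are skipped as well,
 * so a TEX never writes to a temp that was touched earlier in the node. */
static int bits_get_temp(struct temp_bits *a, int for_tex)
{
    uint32_t busy = for_tex ? (a->in_use | a->used_this_node) : a->in_use;

    for (int i = 0; i < NUM_TEMPS; i++) {
        uint32_t bit = UINT32_C(1) << i;
        if (!(busy & bit)) {
            a->in_use |= bit;
            a->used_this_node |= bit;
            return i;
        }
    }
    return -1;
}

/* Freeing is just clearing the bit. */
static void bits_free_temp(struct temp_bits *a, int idx)
{
    assert(idx >= 0 && idx < NUM_TEMPS);
    a->in_use &= ~(UINT32_C(1) << idx);
}

/* Call when a new node starts: earlier use no longer restricts TEX. */
static void bits_begin_node(struct temp_bits *a)
{
    a->used_this_node = 0;
}
```

Compared with the stack, allocation needs a scan over the bits, but a double free is harmless and the per-node restriction is a single extra mask.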
From: Jerome G. <j.g...@gm...> - 2005-05-19 16:57:42
|
> I use a similar approach. v_swiz contains all the native r300 swizzle > values, > aswell as a couple of cases where we have to handle them specially. The > non-native cases have v_swiz->native set to GL_FALSE. I saw that in the code, but you still have a loop and test cases. Your approach is well thought out, but I think it may be hard to understand if, say, you look at it again in 2 or 3 months. So the only argument in favor of the simple two-table translation I propose is that it is the easiest thing to understand; moreover, you can easily optimize some swizzle cases and not bother too much with the others... But apart from the readability of your mixed approach, I don't have a strong opinion on which solution is best... > The code is only translated once. In the case of the texenv stuff, whenever > it needs to be regenerated Mesa will call r300ProgramStringNotify to tell > us that the program has changed. This is why I think table lookup speed isn't really relevant in choosing. I will try to adapt my swizzle function to your code (it shouldn't be difficult) so you can see it. Basically, arg checking in emit arith looks like this:
id = reg & MASK_XYZCHANNEL
reg_fpi0_mask = tab1[id];
if (reg_fpi0_mask ^ ffe0 ) {
    swizzle -> copy tab2[id>>5] -> r300_instruct
    t = get_temp
    for i<tab2[id>>5].length
        r300_instr[p-i] |= t
}
Just a memcpy of instructions and a small loop to set the correct allocated temp reg. > I once had this in the code, but ended up stripping it down to iron out some > bugs. Now that I know that it mostly works correctly, I should be able > to add > this back in easily enough. Rather than using a stack, I'm using a > bitfield to > track register allocations. Freeing a register is as easy as setting > the bit to 0. What is in favor of a stack is that you easily find a free temp (just test whether the stack is empty), while with a bitfield you have to loop through each temp to see if its bit is set. 
>Care must be taken not to cause texture indirections by reusing an > already > used temp as the destination for a TEX instruction (that's what > rp->used_this_node > is/was for) Yes, I saw this possible issue but didn't think too much about a solution to handle it. I will take a deeper look at your code this evening. > No, you haven't missed anything. This was another case where I quickly > hacked > something up. Constant swizzles should/could be handled exactly the same as > temp swizzles. In the swizzling code you just have to be careful that > reg->type is > set correctly depending on whether or not it's a native swizzle. I was thinking of tracking constants to see if they are always used swizzled. If so, it is easier to emit the const swizzled like you do. As I said, this is maybe too advanced and complex an optimization, which may involve complex tracking of operands in the program. Jerome Glisse |
From: Ben S. <dar...@ii...> - 2005-05-19 17:28:54
Attachments:
fragprog2.diff
|
Jerome Glisse wrote: >>I use a similar approach. v_swiz contains all the native r300 swizzle >>values, >>aswell as a couple of cases where we have to handle them specially. The >>non-native cases have v_swiz->native set to GL_FALSE. >> >> > >I saw that in code but you still got a loop and test case, your >approach is well thinked. But i think it may be hard to understand >say if you look at it in 2 or 3 month. > >Thus the only + arguments for a simple 2tab translation i >propose is that it the easiest thing to understand, moreover >you can easily optimize some swizzling case and don't bother >too much on other... But beside the understanding >of your mixed approach haven't a strong opinion on which solution is the >best... > > > >>The code is only translated once. In the case of the texenv stuff, whenever >>it needs to be regenerated Mesa will call r300ProgramStringNotify to tell >>us that the program has changed. >> >> > >This why i think tab lookup speed isn't really revealent in >selecting this. I will try to adapt my swizzle function >to your code (shouldn't be difficult) thus you can see it. >Bascily arg checking in emit arith look like this : > >id = reg & MASK_XYZCHANNEL >reg_fpi0_mask = tab1[id]; >if (reg_fpi0_mask ^ ffe0 ) { > swizzle -> copy tab2[id>>5] -> r300_instruct > t = get_temp > for i<tab2[id>>5].length > r300_instr[p-i] |= t > } >Just a memcpy of instruction and a small loop >to set the correct temp reg allocated. > > > Well, I'm in mixed minds about this myself now. But if you have a way to gain extra speed out of this without consuming a heap of RAM, I'm all for it :) Extra speed could be useful, as some programs may have a lot of swizzles per-frame (UT2004 has a few when MaxTextureUnits is set to 8 in ut2004.ini) and every extra bit helps. 
> > >>Care must be taken not to cause texture indirections by reusing an >>already >>used temp as the destination for a TEX instruction (that's what >>rp->used_this_node >>is/was for) >> >> > >Yes, i saw this possible issue but didn't think too much on >solution to handle it, i will give a deeper look to your code >this evening. > > > I've attached another patch (applies on top of the last one) which /should/ take care of this, and handle the case where a program does a TEX directly to an output. It also frees up temps in the cases where we need one for a swizzle, or an LRP. What I'd like to be able to do is re-use temps used by the Mesa program, and also to free up the hardware temp that a Mesa INPUT uses once it's no longer needed. We'd need to find out when the temp/input was last used so we didn't destroy it prematurely. Should be easy enough to do, as my code already pre-parses the Mesa program. I originally considered this a thing I needed to fix. But it could be useful for some things. >>No, you haven't missed anything. This was another case where I quickly >>hacked >>something up. Constant swizzles should/could be handled exactly the same as >>temp swizzles. In the swizzling code you just have to be careful that >>reg->type is >>set correctly depending on whether or not it's a native swizzle. >> >> > >I was thinking of tracking constant and see if they are always used >swizzle. If so easier to emit the const swizzled like you do. >As i said this is maybe a to advanced and complex optimization which >may involve complex tracking of operand in the program. > > You could possibly do this for some cases right now. In t_src you'd need to skip the emit_const4fv for constant sources if they're swizzles, and call swizzle_const instead. This would eliminate some of them. The patch I attached doesn't call swizzle_const at all; instead, it uses the same method as temps do. I didn't see anything too nasty in ut2004 from doing this. 
I didn't test the patch in great detail, so there's probably something I missed :/ Ben Skeggs. > >Jerome Glisse |
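Finding out when a temp was last used, as Ben suggests, amounts to one pre-pass over the program. A minimal sketch, assuming a much simplified instruction record (`struct instr` below is hypothetical and far smaller than Mesa's real program instruction structure):

```c
#include <assert.h>

#define MAX_TEMPS 32 /* assumed register count, for illustration only */
#define MAX_SRCS 3

/* Hypothetical, simplified instruction: temp indices only, -1 for "none". */
struct instr {
    int dst;           /* temp written by this instruction, or -1 */
    int src[MAX_SRCS]; /* temps read by this instruction, -1 if unused */
};

/* Fill last_use[t] with the index of the last instruction that reads or
 * writes temp t, or -1 if the program never references it. After emitting
 * instruction i, every temp with last_use[t] == i can be freed. */
static void compute_last_use(const struct instr *prog, int n,
                             int last_use[MAX_TEMPS])
{
    for (int t = 0; t < MAX_TEMPS; t++)
        last_use[t] = -1;

    for (int i = 0; i < n; i++) {
        for (int s = 0; s < MAX_SRCS; s++) {
            int t = prog[i].src[s];
            if (t >= 0) {
                assert(t < MAX_TEMPS);
                last_use[t] = i;
            }
        }
        if (prog[i].dst >= 0) {
            assert(prog[i].dst < MAX_TEMPS);
            last_use[prog[i].dst] = i;
        }
    }
}
```

The same table would work for inputs if they were given their own index range; that extension is left out here.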
From: Boris P. <bo...@za...> - 2005-05-19 10:37:50
|
Jerome Glisse wrote: > > Moreover i see that 9800 are reported to crash with the driver ? Is this > still true ? > > Jerome Glisse > Yes, this is still very true. I've just rebuilt xorg, mesa and r300 from cvs. I tested with glxgears and a couple of games. I've got a 9800 pro. glxgears running alone doesn't crash for a long time, but using anything else in parallel (for example firefox) gives me a crash in seconds. Playing neverball or tuxracer doesn't seem to crash at all. armagetron on the other hand has a high probability of crashing, and ut2004demo crashes in no more than 10 seconds. By crash I mean that X, the mouse and the keyboard freeze. Maybe it has something to do with frame rates? Boris Peterbarg |
From: Jerome G. <j.g...@gm...> - 2005-05-19 11:06:19
|
On 5/19/05, Boris Peterbarg <bo...@za...> wrote: > Yes, this is still very true. I've just rebuilt xorg, mesa and r300 from > cvs. I tested with glxgears and a couple of games. I've got a 9800 pro. > glxgears running alone doesn't crash for a long time, but using anything > else in parallel (for example firefox) gives me a crash in seconds. > Playing neverball or tuxracer doesn't seem to crash at all. > armagetron on the other hand has a high possibility to crash, and > ut2004demo crashes in no more than 10 seconds. > In crash I mean that X, the mouse and the keyboard freeze. Did you try to log in via ssh? If you can, check whether the X process consumes 100% of the CPU; if so, this may be related to the r200 bug, as we share a lot of common code. > Maybe it has something to do with frame rates? I have a 9800 (I don't remember if it's a pro or not); I will try to see if I can find anything. Jerome Glisse |
From: Jonathan Bastien-F. <jo...@da...> - 2005-05-19 13:07:39
Attachments:
signature.asc
|
Jerome Glisse wrote: >Right now i am on pixel shader after doing some test i don't think we can >use a similar stuff like i915 emit arithm, i915 hardware are far more easier >to program than r300. I am coding another approach an hope to have it >done by the end of this week. > >Moreover i see that 9800 are reported to crash with the driver ? Is this >still true ? > > I will confirm this tomorrow, right now, I am busy with other stuff. Cheers, Jonathan |
From: Nicolai H. <pre...@gm...> - 2005-05-19 14:06:20
|
On Thursday 19 May 2005 09:20, Keith Whitwell wrote: > Vladimir Dergachev wrote: > > > > Hi Aapo, Ben, Jerome, Nicolai: > > > > I recently checked fresh code from CVS and was pleasantly surprised > > to see that all Quake3 levels that were broken are now perfect - in fact > > I cannot find anything that is amiss ! > > > > Do you think it would be a good idea to tag the current code and make > > a snapshot ? Sure, anytime :) > So have you guys given any consideration to moving the r300 driver into > mesa proper? CVS access shouldn't be a problem, fwiw... There are two main points that have stopped me from pushing for the inclusion of the driver into Mesa proper: 1. Kernel-level security holes We should take care of full command-stream verification before moving the driver into Mesa CVS. It's easy to say "We can do that later", but if we say that it's likely that it won't be done in a long time. 2. DRM binary compatibility We still don't know the meaning of many of the registers. Some registers are labelled "dangerous" which means we might have to do some more checks in the kernel to make sure user processes can't do harmful stuff. This means that we might have to *remove* some of the cmdbuf commands that exist today in the future. If the others believe moving r300 to Mesa is a good idea, then I'll do some auditing to the DRM code. Once I (or somebody else) has done this, I'm okay with moving the driver as long as we don't enforce DRM binary compatibility yet. cu, Nicolai |
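The command-stream verification Nicolai asks for could, in its simplest form, be a whitelist check applied to every register write a user process submits. The sketch below only illustrates that idea: the ranges are placeholder numbers, NOT real r300 register offsets, and `reg_is_safe` and `verify_reg_write` are hypothetical names rather than functions from the DRM code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Inclusive range of register byte offsets that user space may write. */
struct reg_range {
    uint32_t first;
    uint32_t last;
};

/* Placeholder values for illustration only, not real r300 registers. */
static const struct reg_range safe_ranges[] = {
    { 0x1000, 0x10fc },
    { 0x2000, 0x2004 },
};

static int reg_is_safe(uint32_t reg)
{
    for (size_t i = 0; i < sizeof safe_ranges / sizeof safe_ranges[0]; i++)
        if (reg >= safe_ranges[i].first && reg <= safe_ranges[i].last)
            return 1;
    return 0;
}

/* Check a write of `count` consecutive 32-bit registers starting at `reg`.
 * Rejects empty writes, unaligned offsets, address overflow, and any
 * register that is not whitelisted. Returns 1 if the write is allowed. */
static int verify_reg_write(uint32_t reg, uint32_t count)
{
    if (count == 0 || (reg & 3) != 0)
        return 0;
    if (count > (UINT32_MAX - reg) / 4)
        return 0; /* reg + 4 * count would wrap around */
    for (uint32_t i = 0; i < count; i++)
        if (!reg_is_safe(reg + 4 * i))
            return 0;
    return 1;
}
```

Registers whose meaning is still unknown would simply stay off the list, which is also why commands accepted today might later have to be removed.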
From: Alex D. <ale...@gm...> - 2005-05-19 16:15:19
|
On 5/19/05, Nicolai Haehnle <pre...@gm...> wrote: > On Thursday 19 May 2005 09:20, Keith Whitwell wrote: > > Vladimir Dergachev wrote: > > > > > > Hi Aapo, Ben, Jerome, Nicolai: > > > > > > I recently checked fresh code from CVS and was pleasantly surprised > > > to see that all Quake3 levels that were broken are now perfect - in fact > > > I cannot find anything that is amiss ! > > > > > > Do you think it would be a good idea to tag the current code and make > > > a snapshot ? > > Sure, anytime :) > > > So have you guys given any consideration to moving the r300 driver into > > mesa proper? CVS access shouldn't be a problem, fwiw... > > There are two main points that have stopped me from pushing for the > inclusion of the driver into Mesa proper: > > 1. Kernel-level security holes > We should take care of full command-stream verification before moving the > driver into Mesa CVS. It's easy to say "We can do that later", but if we > say that it's likely that it won't be done in a long time. > > 2. DRM binary compatibility > We still don't know the meaning of many of the registers. Some registers are > labelled "dangerous" which means we might have to do some more checks in > the kernel to make sure user processes can't do harmful stuff. This means > that we might have to *remove* some of the cmdbuf commands that exist today > in the future. > > If the others believe moving r300 to Mesa is a good idea, then I'll do some > auditing to the DRM code. Once I (or somebody else) has done this, I'm okay > with moving the driver as long as we don't enforce DRM binary compatibility > yet. > > cu, > Nicolai > True, the DRM may need to live on its own branch for a bit. But I think the 3D driver could be added to Mesa. It can live in CVS until we feel it is ready to be part of a stable release. 
Being part of mesa cvs means we'll always be in sync with the latest mesa changes and get nightly snapshots which will add additional testers for better or worse. Alex |
From: Jerome G. <j.g...@gm...> - 2005-05-19 14:35:29
|
On 5/19/05, Boris Peterbarg <bo...@za...> wrote: > Well, I tried now - I have an old computer...well, the box anyway. Had > to switch all the cables between the two computers all the time. > First, the program that I ran consumed 100%. After killing it, X > consumed 100% and took about 15 minutes to be killed. The last image was > still stuck on the display, though. So I think that we face the same problem as r200 :( It seems a hard bug to find.... Jerome Glisse |
From: Vladimir D. <vo...@mi...> - 2005-05-19 15:49:40
|
On Thu, 19 May 2005, Jerome Glisse wrote: > > Thus what we may do is use this table or use your swizzle > function. Table lookup are faster but we doesn't have to translate > code often, no ? An intermediate approach would be to have an "if" function that is easier to read (and debug) but instead of using return values directly populate the table with the return values during startup and then use the table. This way we retain the readability and have the speedup from lookups. best Vladimir Dergachev |
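Vladimir's intermediate approach (keep a readable "if" function, but run it once per swizzle at startup to fill a table) might look like the sketch below. The classification rule is made up for the example and is NOT the actual set of native r300 swizzles; `swizzle_is_native_slow`, `init_swizzle_table`, `swizzle_is_native` and the 2-bits-per-channel encoding are likewise assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed encoding: 2 bits per channel selector, giving 64 xyz swizzles. */
enum { SWZ_X, SWZ_Y, SWZ_Z, SWZ_W };
#define SWZ_ID(x, y, z) ((x) | ((y) << 2) | ((z) << 4))

/* Readable classifier. The rule is invented for illustration:
 * identity (xyz) and single-channel replication count as "native". */
static int swizzle_is_native_slow(unsigned id)
{
    unsigned x = id & 3, y = (id >> 2) & 3, z = (id >> 4) & 3;

    if (x == SWZ_X && y == SWZ_Y && z == SWZ_Z)
        return 1;
    if (x == y && y == z)
        return 1;
    return 0;
}

static uint8_t native_tab[64];

/* Run the readable function once per swizzle at startup. */
static void init_swizzle_table(void)
{
    for (unsigned id = 0; id < 64; id++)
        native_tab[id] = (uint8_t)swizzle_is_native_slow(id);
}

/* Fast path used during translation: a plain table lookup. */
static int swizzle_is_native(unsigned id)
{
    assert(id < 64);
    return native_tab[id];
}
```

The table never has to be written by hand, so a change to the rule touches only the "if" function.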
From: Jerome G. <j.g...@gm...> - 2005-05-19 17:02:45
|
On 5/19/05, Vladimir Dergachev <vo...@mi...> wrote: > On Thu, 19 May 2005, Jerome Glisse wrote: > > > > Thus what we may do is use this table or use your swizzle > > function. Table lookup are faster but we doesn't have to translate > > code often, no ? > > An intermediate approach would be to have an "if" function that is easier > to read (and debug) but instead of using return values directly populate > the table with the return values during startup and then use the table. Did I say that I have already done the table (as a human; at least everything tends to make me think that I am a human :-))? So making a function to redo the table may be a bit useless :) Jerome Glisse |
From: Vladimir D. <vo...@mi...> - 2005-05-19 17:27:38
|
On Thu, 19 May 2005, Jerome Glisse wrote: > On 5/19/05, Vladimir Dergachev <vo...@mi...> wrote: >> On Thu, 19 May 2005, Jerome Glisse wrote: >>> >>> Thus what we may do is use this table or use your swizzle >>> function. Table lookup are faster but we doesn't have to translate >>> code often, no ? >> >> An intermediate approach would be to have an "if" function that is easier >> to read (and debug) but instead of using return values directly populate >> the table with the return values during startup and then use the table. > > Did i said that i already done the table (as a human, at least everythings > tends to make me think that i am a human :-)) Thus making a function > to redo the table may be bit useless :) The function approach might be useful if there are fewer test cases than entries in the table. In this case we win on code maintainability.. best Vladimir Dergachev > > Jerome Glisse > |