RE: [Open64-devel] How to let gfec to work like this?
From: Chan, S. C <sun...@in...> - 2004-01-05 17:46:41
By the same token, after some phase in the middle of the backend, before CG, one could generate the alleged __I4_abs_sub_acc. The point is, there are plenty of opportunities between FE and CG. The gcc of the world thinks of optimizations in black and white: if not done in CG, then FE.

Sun

-----Original Message-----
From: Barak Zalstein [mailto:Bar...@ce...]
Sent: Thursday, January 01, 2004 3:07 AM
To: Chan, Sun C; Chen, William; ope...@li...
Subject: RE: [Open64-devel] How to let gfec to work like this?

One point for such form:
Even though there are plenty of opportunities to automatically recognize CISC-like opcodes in later phases (combine, peephole, etc.), some of the opportunities could be missed if not done early in the front-end.
For example, if there was one instruction that has the functionality of

    *costptr += ABS( newx - new_mean ) - ABS( oldx - old_mean ) ;

the front-end could detect such a pattern and generate a .B file that contains:

    *costptr = __I4_abs_sub_accumulate (*costptr, newx, new_mean, oldx, old_mean); // if result is not equivalent to kid0, copy kid0 to result first.

The drawback in generating this early is that it might be more difficult to recognize redundant code coming in/out of such an intrinsic.

OTOH, detecting the same form in the CG:

    tmp = * costptr;
    tmp += ABS( newx - new_mean ) ; // converted to if-then-else blocks or predicated form
    tmp += ABS( oldx - old_mean ) ;
    *costptr=tmp

might not always work properly, because tmp could have been replaced with multiple TNs, or some other reason (scheduling) could have caused an EBO_tn_available check to fail.

Barak

-----Original Message-----
From: Chan, Sun C [mailto:sun...@in...]
Sent: Tuesday, December 30, 2003 10:33 PM
To: Barak Zalstein; Chen, William; ope...@li...
Subject: RE: [Open64-devel] How to let gfec to work like this?
I am surprised that after preopt (or mainopt), copy prop does not give you the expression

    *costptr = *costptr + ABS(newx-new_mean)-ABS(oldx-old_mean).

Do you really need the form you indicated:

    Tmp = abs(....)
    *costptr = *costptr + tmp

What is the point for this form?

Sun

-----Original Message-----
From: ope...@li... [mailto:ope...@li...] On Behalf Of Barak Zalstein
Sent: Monday, December 22, 2003 2:26 AM
To: Chen, William; ope...@li...
Subject: RE: [Open64-devel] How to let gfec to work like this?

I did not check this with current ORC, but this should be quite similar:
The code is generated because the tree that contains indirect_ref to parm_decl costptr (representing *costptr) is both arg 0 of the <stmt> parameter when expand_stmt_with_iterators_1 is called (with a modify_expr tree) and, after a few recursive calls to expand_expr, expand_assignment and store_expr, is found again (thus yielding the "tmp = *costptr" code).
One way to change this behavior would be to modify the above function(s) to discover that the same tree appears on both sides of an expression and simplify it so that assign_temp and store_expr are done in a different location. (Maybe yyparse is a better location to modify the behavior of "+=" operations, but I'm not sure.)

Barak.

-----Original Message-----
From: ope...@li... [mailto:ope...@li...] On Behalf Of Chen, William
Sent: Monday, December 22, 2003 5:53 AM
To: ope...@li...
Subject: [Open64-devel] How to let gfec to work like this?

Hi all,

I need gfec (front end of orc) to generate code in pv26.c

    *costptr += ABS( newx - new_mean ) - ABS( oldx - old_mean ) ;

to be

    tmp = ABS( newx - new_mean ) - ABS( oldx - old_mean ) ;
    * costptr += tmp;

instead of current result

    tmp = * costptr;
    tmp += ABS( newx - new_mean ) ;
    tmp += ABS( oldx - old_mean ) ;
    *costptr=tmp

Here, ABS is a macro

    #define ABS(value) ( (value)>=0 ? (value) : -(value) )

How to let gfec to work like this?

Thanks,
William
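
For readers comparing the forms discussed in this thread, here is a minimal C sketch that places them side by side. It rests on several assumptions: the operands are plain int (the I4 prefix suggests a 4-byte integer, but the thread does not say so); the function names update_source, update_requested, update_load_first, abs_sub_accumulate_ref and check_forms_agree are hypothetical and appear nowhere in ORC or gfec; the load-first variant subtracts the second ABS term, as the source statement implies; and abs_sub_accumulate_ref is only a guess at the semantics of the proposed __I4_abs_sub_accumulate intrinsic, inferred from its call form.

```c
#include <assert.h>

/* ABS macro as quoted in the thread. */
#define ABS(value) ( (value) >= 0 ? (value) : -(value) )

/* The statement from pv26.c as the programmer wrote it. */
static void update_source(int *costptr, int newx, int new_mean,
                          int oldx, int old_mean)
{
    *costptr += ABS(newx - new_mean) - ABS(oldx - old_mean);
}

/* The form William asks for: compute the right-hand side into a
 * temporary, then do a single read-modify-write of *costptr. */
static void update_requested(int *costptr, int newx, int new_mean,
                             int oldx, int old_mean)
{
    int tmp = ABS(newx - new_mean) - ABS(oldx - old_mean);
    *costptr += tmp;
}

/* The load-first form: *costptr is read into the temporary before the
 * ABS terms are accumulated. Assumption: the second term is subtracted,
 * matching the source statement. */
static void update_load_first(int *costptr, int newx, int new_mean,
                              int oldx, int old_mean)
{
    int tmp = *costptr;
    tmp += ABS(newx - new_mean);
    tmp -= ABS(oldx - old_mean);
    *costptr = tmp;
}

/* Hypothetical reference semantics for the proposed intrinsic:
 * result = acc + |newx - new_mean| - |oldx - old_mean|. */
static int abs_sub_accumulate_ref(int acc, int newx, int new_mean,
                                  int oldx, int old_mean)
{
    return acc + ABS(newx - new_mean) - ABS(oldx - old_mean);
}

/* Checks that all forms produce the same value for one set of inputs. */
static void check_forms_agree(int cost, int newx, int new_mean,
                              int oldx, int old_mean)
{
    int a = cost, b = cost, c = cost;

    update_source(&a, newx, new_mean, oldx, old_mean);
    update_requested(&b, newx, new_mean, oldx, old_mean);
    update_load_first(&c, newx, new_mean, oldx, old_mean);

    assert(a == b);
    assert(b == c);
    assert(c == abs_sub_accumulate_ref(cost, newx, new_mean, oldx, old_mean));
}
```

Calling check_forms_agree(10, 7, 3, 2, 9) passes, with every form leaving the cost at 10 + 4 - 7 = 7. The forms differ only in where the load of *costptr and the temporary appear, which is the property that matters for pattern recognition in later phases.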