Menu

#43 WB clock background is wrong when the system is running on the G5 build

PPCJITBETA04
closed
3-low
2014-07-20
2014-06-01
No

WB clock background is wrong when the system is running on the G5 build and the JIT compiling is enabled.

Thanks to Luigi Burdo for the report.

Discussion

  • Tobias Netzel

    Tobias Netzel - 2014-06-01

    I can confirm that it's because of the mcrxr replacement.
    Started investigating but couldn't identify the problem yet.
    Is using the temporary register always safe?

     
    • Almos Rajnai

      Almos Rajnai - 2014-06-01

      Thanks for looking into this bug.

      Usually it is safe to use the special temp register (R0), because that supposed to be volatile everywhere in the code. (So, it can only be used as a local variable.)
      It is hard to tell whether this is the issue. You can try to allocate a register for the operation in the macroblock (and then it must be passed to the implementation function using the descriptor), but probably it would be better to pin-point the exact instruction which causes the trouble.

      I attached a patch which allocates the temporary register, give it a try.
      If that wouldn't work then I can commit my trace tool to the repository which helps turning on-off the compiling for individual instructions. This is very useful when I try to find the exact instruction for a bug, but this tool was never meant to be released, so no documentation and a bit cumbersome to use.

       

      Last edit: Almos Rajnai 2014-06-01
  • Almos Rajnai

    Almos Rajnai - 2014-06-01
    • Description has changed:

    Diff:

    --- old
    +++ new
    @@ -1,5 +1,3 @@
     WB clock background is wrong when the system is running on the G5 build and the JIT compiling is enabled.
    
     Thanks to Luigi Burdo for the report.
    -
    -*To be confirmed. I have no G5 machine here to test or fix this issue.*
    
     
  • Tobias Netzel

    Tobias Netzel - 2014-06-02

    Finally managed to track down the problems.
    The optimized replacement for the mcrxr-mfcr sequence suffered from a bug in the mfocrf instruction generation (it read from cr7 instead of cr0, cr6 instead of cr1 and so on) and needed to do the mfxer before the mfcocrf (the interpretive emulation's G5 optimized code was already doing it right).
    The generic mcrxr replacement (currently unused) suffered from a bug in the mtocrf instruction generation - the same one as in mfocrf. It also didn't correctly clear the XER bits 0..3 but that didn't seem to matter anyway.
    Finally in the interpretive emulation, as mfocrf only has to leave all other bits undefined but not necessarily 0, a potential fix was added - leaving nearly zero advantage over the non-G5-optimized version in that case.

     
    • Almos Rajnai

      Almos Rajnai - 2014-06-03

      Thanks, Tobias. The changes are committed now.

       
  • Almos Rajnai

    Almos Rajnai - 2014-06-03
    • status: open --> pending
    • assigned_to: Almos Rajnai
    • Milestone: PARKED --> PPCJITBETA04
     
  • MickJT

    MickJT - 2014-06-13

    Hi Tobias. I only just saw your reply to my private message just now. I used a different e-mail address when signing up here, but thought I would see your reply directly on this site. I can't find any inbox/sent page here.

    These latest changes don't compile. I see:

    cpuemu_0.c: In function ‘op_0000_0_ff’:
    cpuemu_0.c:37:2: error: expected ‘:’ or ‘)’ before string constant
    cpuemu_0.c: In function ‘op_0010_0_ff’:
    cpuemu_0.c:50:2: error: expected ‘:’ or ‘)’ before string constant
    cpuemu_0.c: In function ‘op_0018_0_ff’:
    cpuemu_0.c:64:2: error: expected ‘:’ or ‘)’ before string constant
    cpuemu_0.c: In function ‘op_0020_0_ff’:
    cpuemu_0.c:79:2: error: expected ‘:’ or ‘)’ before string constant
    cpuemu_0.c: In function ‘op_0028_0_ff’:
    cpuemu_0.c:92:2: error: expected ‘:’ or ‘)’ before string constant
    cpuemu_0.c: In function ‘op_0030_0_ff’:
    cpuemu_0.c:107:2: error: expected ‘:’ or ‘)’ before string constant
    cpuemu_0.c: In function ‘op_0038_0_ff’:

    etc.

    That only happens when _ARCH_PWR4 is defined.

    If I add what appears to be a missing comma, to line 43 in m68kops.h, it works.

     

    Last edit: MickJT 2014-06-13
  • Tobias Netzel

    Tobias Netzel - 2014-06-13

    That obviously was an oversight - didn't have the patience to recompile everything.
    Here the missing comma.

     
    • Almos Rajnai

      Almos Rajnai - 2014-06-14

      Thanks, Tobias! Change is committed now.

       
  • MickJT

    MickJT - 2014-06-13

    Thanks Tobias :)

     
  • Tobias Netzel

    Tobias Netzel - 2014-06-20

    Here another fix which makes the code recognize whether there actually are two different registers which can be used to get both the XER and CR (this seems to be the more common case) or if there is actually a single one available and XER has to be inserted into CR first.
    I'm not sure whether it's a wise thing to have the JIT compiler decide between those two code paths (the first one saving one instruction) at emulation runtime. (And there is actually a third unused one: the full mcrxr emulation which needs two more instructions to clear the XER.)

    This also fixes the mfocrf and mt(o)crf instructions to actually be emitted as the single field versions. I've got some difficulties in thinking of the most significant bit as bit 0, so I had set the wrong bit (it didn't harm as it was set in a reserved position).

     
    • Almos Rajnai

      Almos Rajnai - 2014-06-20

      Thanks, Tobias! Patch is committed. I slightly changed the comment for SPECTMP register, the _MAPPED is referring to the mapped form of the register, but this is just semantics, I guess.

      There is nothing wrong with comparing the actual register numbers while the code is compiled: these will be the integer numbers which are used to generate the instruction always.

       
      • Tobias Netzel

        Tobias Netzel - 2014-06-21

        I was thinking of the G5 which doesn't perform well with branchy code. Branchy code will only perform well on it when the branch targets can be predicted reliably. If the branch targets cannot be correctly predicted the G5 will perform worse than a G4 CPU,
        Actually GCC should emit that code in manner the G5 can predict the branch target.

         
        • Almos Rajnai

          Almos Rajnai - 2014-06-21

          I have created a new ticket for the G5-related optimization discussion; [#50]

          Let's continue there, this ticket is not related to the performance.

           

          Related

          Tickets: #50

  • Almos Rajnai

    Almos Rajnai - 2014-07-20
    • Status: pending --> closed
     

Log in to post a comment.