While comparing the runtime effect of some codegen changes, I found a ucsim behavior that I am unable to explain.
I attached two versions of rotate2_size_16_andCase_1_xorLiteral_1_rotateLeft_0_structVar_1, including the asm and the ihx.
The ref version is shorter and, judging by the asm output, should also be faster.
However, according to ucsim the ref implementation takes 167 cycles more.
The difference comes from branch instructions whose destination is on another page. The ref version executes more of these branches, and each of them needs 1 extra cycle to execute.
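
To make the page-crossing effect concrete, here is a rough sketch of how such a penalty could be tallied from a list of executed branches. It is not taken from ucsim; the 256-byte page size, the rule "compare the page of the address following the branch with the page of the destination", and the sample addresses are all assumptions for illustration only.

```python
# Hypothetical model of a page-crossing branch penalty (not ucsim code).
PAGE_SIZE = 256  # assumed page size in bytes


def crosses_page(next_pc: int, target: int, page_size: int = PAGE_SIZE) -> bool:
    """True if the branch destination lies on a different page than
    the instruction that follows the branch."""
    return (next_pc // page_size) != (target // page_size)


def extra_cycles(taken_branches, penalty: int = 1) -> int:
    """Sum the assumed penalty over all taken branches that cross a page.

    taken_branches: iterable of (next_pc, target) address pairs.
    """
    return sum(penalty for next_pc, target in taken_branches
               if crosses_page(next_pc, target))


# Made-up addresses: the first and third branch cross a page boundary.
sample = [(0x02F0, 0x0310), (0x0210, 0x0220), (0x03FE, 0x0402)]
print(extra_cycles(sample))  # 2
```

Under this model, identical code can cost a different number of cycles depending only on where the linker places it, which would be consistent with a shorter version ending up slower.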