#570 MSX turboR is going too slow

Milestone: Next_release
Status: open
Owner: nobody
Labels: MSXturboR (1)
Priority: 2
Updated: 2015-03-07
Created: 2015-03-04
Private: No

The program posted here http://www.msx.org/forum/msx-talk/development/basic-hat?page=1:

10 DIM RR(320)
100 COLOR 15,1,1:SCREEN 6
110 FOR I=0 TO 320:RR(I)=193:NEXT I
130 XP=144:XR=4.71238905#:XF=XR/XP
140 FOR ZI=64 TO -64 STEP -1
150 ZT=ZI*2.25:ZS=ZT*ZT
160 XL=INT(SQR(20736-ZS)+.5)
170 ZX=ZI+160:ZY=90+ZI
180 FOR XI=0 TO XL
190 XT=SQR(XI*XI+ZS)*XF
200 YY=(SIN(XT)+SIN(XT*3)*.4)*56
210 X1=XI+ZX:Y1=ZY-YY
220 IF RR(X1)>Y1 THEN RR(X1)=Y1:PSET (X1,Y1),15
230 X1=ZX-XI
240 IF RR(X1)>Y1 THEN RR(X1)=Y1:PSET (X1,Y1),15
250 NEXT XI:NEXT ZI
260 GOTO 260

is supposed to run in about 22 seconds on a turboR GT in R800 mode with MSX-BASIC-kun 2.1. But in openMSX it takes much longer: about 58 seconds.

Interesting, because we also have a ticket that R800 emulation is too slow!

On blueMSX the time is much closer to real hardware: about 23 seconds.

Discussion

  • Wouter Vermaelen

    There certainly are timing issues in the R800 emulation in openMSX. But a factor of 2x-3x is very large. I'd therefore suggest first investigating other possible causes for this difference.

    The 3 measurements discussed in the MRC forum were performed by 3 different people. Did all 3 test exactly the same configuration? E.g.

    • Z80 vs R800-ROM vs R800-DRAM
    • The Basic program typed in exactly the same way, including spaces (though maybe for basic-kun this matters less).
    • But my prime suspect is the basic-kun code. Was it loaded in the same way? It makes a huge difference whether it's located in an external ROM vs loading it in RAM (IIRC with something like bload"xbasic.bin",r)

    Could someone investigate this? Preferably, run the test both with an external basic-kun ROM and with basic-kun loaded in RAM, on both a real machine and in openMSX, and if possible also on blueMSX.

     
  • Manuel Bilderbeek

    OK, we were comparing apples and oranges...

    Did the same test myself using Basic'n 2.1 loaded in RAM and adding timing with TIME in the program. This gave the following result:
    openMSX: 23.15 s
    real hw: 22.92 s

    So, openMSX is a tad too slow, but not much off at all!
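
For readers unfamiliar with timing via TIME: the MSX-BASIC TIME variable counts video interrupts, so the seconds above come from dividing a tick count by the interrupt rate. A minimal Python sketch of that conversion, assuming a 60 Hz interrupt on the turboR (the function name and the sample tick count are illustrative, not taken from the test program):

```python
# Assumption: TIME increments once per video interrupt, 60 times per second.
INTERRUPT_HZ = 60


def ticks_to_seconds(ticks, hz=INTERRUPT_HZ):
    """Convert a TIME tick count to seconds."""
    return ticks / hz


if __name__ == "__main__":
    # Illustrative tick count only; the thread reports seconds, not ticks.
    print(f"{ticks_to_seconds(1389):.2f} s")
    # One tick is the smallest difference this method can resolve.
    print(f"resolution: {ticks_to_seconds(1):.4f} s")
```

Under that assumption a measurement cannot resolve anything finer than one tick (about 0.017 s), which is small compared with the 0.23 s gap between the two results.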

     
  • Manuel Bilderbeek

    • Priority: 5 --> 2
     
  • Manuel Bilderbeek

    Result in openMSX on R800 (enabled by booting without floppy) without using MSX-BASIC'n: 940.43 seconds.
    On my real GT: 929.62 seconds.

    So, openMSX is indeed still a bit too slow.
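
To put "a bit too slow" into numbers, here is a small Python sketch that computes the relative slowdown from the timings reported in this ticket. The helper name is hypothetical; only the four timings come from the comments above.

```python
def slowdown_percent(emulated_s, real_s):
    """Return how much slower the emulator is than real hardware, in percent."""
    return (emulated_s / real_s - 1.0) * 100.0


# (openMSX seconds, real turboR GT seconds), as reported in this ticket.
measurements = {
    "basic-kun 2.1 loaded in RAM": (23.15, 22.92),
    "plain MSX-BASIC on R800": (940.43, 929.62),
}

if __name__ == "__main__":
    for name, (emulated, real) in measurements.items():
        pct = slowdown_percent(emulated, real)
        print(f"{name}: openMSX is {pct:.2f}% slower")
```

Both runs come out at roughly 1% slower, far from the 2x-3x gap suggested by the original forum numbers.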

     
  • Laurens Holst

    Laurens Holst - 2015-03-07

    I’ve performed cycle-accurate R800 timing tests on my turboR GT, see the R800+wait column in the table here:

    http://map.grauw.nl/resources/z80instr.php

    The testing methodology I used is as follows: I have an OPL4 wave register write routine which requires at least 8 clocks between address and data writes. One is used for a register load, the remaining 7 clocks need to be spent waiting. By using these instructions for waiting followed by a variable number of 1-cycle nop delays, I was able to acquire cycle-accurate timing information.
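
The padding search described above can be sketched in Python. This is a model of the reasoning only, not the actual test code: `write_ok` is a hypothetical callback standing in for "assemble the routine with this many nops and check whether the OPL4 write still works", and `simulated_hardware` is a stand-in for the real machine.

```python
REQUIRED_WAIT = 7  # clocks to spend between address and data write (from the post)


def infer_cycles(write_ok, max_nops=REQUIRED_WAIT):
    """Infer an instruction's cycle count from the fewest nops that still work.

    write_ok(nops) is a hypothetical callback: True when the register write
    succeeds with `nops` 1-cycle nops placed after the instruction under test.
    Returns (cycles, exact). exact is False when only a lower bound is known.
    """
    for nops in range(max_nops + 1):
        if write_ok(nops):
            if nops == 0:
                # Works without padding: the instruction takes at least
                # REQUIRED_WAIT cycles, but this test cannot say how many.
                return REQUIRED_WAIT, False
            return REQUIRED_WAIT - nops, True
    raise ValueError("write never succeeded; check the test setup")


def simulated_hardware(true_cycles):
    """Stand-in for the real machine: the write works once enough clocks pass."""
    return lambda nops: true_cycles + nops >= REQUIRED_WAIT


if __name__ == "__main__":
    print(infer_cycles(simulated_hardware(4)))  # a 4-cycle instruction
    print(infer_cycles(simulated_hardware(9)))  # too long for this test
```

The second case shows why instructions of 7 cycles or more need a different device with a longer required wait: with zero nops already sufficient, this test only yields a lower bound.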

    For the instructions between 7 and 13 cycles, I performed similar tests using the internal OPLL. Eight instructions remain to be tested at the moment (LDIR, OTIR, etc.); I’ll have to adapt my OPLL test to glitch between data and address writes.

    As for the instructions which do not talk to RAM or I/O, I did not perform any tests and simply copied the numbers from the MSX datapack. Based on wouterv’s earlier tests I’m confident that no additional waits are inserted for them.

     
  • Laurens Holst

    Laurens Holst - 2015-03-07

    I noted a discrepancy in the timing of the CALL instruction: in my measurements the result is 8 cycles, whereas the r800test.txt document says it’s 7.

    I double-checked my CALL test result today and again I arrived at 8 cycles. However, following wouterv’s suggestion I also tried the combination call + pop af, and the result was 12 cycles rather than 13! I also inserted some nops between the call and the pop just to make sure the addresses weren’t contiguous.
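
The arithmetic behind these numbers can be laid out explicitly. The Python sketch below uses only the four figures quoted in this thread (7, 8, 12, 13) and rests on one assumption that the post does not state: that the expected 13 is the 8-cycle CALL plus a separately measured pop af. It shows one possible reading, not a conclusion.

```python
CALL_MEASURED = 8    # CALL alone, as measured in this comment
CALL_DOCUMENTED = 7  # CALL according to r800test.txt
PAIR_EXPECTED = 13   # call + pop af, if the individual timings simply added up
PAIR_MEASURED = 12   # call + pop af, as actually measured


def implied_pop_cycles():
    """Assumption: the expected pair time is CALL_MEASURED plus pop af alone."""
    return PAIR_EXPECTED - CALL_MEASURED


def call_cycles_within_pair():
    """Cycles left for CALL in the pair if pop af keeps its standalone cost."""
    return PAIR_MEASURED - implied_pop_cycles()


if __name__ == "__main__":
    print("implied pop af cycles:", implied_pop_cycles())
    print("CALL cycles within the pair:", call_cycles_within_pair())
    print("matches r800test.txt:", call_cycles_within_pair() == CALL_DOCUMENTED)
```

Under that assumption the pair measurement agrees with the documented 7 cycles, which would mean the extra cycle shows up only when CALL is measured on its own.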

    A few quick examples illustrating the testing method: http://pastebin.com/t2Mx0CjH

    It’s quite easy to test by taking the VGMPlay code, applying these changes, and playing an OPLL song. Before you do, of course, first check that the generated code does not cross a 256-byte memory page boundary.
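
The page-boundary check is easy to script against the addresses in an assembler listing. A minimal Python sketch, assuming you know the start address and byte length of the timed code (the function name and the example addresses are hypothetical):

```python
PAGE_SIZE = 256  # bytes per memory page, as mentioned in the comment


def crosses_page(start, length):
    """Return True if `length` bytes starting at `start` span two 256-byte pages."""
    if length <= 0:
        raise ValueError("length must be positive")
    last = start + length - 1  # address of the final byte
    return start // PAGE_SIZE != last // PAGE_SIZE


if __name__ == "__main__":
    # Hypothetical addresses, for illustration only.
    print(crosses_page(0x40FE, 3))  # bytes at 40FE, 40FF, 4100
    print(crosses_page(0x4100, 16))  # stays inside page 41xx
```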

     