#1979 x128 gets stuck when opening borders with Z80

v3.x
open
None
x128
2024-04-25
2024-01-13
No

I experimented with opening the borders using the Z80. The program z80openborders.prg works fine on a real PAL C128 and in Z64K. VICE x128, however, gets stuck.

After a moment the program should return to the READY prompt. With VICE x128 this does not happen. The problem appears to be related to interrupts or to the HALT instruction: the counter in register BC is never decremented.

Code listing below.

main    .org   $1c01
    .byte $0c,$08,$0a,$00,$9e,$37,$31,$38,$31,$00,$00,$00
    lda $ff00 ;
    pha ; store RAM config to stack
    sei ; disable interrupts
    lda $ffee
    pha
    lda $ffef
    pha
    lda $fff0
    pha
    lda #$3e ;
    sta $ff00 ; select RAM config for Z80

    lda #$c3 ;
    sta $ffee ; store JP instruction for Z80 mode start
    lda #<z80code ;
    sta $ffef ; store lo-byte address
    lda #>z80code;
    sta $fff0 ; store hi-byte address
    lda $d505 ;
    pha ; store mode to stack
    lda #$b0 ;
    sta $d505 ; set Z80 mode - this instruction deactivates 8502 and jumps by Z80 PC to $ffee
    nop
    nop
    pla
    sta $d505
    pla
    sta $fff0
    pla
    sta $ffef
    pla
    sta $ffee
    pla
    sta $ff00
    cli
    rts
z80code     

.byte $F3               ; di
.byte $01,$00,$02       ; ld bc,0200h
.byte $11,$00,$38       ; ld de,3800h
.byte $21               ; ld hl,endcode
.byte <endcode
.byte >endcode
.byte $ED,$B0           ; ldir
.byte $C3,$00,$38       ; jp 3800h

endcode
    .org    3800h

    ld  sp,37F0h

    ld  bc,0D012h
    ld  a,0f9h
    out (c),a
    ld  bc,0dc0eh
    in  a,(c)
    ld  (3101h),a
    ld  a,0f9h
    out (c),a
    ld  a,01h
    ld  bc,0d019h
    out (c),a
    inc c
    out (c),a
    ld  a,01bh
    ld  bc,0d011h
    out (c),a



    ld  bc,3000h
    ld  a,31h
_loop1:
    ld  (bc),a
    inc c
    jr nz,  _loop1
    inc b
    ld  (bc),a
    ld  a,0c3h
    ld  (3131h),a
    ld  bc,irq
    ld  (3132h),bc

    ld  a,30h
    ld  i,a
    im  2
    ei
    ld  bc,500

_loop3:
    halt            ; problem spot: on x128 the counter in BC is never decremented
    dec bc
    ld  a,b
    cp  0ffh
    jr nz,  _loop3
    ld  bc,0dc0eh
    ld  a,(3101h)
    out (c),a
    jp  0ffe0h
irq:
    push    af
    push    bc
    ld  bc,0d019h
    ld  a,01h
    out (c),a
    ld  bc,0d020h
    in  a,(c)
    inc a
    out (c),a
    ld  bc,0d011h
    in  a,(c)
    and 0f7h
    out (c),a
    inc c
_loop2:
    in  a,(c)
    cp  00h
    jr nz,  _loop2
    dec c
    in  a,(c)
    and 7fh
    or  08h
    out (c),a
    ld  bc,0d020h
    in  a,(c)
    dec a
    out (c),a
    pop bc
    pop af
    ei
    ret


    .end
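
The listing above fills the 257 bytes at 3000h..3100h with 31h and sets I = 30h before selecting IM 2. A minimal Python sketch of why that works, assuming the standard IM 2 lookup (pointer = I * 256 + byte on the data bus, handler address read little-endian from that pointer); the function names and the flat memory array are illustrative only and are not taken from VICE:

```python
# Toy model of the Z80 interrupt mode 2 vector lookup (not VICE code).

def fill_vector_table(memory, i_reg, fill_byte):
    """Write fill_byte to the 257 bytes starting at page i_reg."""
    base = i_reg << 8
    for offset in range(257):
        memory[base + offset] = fill_byte


def im2_handler_address(memory, i_reg, bus_byte):
    """Return the 16-bit address fetched from the pointer (I << 8) | bus_byte."""
    pointer = (i_reg << 8) | bus_byte
    low = memory[pointer]
    high = memory[(pointer + 1) & 0xFFFF]
    return (high << 8) | low


memory = bytearray(0x10000)
fill_vector_table(memory, 0x30, 0x31)  # table at 3000h..3100h, as in the listing
targets = {im2_handler_address(memory, 0x30, bus) for bus in range(256)}
print([f"{t:04X}" for t in sorted(targets)])  # ['3131']
```

Whatever byte happens to be on the bus, the lookup lands on 3131h, which is where the listing stores its JP irq instruction.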
2 Attachments

Discussion

1 2 3 > >> (Page 1 of 3)
  • Jussi Ala-Könni

    It appears that Z80 HALT is not properly handled.
    Attached is a test program, which was tested on two PAL C128s.
    x128 does not respond.

     
  • William McCabe

    William McCabe - 2024-01-28

    Are you able to provide the source code for z80rastertimingtest.prg? I note you've also included a timing test for writing to the Z80 IO port!

     
  • Jussi Ala-Könni

    Sources attached. It uses both IM1 and IM2 and two interrupt handlers with slightly different timing.

     
  • Jussi Ala-Könni

    Edit: a few inconsequential changes.
    2024-02-02: minor fixes; added a tentative NTSC version of the test (not confirmed on real hardware).
    2024-02-20: a minor fix.

     

    Last edit: Jussi Ala-Könni 2024-02-20
  • Jussi Ala-Könni

    Hmm. Apparently no one is interested in fixing this. That is disappointing, to say the least. Z80 HALT and raster interrupts work very well together: raster jitter is simply absent, and the timing was completely stable in all my experiments. Achieving a stable raster interrupt is a pretty simple task on the C128 with the Z80, without the usual mess of stable raster interrupt routines.

     
  • Jussi Ala-Könni

    Just a casual example of opening the side border demonstrates how easy things actually are using Z80:

    main    .org   $1c01
        .byte $0c,$08,$0a,$00,$9e,$37,$31,$38,$31,$00,$00,$00
        lda #$be ; select RAM bank 2 with I/O for Z80
        sta $ff00 ; to put Z80 BIOS out of the way
        sei ; disable interrupts
        lda #$00 ;disable 2 Mhz
        sta $d030
        lda #$c3
        sta $ffee ; store JP instruction for Z80 mode start
        lda #<z80code ; store lo-byte address
        sta $ffef
        lda #>z80code ; and hi-byte address of Z80 code
        sta $fff0 ;so we can wake up the despised Z80!
    
    _main   ;8502 main, no interrupt
        lda #$b0    ;give control to Z80
        sta $d505   ;in order to wait for the next frame
        nop
        ldx #$07
        nop
        nop
        nop
        nop
        nop
        nop
    _loop:
        ldy #$00 ;open side border
        sty $d016
        ldy #$08
        sty $d016
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        nop
        dex
        bne _loop
        jmp _main
    z80code
    
    ---
    
        .org    1c5ch
    
        di
        ld  sp,2000h    ;set up stack
        im  1       ;interrupt mode 1
        ld  a,0c3h
        ld  (0038h),a
        ld  bc,_irq     ;set up JP interrupt addresses
        ld  (0039h),bc
        ld  bc,_return
        ld  (0ffefh),bc
        ld  bc,0d011h   ;set up raster interrupt line
        ld  a,1bh
        out (c),a
        inc c
        ld  a,91
        out (c),a
        call    _irq        ;call irq handler to acknowledge interrupt
    _return:
        nop
        halt
        jp  0ffe6h ;give control back to 8502
    
    _irq:   ;interrupt handler
        ld  bc,0d019h   ;acknowledge raster interrupt
        ld  a,01h
        out (c),a
        ei
        ret         ;this is all there is to do, then return
    
        .end
    

    Needless to say, VICE does not run it. Z64K is not perfect in Z80 related timings either, but it runs this example.

    Edit. This example is for PAL C128.

     

    Last edit: Jussi Ala-Könni 2024-02-11
  • Roberto Muscedere

    • assigned_to: Roberto Muscedere
     
  • Roberto Muscedere

    I'll be looking into this. It will take me a bit to catch up on the Z80 CPU code. I did some work on it before, and from what I can see right now, it doesn't handle HALT correctly: it seems to just wait 4 cycles and move on, when it should wait for an interrupt. I will also have to be careful with this, as the same core code is used by other Z80 add-ons, like the Commodore CP/M cartridge.
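
The difference between the two HALT behaviours mentioned above can be sketched in a few lines of Python. This is a simplified model and not the VICE implementation: it assumes HALT repeats internal NOPs of 4 T-states each and that the interrupt is accepted at the first NOP boundary at or after the request; both function names are made up for illustration.

```python
NOP_T_STATES = 4  # assumption: HALT repeats internal NOPs of 4 T-states each


def halt_duration(t_halt, t_irq):
    """T-states from the start of HALT until the interrupt can be accepted."""
    waited = max(t_irq - t_halt, 1)    # the HALT fetch itself always costs one NOP
    nops = -(-waited // NOP_T_STATES)  # ceiling division
    return nops * NOP_T_STATES


def fall_through_halt_duration(t_halt, t_irq):
    """The faulty behaviour: HALT costs 4 T-states and execution simply continues."""
    return NOP_T_STATES


print(halt_duration(0, 10))                # 12
print(halt_duration(0, 126))               # 128
print(fall_through_halt_duration(0, 126))  # 4
```

In this model the time spent in HALT is always a multiple of 4 T-states, that is, a multiple of 2 cycles of the 1 MHz clock.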

     
  • Jussi Ala-Könni

    I have a new test program based on an earlier C64 test gfxfetch made by Hannu Nuotio and Antti Lankila. I found it interesting to test these ideas with C128 and Z80 cpu. It runs properly only on real hardware (PAL C128).

    Source will be provided if there is interest; I will clean it up a bit first.

    And link to the original test:
    https://sourceforge.net/p/vice-emu/code/HEAD/tree/testprogs/VICII/gfxfetch/

     

    Last edit: Jussi Ala-Könni 2024-02-17
  • Jussi Ala-Könni

    I did some tests using a CIA timer to measure the number of cycles between successive interrupts, going from one scan line to the next.
    Results from a real PAL C128 seem solid: the timing is stable, but in half of the measurements there is a 2-cycle difference that depends on the scan line and on the number of Z80 cycles run.

    The Z80 and the raster interrupt seem to be in sync, which explains the lack of jitter.

    x128 does not run the test, and Z64K has a phase error.

     

    Last edit: Jussi Ala-Könni 2024-03-06
  • Roberto Muscedere

    Okay. I committed an initial fix for this in r45031. The HALT is handled, but the timing results seem to look the same as in Z64K. I'm not exactly sure what you are doing here to derive these numbers, but I don't think the VIC and CIA are the best way to do this because of the bad lines. If you are testing instruction timing, I suggest you look at:
    testprogs/general/Lorenz-2.15/src/cputiming.s
    It uses a very creative way to test instruction timing. As far as I know, we don't have a way of verifying the timing of the Z80 emulation. We have a functional tester (zex), but I don't think it does timing, as it is generic to CP/M.

     
    • gpz

      gpz - 2024-03-22

      The "Risen from Oblivion" demo measures the Z80 timing vs the VIC-II (iirc) at startup; perhaps that can serve as a starting point for a test program.

       
    • Jussi Ala-Könni

      halt_timingtest simply measures the interval between the interrupts of successive scan lines. On a PAL machine the value is indeed expected to be close to 65536 - 63, but it varies because HALT takes n * 2 1 MHz clock cycles to execute. Constant timing (value FFC1) is expected only when the total time spent on a scan line equals 63 cycles. When it does not, one NOP (= 2 1 MHz cycles) more or less is spent in the HALT state on that scan line, and which of the two happens seems to depend on the scan line. Bad lines are not measured here (there is also not enough CPU time to measure a bad line). The result seems to confirm that regularization happens, possibly during bad lines. Z64K seems to have a phase opposite to that of real hardware here; in general, Z64K's basic Z80 emulation runs accurately (on a PAL machine one scan line is 126 Z80 cycles).
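
The numbers in this post can be reproduced with simple arithmetic. The sketch below follows the post's own convention (reading = 65536 minus elapsed 1 MHz cycles, truncated to 16 bits); the function name is illustrative, and the readings for 61 and 65 cycles are only what this arithmetic gives for one NOP less or more, not measured values.

```python
PAL_CYCLES_PER_LINE = 63  # 1 MHz cycles per PAL scan line, as stated in the post
Z80_CYCLES_PER_1MHZ = 2   # the Z80 is clocked at 2 MHz
HALT_NOP_CYCLES = 2       # one HALT NOP = 2 cycles of the 1 MHz clock


def expected_reading(elapsed_cycles):
    """16-bit timer value after elapsed_cycles, using the 65536 - n convention."""
    return (0x10000 - elapsed_cycles) & 0xFFFF


print(PAL_CYCLES_PER_LINE * Z80_CYCLES_PER_1MHZ)                              # 126
print(f"{expected_reading(PAL_CYCLES_PER_LINE):04X}")                         # FFC1
print(f"{expected_reading(PAL_CYCLES_PER_LINE - HALT_NOP_CYCLES):04X}")       # FFC3
print(f"{expected_reading(PAL_CYCLES_PER_LINE + HALT_NOP_CYCLES):04X}")       # FFBF
```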

      I have other tests that draw test bars on the screen. They previously did not run at all, but now x128 actually gives quite a good result. I was expecting a total mess on the screen; instead, x128 seems to be consistently delayed a bit too much during Z80 code execution: 2 1 MHz cycles per scan line. The 8502 bar is straight, as expected.

       
      • Roberto Muscedere

        To be honest, I'm not sure if I got the whole interrupt thing right. I'm reading data sheets and other materials, and they are a little vague. There is no tester for this stuff. So given that I'm at best a novice with Z80 coding and you clearly have more skills, you should develop a good tester for this. I think you are on track, but you should avoid the VIC and blank the screen. Just use the CIAs and fire interrupts at particular times, then measure when the IRQ code runs by inspecting the CIA timers; also check the stack contents to see what return address is placed there. Also keep clock stretching in mind when accessing any I/O locations.
        The zex instruction exerciser only tests basic ALU and memory operations. It doesn't cover everything, so it would be useful to know how soon an interrupt is processed after an EI, or whether an interrupt is blocked immediately after a DI.
        Running repetitive instructions, measuring the overall time difference, and then dividing by the number of instructions can also give us an idea of whether we have the delays correct in emulation. I saw a test program that shows LDIR is slower on VICE than on a real machine (https://csdb.dk/release/?id=170651). I'm not sure why, as I've checked the numbers, but there are a lot of Z80 variants out there, so I'm not sure if they are the same as the one in the C128.
        I'm still not 100% sure how Z80 timing is handled in VICE. This area of the system is new to me, so I have to look into it further.

         
        • Jussi Ala-Könni

          I have also been puzzled how Z80 raster interrupts work. I suspect that some kind of regularization happens, but I don't fully understand how it works. I think it may happen during bad lines.

          As for clock stretching: adding or removing one Z80 cycle may or may not make a difference in 1 MHz I/O output, but there are no delays. See the earlier rastertimingtest.asm source: there are two interrupt routines that are 1 Z80 cycle apart but produce the same output on real hardware. No delays: OUTs take 12 Z80 cycles = 6 1 MHz cycles. Good that you mentioned it, because I think I recognized the cause of the timing error. rastertimingtest.prg has 3 OUTs in the loop, and I identified a 4-cycle (1 MHz) delay in the output. Similarly, testcpuswitch has 2 OUTs in the loop and a 2-cycle (1 MHz) delay in the output! So, have you tried what happens if the delay on OUTs is removed?

          (Edit. 3 OUTs in the loop part, no delays associated with them, so just a clean calculation of 126 cycles (PAL) for non-bad raster lines.)

          When EI is executed, an interrupt can be processed only after the instruction following EI. So, for example, if

          EI
          RET
          

          is coded, the interrupt can be processed after RET, not before.
          (Edit: after RET, in place of after EI)

          Interrupts are checked after the execution of every instruction, so an interrupt is not processed immediately after DI.

          http://www.z80.info/interrup.htm
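
The EI and DI rules described above can be sketched as a toy model in Python. It is a simplification under stated assumptions: interrupts are sampled only at instruction boundaries, the boundary right after EI is skipped, and the program is a flat list with no jumps. The function name and the (mnemonic, T-states) tuples are illustrative and not taken from any emulator.

```python
def accept_point(program, irq_at, iff1=False):
    """Return (index, t_states) of the instruction after which a maskable
    interrupt requested at T-state irq_at is accepted, or None if it never is."""
    t = 0
    for index, (mnemonic, t_states) in enumerate(program):
        t += t_states
        if mnemonic == "ei":
            iff1 = True
            continue  # no acceptance at the boundary right after EI
        if mnemonic == "di":
            iff1 = False
        if iff1 and t >= irq_at:
            return index, t
    return None


# EI followed by RET with an interrupt already pending: accepted after RET.
print(accept_point([("ei", 4), ("ret", 10)], irq_at=0))             # (1, 14)

# DI with an interrupt arriving while it executes: never accepted here.
print(accept_point([("di", 4), ("nop", 4)], irq_at=2, iff1=True))   # None
```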

           

          Last edit: Jussi Ala-Könni 2024-03-25
        • Jussi Ala-Könni

          Z80 CycleTimer gives 10.50634770 cycles per byte on a real C128, which is very close to the theoretical value of 10.5. This again confirms that cycle counts can be taken "from the book". I am not aware of any differences in the cycles taken by Z80 CPUs from different batches.
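
A small Python sketch of where the 10.5 figure comes from, assuming the measurement refers to LDIR (the test linked earlier in the thread) and using the documented LDIR cost of 21 T-states per repeated byte and 16 for the final byte, with two Z80 T-states per 1 MHz cycle. The function names are illustrative.

```python
LDIR_REPEAT_T_STATES = 21  # per byte while BC has not reached zero
LDIR_FINAL_T_STATES = 16   # last byte, when BC reaches zero
T_STATES_PER_1MHZ_CYCLE = 2


def ldir_t_states(byte_count):
    """Total Z80 T-states for an LDIR that copies byte_count bytes (>= 1)."""
    if byte_count < 1:
        raise ValueError("byte_count must be at least 1")
    return (byte_count - 1) * LDIR_REPEAT_T_STATES + LDIR_FINAL_T_STATES


def cycles_per_byte(byte_count):
    """Average 1 MHz cycles per copied byte."""
    return ldir_t_states(byte_count) / (T_STATES_PER_1MHZ_CYCLE * byte_count)


theoretical = LDIR_REPEAT_T_STATES / T_STATES_PER_1MHZ_CYCLE
measured = 10.50634770  # value quoted in the post for a real C128
print(theoretical)                                              # 10.5
print(round((measured - theoretical) / theoretical * 100, 3))   # 0.06 (percent)
```

For long copies `cycles_per_byte` approaches 10.5 from below, because only the final byte is cheaper.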

           
        • Jussi Ala-Könni

          Z80 instruction breakdown:
          http://www.z80.info/z80ins.txt
          DI and EI behavior described (p. 21):
          http://www.z80.info/zip/z80-documented.pdf

           
  • Jussi Ala-Könni

    Actually, a series of EI instructions cannot be interrupted: when a series of EI instructions is programmed, interrupts are enabled only after the instruction that follows the last EI. This has been described somewhere (I can try to find the source), and I wrote a test program for it.

    The interrupt is triggered in the middle of a series of EI instructions. There are 10 EI instructions still to be executed, which take 20 1 MHz cycles, plus the RES instruction, which takes 12 more, before the interrupt is processed.

    Z64K processes the interrupt in the middle of the EI series, which is incorrect, giving the value FFC1h = -63. x128 apparently processes one or two more EIs, giving FFBDh = -67.

    Real hardware with a real Z80 gives FFA1h = -95. The difference FFC1h - FFA1h = 20h = 32 agrees with the calculation above.

    So, to repeat: a series of EI instructions cannot be interrupted; interrupts are processed only after the instruction that follows the series. Years ago I wrote my own Z80 emulation engine and implemented it this way, since that is how I saw it described. Nice to see it confirmed.

    So, both VICE and Z64K are incorrect here.
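
The arithmetic behind these three readings can be checked with a short Python sketch. The cycle costs (2 per EI, 12 for the RES) and the timer values are the ones quoted in this post; the helper name and the choice of FFC1h as the "interrupt taken immediately" baseline are assumptions made for illustration.

```python
def to_signed16(value):
    """Interpret a 16-bit timer reading as a signed number."""
    return value - 0x10000 if value & 0x8000 else value


EI_CYCLES = 2     # 1 MHz cycles per EI, as quoted in the post
RES_CYCLES = 12   # 1 MHz cycles for res 0,(ix+0), as quoted in the post
expected_delay = 10 * EI_CYCLES + RES_CYCLES

readings = {"real C128": 0xFFA1, "Z64K": 0xFFC1, "x128": 0xFFBD}
baseline = 0xFFC1  # reading when the interrupt is taken right where it triggers

print(expected_delay)  # 32
for name, raw in readings.items():
    # prints: real C128 -95 32 / Z64K -63 0 / x128 -67 4
    print(name, to_signed16(raw), baseline - raw)
```

Only the real hardware value shows the full 32-cycle delay; the 4-cycle delay for x128 corresponds to two EI instructions.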

    main    .org   $1c01
        .byte $0c,$08,$0a,$00,$9e,$37,$31,$38,$31,$00,$00,$00
        lda #$be ; select RAM bank 2 with I/O for Z80
        sta $ff00 ; to put Z80 BIOS out of the way
        sei ; disable interrupts
        lda #$00 ;disable 2 Mhz
        sta $d030
        lda #$c3
        sta $ffee ; store JP instruction for Z80 mode start
        lda #<z80code ; store lo-byte address
        sta $ffef
        lda #>z80code ; and hi-byte address
        sta $fff0 ;of Z80 code
        lda #$b0 ; load Z80 configuration
        sta $d505 ; to MMU MCR register
        nop
        lda #$cf ;store back RST instruction
        sta $ffee
        lda #$00 ;set configuration to ROM
        sta $ff00
        cli
        jsr $ff7d
        .byte $0d,'EI INSTRUCTION TIMING TEST'
        .byte $0d,'TIMER VALUE AFTER THE TEST',$0d,'(REAL PAL C128: FFA1): ',$00
        lda $2f00
        ldx $2f01
        jmp $b89f
    
    z80code     
    
    ---
    
        .org    1c93h
    
        di
        ld  sp,3000h ;set up stack area
        ld  hl,(0038h)
        push    hl
        ld  hl,(003ah)
        push    hl
        ld  a,0c3h ;set up IM1 JP instruction
        ld  (0038h),a
        ld  hl,_irq
        ld  (0039h),hl
        ld  bc,0dc0dh
        ld  a,7fh   ;prepare timer
        out (c),a
        inc c
        xor a
        out (c),a
        dec a
        ld  c,04h
        out (c),a
        inc c
        out (c),a
        ld  bc,0d011h ;set up raster interrupt line
        ld  a,1bh
        out (c),a
        inc c
        ld  a,60    ;raster line
        ld  d,a
        inc d
        out (c),a
        im  1
        ld  c,19h ;acknowledge interrupt
        ld  e,01h
        out (c),e
        exx
        ld  bc,0dc0eh ;store values to shadow registers
        ld  d,19h ;timer start-stop values and address
        exx
        ld  ix,0000h
    
    ;   measurement
    
        ei
        nop
        halt    ;let one interrupt pass before measurement
        ei
        nop
        out (c),e ;acknowledge
        exx
        halt    ;measurement begins after halt
            ;begin measurement      
        nop
        exx
        out (c),e
        ld  c,12h ;change raster line
        out (c),d
        ld  c,19h
        exx
        ld  d,08h;102
        ei
        ei
        ei
        ei
        ei
        ei
            ;interrupt is triggered here (= 63 1MHz cycles; 126 z80 cycles after HALT)
        ei
        ei
        ei
        ei
        ei
        ei
        ei
        ei
        ei
        ei  ;10 x 2 1Mhz cycles over
        res 0,(ix+0); 12 more are added
    
            ;interrupt actually occurs here
    
        nop
        ld  c,04h ;read timer
        in  l,(c)
        inc c
        in  h,(c)
        ld  (2f00h),hl ;store timer value to memory
    ;
        ld  c,12h   ;change raster line back to the default one
        ld  a,0ffh
        out (c),a
    ;
        pop hl
        ld  (003ah),hl
        pop hl
        ld  (0038h),hl
        jp  0ffe0h ;return
    
    _irq:   ;interrupt handler
        out (c),d
        ret
    
        .end
    
     

    Last edit: Jussi Ala-Könni 2024-03-25
  • Roberto Muscedere

    I've been doing a deep dive into the current state of the code and how the Z80 clock relates to the 1MHz clock. I've been able to get closer on the LDIR test, as the cycle counts weren't correct for that instruction in VICE, but I can't seem to get a firm grasp on the clock stretching. As such, I would prefer if you could avoid using any OUT/IN instructions at this point. Try to use just registers, i.e. a series of increments, to see what the register value should be in the interrupt service routine. I know it makes things more challenging, but right now I don't know the effects of clock stretching on the 1MHz cycle count. From what I can see with the LDIR test, I am off by 1 cycle without clock stretching, which is impossible. So either the cycle counts are wrong or something weird is happening when the Z80 is running code. Maybe they do the Z80 refresh during an 8502 refresh cycle.
    Another way to do the cycle counting may be to use the 8502 to set up and read the timer at 1MHz, so we can avoid the Z80 clock stretching entirely.

     
    • Jussi Ala-Könni

      I suggest you check cycle count accuracy first. The test disables the VIC screen, so there shouldn't be anything weird. If this is any clue, Z80 instructions can be freely combined: for example, 7 + 7 Z80 cycles execute at the same speed as 10 + 4 (judging by VIC display stability). There is no "jerkiness" or "rounding" to the coarser 1 MHz cycles, if you understand what I mean.

       
      • Roberto Muscedere

        So far, everything looks right, but as I said, I'm 1 cycle off without stretching. There are a lot of moving parts in VICE so it helps to have testers that focus on one thing. By removing clock stretching from the measurements, I can determine if the cycle count is proper and then move on from there.
        The most reliable way is to use the CIA timer on the 8502 and avoid the VIC for anything as x128 is not cycle exact, so anything with the raster may be an issue elsewhere. I'm trying just to focus on the z80 right now.

         
  • Roberto Muscedere

    Okay. I committed another patch that fixed the LDIR test and gets better results in all of your tests. See r45044.
    It seems the Z80 doesn't do clock stretching as all of the memory and IO operations work at 1MHz.

     
    • Jussi Ala-Könni

      At the moment r45044 is still delayed and not available for download.

       
      • Roberto Muscedere

        Ugh. Trailing white space. Get r45045.
        I'm thinking the timing issues now depend on when the timer is read during the execution of IN. VICE emulates CPUs one instruction at a time; the Z80 on the C128 is clocked at 2MHz, but the memory and I/O run at 1MHz. So the problem may be the point at which the instruction "runs" relative to the read from the CIA.
        To test this, if you can add an "odd" number of cycle delays before the IN on measurements that are currently correct, it should make the measurements wrong versus real hardware.
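
Why an odd delay is a useful probe can be illustrated with a deliberately naive Python model. Everything here is an assumption made for the sake of the sketch: the timer is taken to decrement once per two Z80 T-states, and IN is taken to see the value of the 1 MHz cycle in which its read falls. Neither the hardware nor VICE is claimed to behave exactly this way, and the function names are made up.

```python
def one_mhz_cycle(t_state):
    """Index of the 1 MHz cycle that contains a given 2 MHz T-state."""
    return t_state // 2


def timer_seen_by_in(t_state, start=0xFFFF):
    """Timer value an IN would read if it samples at the given T-state."""
    return (start - one_mhz_cycle(t_state)) & 0xFFFF


# Adding a single T-state changes the reading only when the read started
# on the second half of a 1 MHz cycle.
for read_at in (100, 101):
    before = timer_seen_by_in(read_at)
    after = timer_seen_by_in(read_at + 1)
    # prints: 100 0  then  101 1
    print(read_at, before - after)
```

In this model an odd delay changes the reading for one phase and leaves it unchanged for the other, so a one-cycle disagreement with real hardware would point to a phase difference rather than a wrong cycle count.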

         