Small Device C Compiler (SDCC) / Bugs / #3721 Z80 Critical section implementation is faulty

Tony Pavlov - 2024-03-30

probably, something like this is possible, but arghhh....

ld a, i push af ld a, i di push af pop de ; put F to E ld hl, #0 add hl, sp ; hl points stored F on stack ld a, e or (hl) ; construct conjuncted flags ld (hl), a ; update the value on stack ... pop af jp po, 1$ ei 1$:

this is so sloooow. and, in theory, not a 100% guarantee.

Last edit: Tony Pavlov 2024-03-30
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
- Philipp Klaus Krause - 2024-04-02
  
  Even that might fail, if there are two interrupts quickly after each other. I guess we'd need something like this (i.e. check if the ld a, i was interrupted, if yes, repeat):
  
  l0: ld a, #>(l1+256) push af pop af ld a, i l1: jp pe, l2 dec sp dec sp pop af cp a, #>(l1+256) jr nz, l0 l2: push af … pop af jp op, l4 ei l4:
  
  There is an earlier report about this, but from that one, it seemed that only NMOS Z80 were affected: [feature-requests:#476].
  
  Related
  
  Feature Requests: ~~#476~~
  
  Last edit: Philipp Klaus Krause 2024-04-02
  
  If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
  - sverx - 2024-04-02
    
    There is an earlier report about this, but from that one, it seemed that only NMOS Z80 were affected: [feature-requests:#476].
    
    Yes, back then I had wrong info about this.
    
    Anyway I remember reading on some MSX forum of a possible solution that involved zeroing and then checking the top of the stack to see if an interrupt happened while saving the IF flag status. Not sure I can find that topic again...
    
    Related
    
    Feature Requests: ~~#476~~
    
    If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
    - Philipp Klaus Krause - 2024-04-02
      
      That is basically what sdcc currently does for --nmos-z80. However, even that can fail, if the ld a, i happens to be code at an address < 256. That's why I suggested to use the label l1, so instead of zeroing the top of the stack, we set it to a value known to be different from the upper byte of l1.
      
      👍
      1
      
      Last edit: Philipp Klaus Krause 2024-04-02
      
      If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
    - Philipp Klaus Krause - 2024-04-02
      
      Do we have a source (which could be experiments one of you has done) that CMOS Z80 are definitely affected?
      
      If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
  - Tony Pavlov - 2024-04-02
    
    about the two interrupts happening quickly one after the other. that is unlikely happen, more often situation is when the interrupt is pending. but that should not be a problem, because EI/RETI makes pending interrupt fire immediately after return and no instructions are executed in main. so, there is probability, but very very slight. NMI should not be the problem either, as i get from the http://www.myquest.nl/z80undocumented/z80-documented-v0.91.pdf
    
    your code is missing DI, which position is important. DI must always be present, that's the whole point of the critical section, EI must be conditional, depending on the state of the IFF2.
    
    If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
  - Tony Pavlov - 2024-04-02
    
    also, is not P/V affected by cp a, #>(l1+256), while we need zero for the push af for the correct critical section exit processing?
    
    If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
    - Philipp Klaus Krause - 2024-04-02
      
      If the result of the cp is 0, and we thus proceed to the push af, there was no overflow in the subtraction, and thus P/V is set to 0, as we need.
      
      If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
      - Tony Pavlov - 2024-04-02
        
        ah, correct. we retry if IFF2==0 arguing with the actual value below stack, not the opposite. but probably not loop there but just transfer NZ condition to P/V flag, is there faster code?
        
        If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
        
        Philipp Klaus Krause - 2024-04-02
        
        Sorry, I don't understand what you meant here. Are you suggesting to not retry? The NZ condition only tells us that an interrupt happened, not what the state of IFF2 was: the interrupt that resulted in getting to the NZ condition might have been an NMI.
        
        If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
        
        Tony Pavlov - 2024-04-02
        
        As i get the problem, when doing ld a, i, PE condition, if met, is always correct, but PO condition may not reflect the real situation.
        
        By checking the byte below the SP, you are checking if PO condition above was wrong or not. So, you don't have to loop, you need to set PE condition in flags, if bytes not match, and set PO condition in flags, when they match.
        
        after that at the point of l2: PE flag indicate "interrupts enabled", PO indicate "interrupts disabled". You always DI just after l2:.
        
        Last edit: Tony Pavlov 2024-04-02
        
        If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
        
        Philipp Klaus Krause - 2024-04-02
        
        By stack probing (i.e. checking that byte) I can check if an interrupt occured, which might have resulted in the PE being wrong. But the stack probing will not tell me reliably that the PE was wrong. So when the stack probing tells me there was an interrupt that might have resulted in the wrong PE result, the correct thing to do is to redo the ld a, i.
        
        👍
        1
        
        If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
        
        Tony Pavlov - 2024-04-02
        
        Here is the ___sdcc_critical_enter by Sergey Belyashov. It is working correctly and there is no loop. But call/ret is slow, we need the same but inline, because on the SMS/MSX-like systems you are typically protecting only the couple of port writes. So we need to make it position-independent (that #>(l1+256) trick above).
        
        ___sdcc_critical_enter:: xor a push af pop af ld a, i di ret pe ;enabled interrupts dec sp dec sp pop af or a ;A = 0 if interrupts disabled jr NZ, 00100$ ;inetrrupts disabled sub a ;force P/V = 0 ret ;interrupts enabled 00100$: xor a ;force P/V = 1 ret
        
        combination of the both approaches may be:
        
        ld a, #>(l1+256) push af pop af l1: ld a, i di jp pe, l3 ; jump if interrupts enabled dec sp dec sp pop af sub #>(l1+256) jp NZ, l2 ; inetrrupts disabled jp l3 ; if ZF then P/V = 0 ; interrupts enabled l2: xor a, a ; force P/V = 1 l3: push af ; critical section payload ... pop af jp op, l4 ei l4:
        
        jp is faster on Z80, so use them
        
        Last edit: Tony Pavlov 2024-04-02
        
        If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
        
        Philipp Klaus Krause - 2024-04-02
        
        While Sergey's code is what SDCC currently uses for --nmos-z80, I now don't think it is fully correct. I think it has two bugs:
        1) Possible false positive: Consider the following case: Code placed at address >= 256. Interrupts are disabled, we enter that code. Just after the ld a, i a NMI happens. PE will be false, but the or a will give a nonzero value. Thus the function will return true, and at the end of the critical section, interrupts will be enabled, even though they were disabled before.
        2) Possible false negative: If that code happens to be placed at an address < 256, and interrupts are enabled, we enter that code, a normal interrupt happens at ld a, i, PE will be false, but the or a will give a zero value (since that stack byte contains the upper byte of the return address for the interrupt routine, which is 0 fr an address < 256). So after the critical section, interrupts will be disabled, though they were enabled before.
        
        Regarding the call/ret overhead: that is just a code speed / size trade-off, so I guess the variant should be chosen according to --opt-code-size vs. --opt-code-speed.
        
        👍
        1
        
        If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
        
        Tony Pavlov - 2024-04-02
        
        yes, you are right about the NMI. looks like looping is the only option.
        
        If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Philipp Klaus Krause - 2024-04-02

Do we know, which Z80 variants are affected, other than z80? r2k, r2ka and r3ka use a differnet approach, and should be fine. I've seen a claim that r800 is not affected (https://www.msx.org/forum/msx-talk/development/do-not-use-ld-ai-read-interrupt-enabled-state?page=0). Neither z180 nor ez80_z80 documentation has the "If an interrupt occurs during execution of this instruction, the Parity flag contains a 0." sentence, that z80 documentation has. z80n aims for bug-for-big compatibility with z80, so I assume that it is affected. sm83 and tlcs90 have only plain di/ei critical sections.
z80 with --nmos-z80 already uses something similar to what I proposed above.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
- Tony Pavlov - 2024-04-02
  
  actually, there is no much sense in knowing which z80 variants are affected from the point of writing the user code. because if you build for some target device, like ZX Spectrum, you never know which type of the CPU your code actually will run on, not even know if it is genuine Zilog or whatever obscure clones including something like the soviet Т34ВМ1. So more "strict" checking should probably be default.
  
  If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
  - Philipp Klaus Krause - 2024-04-02
    
    Well, if the user compiles for the z80 target, we will have to do the correct thing for Z80 (indeed we do not know which Z80-compatible system the code will run on). But if e.g. Z180 and Z180compatible devices all reliably report IFF2 in PE, we can still use the old, shorter code for users that compile for the z180 target.
    
    If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
    - Tony Pavlov - 2024-04-02
      
      well, is not user already pass his intention for z180 code with the -mz180 and there is no need for the additional flag --nmos-z80 neither its inverted version if you add one.
      
      so, my suggestion is that --nmos-z80(not the exact current call, but some optimized approach we discuss above!) should be the default behavior for the z80 systems, and the opposite behavior flag is needed instead, if the user is sure that his exact Z80 is 100% safe.
      
      Last edit: Tony Pavlov 2024-04-02
      
      If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
      - Philipp Klaus Krause - 2024-04-02
        
        Yes, removing --nmos-z80, making the safe version the default for z80 is the way forward. I still need to know which variants are affected, so I know if the "safe" version has to become the default for some of z180, ez80_z80, z80n, r800, too.
        
        If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Przemek Sulikowski - 2024-04-02

also on page 2 (3-130) starting from
"Q: I don't seem to get correct state"
http://z80.info/zip/ZilogProductSpecsDatabook129-143.pdf

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
- Philipp Klaus Krause - 2024-04-02
  
  Hmm. According to that document, only NMOS Z80 should be affected.
  
  If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
  - Przemek Sulikowski - 2024-04-02
    
    Calindro - emulicious's developer- wrote that it also affect CMOS version of some Game Gear devices
    
    If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
  - Janko Stamenović - 2024-04-02
    
    Just asked a person who was present in the discussions when Next's CPU was designed, apparently they decided to go with CMOS there, as then less code was needed for every check, and no known code depended on the older behavior.
    
    If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Z80 Critical section implementation is faulty

The Small Device C Compiler (SDCC), targeting 8-bit architectures

Searches

Help

#3721 Z80 Critical section implementation is faulty

Related

Discussion

Related

Related