Q_ASSERT in publishFromISR

2024-05-14
2024-05-15
  • Sander Huijsen

    Sander Huijsen - 2024-05-14

    Hi!

    I ran into a rather obscure issue. I don't think I've seen this happen before.

    In the function publishFromISR, the following ASSERT triggered:

    // no need to lock the scheduler in the ISR context
    do { // loop over all subscribers
            // the prio of the AO must be registered with the framework
            uxSavedInterruptStatus = portSET_INTERRUPT_MASK_FROM_ISR();
            Q_ASSERT_INCRIT(510, registry_[p] != nullptr); // <<<<<-----
            portCLEAR_INTERRUPT_MASK_FROM_ISR(uxSavedInterruptStatus);
    

    (qf_port.cpp version 7.3.1)

    The comment "the prio of the AO must be registered with the framework" gives a clue as to what is wrong.

    I'm just wondering what might cause this to happen. What could the code do wrong that leads to this function getting called and asserting here? (Unfortunately, I have no more context.)

    Regards,
    Sander

     
  • Quantum Leaps

    Quantum Leaps - 2024-05-14

    Hi Sander,
    This again comes down to the special (I call it bizarre) design of FreeRTOS, where a duplicated "FromISR" API exists alongside the "normal" API. So, the first thing to check is whether publishFromISR() was indeed called from an ISR context (this includes FreeRTOS callbacks that run from ISRs). If the function was called from a task context, the ISR-level critical section used inside publishFromISR() might not work correctly, so the subscriber list might be changing while it is being traversed. For example, the AO indicated by the priority p might have been "stopped".

    To start the investigation, I would run in a debugger and when the assertion fires, I would examine whether indeed the CPU is in the ISR context. You don't bother to mention which CPU you are running, so I can't tell how to check it. For ARM Cortex-M, you could examine the IPSR register, which should be 0 for task context and non-zero for ISR context.
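
    The IPSR check described above can also be done in code, not only in the debugger. The following is a minimal sketch, not part of QP or FreeRTOS: the helper names isIsrContext(), readIpsr() and assertCalledFromIsr() are hypothetical, and the inline assembly assumes a GCC-compatible compiler targeting an ARM M-profile core. On any other build, readIpsr() is stubbed to return 0 so the decoding logic can still be compiled and tested on a host. (CMSIS-Core offers __get_IPSR() for the same register read.)

```cpp
#include <cassert>
#include <cstdint>

// IPSR holds the active exception number in bits [8:0];
// 0 means Thread mode (task context), non-zero means Handler mode (ISR).
constexpr std::uint32_t IPSR_EXCEPTION_MASK = 0x1FFU;

// Hypothetical helper: decode a raw IPSR value.
constexpr bool isIsrContext(std::uint32_t ipsr) {
    return (ipsr & IPSR_EXCEPTION_MASK) != 0U;
}

#if defined(__GNUC__) && defined(__ARM_ARCH_PROFILE) && (__ARM_ARCH_PROFILE == 'M')
// Hypothetical helper: read IPSR on a Cortex-M target.
static inline std::uint32_t readIpsr() {
    std::uint32_t ipsr;
    __asm volatile ("MRS %0, ipsr" : "=r" (ipsr));
    return ipsr;
}
#else
// Host-build stub (assumption): always reports Thread mode.
static inline std::uint32_t readIpsr() { return 0U; }
#endif

// Hypothetical guard: call just before a "FromISR" API to catch
// accidental use from a task context.
static inline void assertCalledFromIsr() {
    assert(isIsrContext(readIpsr()));
}
```

    Placing a guard like assertCalledFromIsr() immediately before the publish-from-ISR call would flag a task-context caller at the call site, before the framework's own assertion fires.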

    Next, I would check the p priority and see which AO this corresponds to. I always strongly discourage "stopping" AOs (the QActive::stop() API) because it is rather tricky to stop an AO cleanly. One of the reasons is that the AO might still participate in some event exchanges (like event publishing), which might lead to disasters like the one you observe. So, one question to ask is whether your application tries to stop some AOs.

    --MMS

     

    Last edit: Quantum Leaps 2024-05-14
  • Sander Huijsen

    Sander Huijsen - 2024-05-15

    Hi Miro.

    Thanks for your comprehensive response.

    I'm using an STM32L452, which is a Cortex-M4.

    Also, I never use QActive::stop() anywhere. I only ever start AOs and the only way they are stopped is from a power cycle/reset. In fact, the macro QACTIVE_CAN_STOP is undefined, so the code isn't even included.

    Thanks for the pointers. Looks like the problem may be even more obscure than I thought.

    Many thanks,
    Sander

     
