I would appreciate a pair of expert eyes on the FreeRTOS timer implementation. Because the issue is timing-dependent, I have not been able to write a snippet that reproduces it reliably. However, someone should be able to follow my logic here.
How to reproduce:
NOTE: This is a race condition, so it is timing-dependent.
What is expected:
A timer should never execute after the timer task has received tmrCOMMAND_STOP (and/or tmrCOMMAND_DELETE).
What is observed:
A timer can execute after being stopped and deleted, resulting in access to freed memory.
Annotated excerpt from log:
timer_expired: 0x3ffb6e6c # timer running normally
timer_cmd_send: 0x3ffb6e6c 0 4228 1
timer_cmd_receive: 0x3ffb6e6c 0 4228
timer_expired: 0x3ffb6e6c
timer_cmd_send: 0x3ffb6e6c 0 4238 1
timer_cmd_receive: 0x3ffb6e6c 0 4238
timer_expired: 0x3ffb6e6c
timer_cmd_send: 0x3ffb6e6c 3 0 1 # timer stop command sent
timer_cmd_send: 0x3ffb6e6c 0 4248 1
timer_cmd_send: 0x3ffb6e6c 5 0 1 # timer delete command sent
timer_cmd_receive: 0x3ffb6e6c 3 0 # timer stop command received
timer_cmd_receive: 0x3ffb6e6c 0 4248
timer_expired: 0x3ffb6e6c
timer_cmd_send: 0x3ffb6e6c 0 4258 1
timer_cmd_receive: 0x3ffb6e6c 5 0 # timer delete command received
timer_cmd_receive: 0x3ffb6e6c 0 4258
timer_expired: 0x3ffb6e6c
timer_cmd_send: 0x3ffb6e6c 0 4268 1 # timer still being reloaded ?!
timer_cmd_receive: 0x3ffb6e6c 0 4268
timer_expired: 0x3ffb6e6c
timer_cmd_send: 0x3ffb6e6c 0 4278 1
timer_cmd_receive: 0x3ffb6e6c 0 4278
Theory:
I suspect the issue has to do with the reload logic sending a START_DONT_TRACE command which ends up behind any STOP/DELETE commands on the queue. Then the STOP/DELETE occurs, and the old START_DONT_TRACE still gets executed, without checking that the timer has been deleted.
Note that, in the full log attached, you can see that I had already created and deleted a timer as an earlier part of a unit test, and the new timer was allocated the same handle address. This may be the true cause of the issue: after freeing (DELETE), the timer task does not flush the queue of all references to that timer. Some combination of deleting and re-creating may cause a race condition with the cached commands on the queue.
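To make the theory above easier to follow, here is a toy model in plain C. It is not the code in timers.c; it only assumes that the timer task drains its command queue strictly in FIFO order and does not check whether the handle is still valid. The command numbers (0, 3, 5) are taken from the log excerpt; the model_* names and the freed flag are made up for illustration, and nothing is actually freed.

```c
#include <stdbool.h>

/* Command IDs as they appear in the log excerpt:
   0 = START_DONT_TRACE (reload), 3 = STOP, 5 = DELETE. */
enum { CMD_START_DONT_TRACE = 0, CMD_STOP = 3, CMD_DELETE = 5 };

/* Hypothetical stand-in for a timer object. */
typedef struct {
    bool active; /* timer would be on the active list */
    bool freed;  /* timer memory would have been released */
} model_timer_t;

/* Hypothetical queue entry: which timer, which command. */
typedef struct {
    model_timer_t *timer;
    int id;
} model_cmd_t;

/* Drain the queue in FIFO order, as a single timer task would.
   Returns how many commands touched a timer that was already
   marked freed (a use-after-free on a real system). */
static int model_drain(const model_cmd_t *queue, int count)
{
    int stale = 0;
    for (int i = 0; i < count; i++) {
        model_timer_t *t = queue[i].timer;
        if (t->freed) {
            stale++; /* stale command for a deleted timer */
        }
        switch (queue[i].id) {
        case CMD_START_DONT_TRACE:
            t->active = true; /* no check that the timer still exists */
            break;
        case CMD_STOP:
            t->active = false;
            break;
        case CMD_DELETE:
            t->active = false;
            t->freed = true;
            break;
        default:
            break;
        }
    }
    return stale;
}

/* Replays the receive order seen in the log:
   STOP, START (4248), DELETE, START (4258). */
static int model_run_log_sequence(model_timer_t *t)
{
    const model_cmd_t queue[] = {
        { t, CMD_STOP },
        { t, CMD_START_DONT_TRACE },
        { t, CMD_DELETE },
        { t, CMD_START_DONT_TRACE },
    };
    return model_drain(queue, 4);
}
```

In this model the last START lands after the DELETE, so the timer ends up both marked freed and active again, which is the state I suspect the real timer task reaches.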
Version info:
I'm using FreeRTOS on ESP32. The exact implementation of FreeRTOS is as of the following commit in ESP-IDF: https://github.com/espressif/esp-idf/tree/92c469b5993b001a21970f0e58b25acd3f375807/components/freertos
However, the timer implementation seems minimally changed in FreeRTOS 10, so I anticipate the above described issue to still be present.
Hi Chris,
Thanks for reporting and the interest in FreeRTOS!
I am currently investigating the issue and I do have a few questions.
1- Are you observing access to freed memory?
2- Did the timer execute after the stop and delete commands had been executed, or only after they had been inserted in the queue?
3- Does the task inserting in the queue have a higher priority than the task processing the timer queue?
4- Are any other tasks running at the same time with a higher priority than the task processing the timer queue?
Thanks,
Alfred
Hi, we will be closing the issue as there has been no response for 12 days.
I will post some clarifications about the issue.
The timer commands are received in a queue and processed one after the other. The same single task also walks the timer list and executes the timer callbacks.
If that task is going through the list of timers and is about to execute "Timer A", and a higher-priority task puts a command on the queue (say, stop and delete commands for "Timer A"),
"Timer A" will still be executed.
When the task finishes traversing the list of timers, it will go through the queue of commands and remove the timer, and only after that will the timer stop executing. This is by design and is not a bug in FreeRTOS.
As a workaround, if the timer needs to be killed immediately after sending a Stop/Delete command,
a global variable can be used as a signal in the timer function: set it right before enqueuing the Stop/Delete command,
then check it when the timer callback is executed, and exit the function if the variable is set (don't forget to reset it right before exiting).
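A rough sketch of that flag in plain C, so it can be tried off target. The names timer_a_cancelled, timer_a_callback_body, timer_a_cancel and work_done are made up for illustration; on a real system the callback body would sit inside the function registered with the timer, and the xTimerStop / xTimerDelete calls would go where the comment indicates. Using a C11 atomic for the flag is my own assumption, not something the workaround requires.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical global flag: true means "Timer A is being stopped". */
static atomic_bool timer_a_cancelled = false;

/* Stands in for whatever real work the callback performs. */
static int work_done = 0;

/* Body of the timer callback. If the flag is set, clear it and
   return without doing any work; otherwise run normally. */
static void timer_a_callback_body(void)
{
    /* atomic_exchange reads the flag and resets it in one step,
       matching "reset it right before exiting". */
    if (atomic_exchange(&timer_a_cancelled, false)) {
        return;
    }
    work_done++;
}

/* Called by the task that wants the timer gone. */
static void timer_a_cancel(void)
{
    /* Set the flag first, so a callback that is already about to
       run sees it... */
    atomic_store(&timer_a_cancelled, true);
    /* ...then, on target, enqueue the Stop/Delete commands here
       (xTimerStop and xTimerDelete on the timer handle). */
}
```

Because the flag is cleared by the first callback that sees it, this only swallows one late execution; if more than one could slip through, the flag would have to stay set until the timer is known to be gone.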
Caution: If the issue consists of accessing previously deleted/freed memory, then it would definitely be a bug, and the issue should be reopened.
Thanks,
Alfred