When a PD request is sent, the timeout supervision of the related subscriber is not restarted, even though the standard (A.6.6.3) requires it.
...The issue occurred in one of the TRDP Conformance Test cases. Here, the device under test creates a subscriber for a telegram with a 400 ms timeout and with a callback.
Now the main thread waits for a semaphore given by the callback after receiving 4 PD telegrams.
4 PD telegrams are sent from another device with a cycle time of 100 ms. Nothing is sent from the other device after that.
Once the main thread gets the semaphore, it waits for 200ms, sends a request, takes the “start time” and waits for about a second.
During that second, the subscriber’s callback is triggered again once the PD timeout occurs, and takes the “stop time” of the timeout interval.
Without High Performance mode, this interval is about 400ms, as it should be.
In High Performance mode, this interval will be different and not always the same. Sometimes it’s up to 600ms.
...
Reply:
"This is a tradeoff between timely timeouts and CPU load at higher loads (~500+ subscribers). The downside of this approach is that timeouts are not detected promptly. However, I think that is not the priority in HIGH_PERF_INDEX mode. Not all subscriptions need to be checked in every iteration (this causes very high CPU load). In current devices, PD callbacks are not used and the data is only obtained through polling. While polling through tlp_get(), the user will always get the updated status (data received or timeout) of the subscriber."
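To illustrate the tradeoff described in the reply, here is a minimal sketch of a timeout that is only evaluated on a fixed check grid. This is a simplified model, not TRDP code: the function name detect_time_us and its parameters are hypothetical, and all times are assumed to be in microseconds. It only shows that the detection time depends on where the deadline falls relative to the check grid; it does not attempt to reproduce the 600 ms figure reported above.

```c
#include <stdint.h>

/* Hypothetical model, not taken from the TRDP sources.
 * Timeouts are evaluated only at first_check_us, first_check_us + cycle_us, ...
 * Returns the time of the first check at or after the deadline. */
static uint64_t detect_time_us(uint64_t deadline_us,
                               uint64_t first_check_us,
                               uint64_t cycle_us)
{
    if (cycle_us == 0u)
    {
        return deadline_us;            /* model of "checked continuously" */
    }
    if (deadline_us <= first_check_us)
    {
        return first_check_us;         /* caught by the very first check */
    }
    uint64_t elapsed = deadline_us - first_check_us;
    uint64_t ticks   = (elapsed + cycle_us - 1u) / cycle_us;   /* round up */
    return first_check_us + ticks * cycle_us;
}
```

In this model, a 400 ms deadline with a 100 ms check cycle is detected anywhere between 400 ms and just under 500 ms, depending on the phase of the check grid. Halving the check cycle halves that spread.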
Solution:
The fixed 100ms check cycle for telegrams with cycle times > 100ms can be reduced during compile time via CFLAGS += -DTRDP_TO_CHECK_CYCLE=50000.
See sample build config file LINUX_X86_64_HP_conform_config.
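A compile-time setting like this is usually implemented as a macro with a default that a -D flag can override. The sketch below shows that general pattern only; it is not copied from the TRDP sources. It assumes the value is in microseconds (suggested by 50000 being offered as a reduction of the 100 ms cycle) and that the default is 100000. The helper check_cycle_ms is hypothetical.

```c
#include <stdint.h>

/* Assumed default: 100 ms expressed in microseconds.
 * Can be overridden at build time, e.g. CFLAGS += -DTRDP_TO_CHECK_CYCLE=50000 */
#ifndef TRDP_TO_CHECK_CYCLE
#define TRDP_TO_CHECK_CYCLE 100000u
#endif

/* Hypothetical helper: the configured check cycle in milliseconds. */
static uint32_t check_cycle_ms(void)
{
    return (uint32_t)(TRDP_TO_CHECK_CYCLE) / 1000u;
}
```

Built without the flag, check_cycle_ms() yields 100 under these assumptions; built with -DTRDP_TO_CHECK_CYCLE=50000, it yields 50.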
It appears that the bug is related to the High Performance mode.
Different behaviour in High Perf Mode.