Hello Everyone,
I have just started using the ZMQ binding and have noticed something with this test setup:
I publish 10000 messages in a for loop at max speed (no wait), but on the receiving side I do not receive all 10k messages; some of them are missing. If I introduce a very small delay, there are no missing messages. Do any of you have an idea why this is happening? Thank you in advance!
-Imre
My guess is that without the delay, the messages queue up faster than they are removed, so the number of queued messages exceeds the high-water mark (HWM) and some messages are silently dropped. By default this is 1000 messages, but you can increase it with the setsockopt VI. It would be worth considering whether this is a realistic scenario though: do you really expect to have several thousand messages in the queue at once? I would suggest that's evidence of needing to change your protocol as opposed to raising the HWM.
It may even simply be a question of thread switching in your stress test, where putting the timer in allows the receiver to execute, regardless of how small the delay actually is.
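To make the dropping mechanism above concrete, here is a toy model in plain Python. It is not ZMQ code and does not reproduce ZMQ's internal queues or batching; the function name `simulate` and its parameters are made up for illustration. It only shows how a sender that never blocks loses messages once a bounded queue is full, and how either a larger limit or a receiver that keeps pace avoids the loss.

```python
from collections import deque


def simulate(total, hwm, recv_every):
    """Toy model of a non-blocking sender feeding a bounded queue.

    total      -- number of messages the sender tries to send
    hwm        -- maximum number of messages the queue may hold
    recv_every -- the receiver removes one message per `recv_every` sends
    Returns (delivered, dropped).
    """
    queue = deque()
    delivered = 0
    dropped = 0
    for seq in range(total):
        if len(queue) < hwm:
            queue.append(seq)   # room left: message is queued
        else:
            dropped += 1        # queue full: message is silently dropped
        # The receiver only gets a turn every `recv_every` sends.
        if (seq + 1) % recv_every == 0 and queue:
            queue.popleft()
            delivered += 1
    # After the sender finishes, the receiver drains whatever is left.
    delivered += len(queue)
    return delivered, dropped


if __name__ == "__main__":
    # Sender twice as fast as the receiver, limit of 1000 messages.
    print(simulate(10_000, 1_000, 2))
    # Same speeds, limit raised to cover the whole burst.
    print(simulate(10_000, 10_000, 2))
    # Receiver keeps pace with the sender (roughly what a small delay achieves).
    print(simulate(10_000, 1_000, 1))
```

In this model the first case loses messages while the other two do not, which matches the two remedies discussed in the thread: raise the limit, or slow the sender down slightly.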
Dear Martin,
thank you very much for the quick answer! I would like to apologize up front because I am still just learning ZMQ, so there is quite a lot that I don't understand yet. The stress test may not be a realistic situation; I was just concerned because there is a possibility that I am missing something fundamental. Following your suggestions I tried some modifications, like increasing the HWM and forcing the receiver to execute after the sender has finished sending, even forcing it onto a different thread by placing it in a timed loop, but there are still missing messages. :(
Last edit: Imre Szebelledi 2020-04-19
The SNDHWM and RCVHWM are different - make sure to set the RCVHWM on the receiver. I also recommend setting any socket options before bind or connect to ensure they propagate correctly.
I suspect the timed loop on the receiver will make it worse because 1 kHz is very slow. An extremely short delay in the sender is a better approach. The earlier example you had used a delay of 1 µs, which is very quick but sufficient to do the job.
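For readers more familiar with text-based bindings, the ordering described above might look roughly like this in Python with pyzmq. This is only a sketch under assumptions: it uses pyzmq rather than the LabVIEW binding (where the setsockopt VI plays the same role), the function name `stress_test` and the endpoint `inproc://hwm-demo` are made up, and the in-process transport is chosen purely to keep the example self-contained.

```python
import time

import zmq


def stress_test(n, hwm):
    """Send n messages over PUB/SUB as fast as possible; return how many arrive."""
    ctx = zmq.Context()

    pub = ctx.socket(zmq.PUB)
    pub.setsockopt(zmq.SNDHWM, hwm)      # send-side limit, set BEFORE bind
    pub.bind("inproc://hwm-demo")        # hypothetical endpoint name

    sub = ctx.socket(zmq.SUB)
    sub.setsockopt(zmq.RCVHWM, hwm)      # receive-side limit, set BEFORE connect
    sub.setsockopt(zmq.SUBSCRIBE, b"")   # subscribe to every message
    sub.connect("inproc://hwm-demo")

    time.sleep(0.2)                      # give the subscription time to reach the publisher

    for i in range(n):
        pub.send(b"%d" % i)              # PUB never blocks; it drops when the limit is hit

    received = 0
    while sub.poll(timeout=500):         # stop once nothing arrives for 500 ms
        sub.recv()
        received += 1

    sub.close(linger=0)
    pub.close(linger=0)
    ctx.term()
    return received


if __name__ == "__main__":
    print(stress_test(10_000, 10_000))
```

Setting both limits at least as large as the burst, and doing so before bind and connect, is the combination recommended in this thread; with smaller limits the same function can be used to observe how many messages go missing.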
I just want to let you know that setting the socket option before bind did solve the problem. Thank you very much for your help!
Yeah, I have noticed the error I made, but unfortunately it does not make a difference. I set the linger time to a long one so that the slow timed loop would have time to finish receiving the messages. My idea was to finish sending the 10k messages quickly, then start receiving them slowly, and that's it. Unfortunately it is the same situation: a few hundred messages are lost. Despite this I think it is an awesome library and I plan to use it with our RT systems as well, as I have seen that some people here have successfully built it for an RT OS.
Thanks for letting me know it works. The API documentation suggests it should be possible to set the HWMs after bind/connect, but in my experience it is best to set them beforehand to ensure that the changes take effect immediately.