PubSub missing messages

2020-04-18
2020-04-19
  • Imre Szebelledi

    Imre Szebelledi - 2020-04-18

    Hello Everyone,

    I have just started using the ZMQ binding and have noticed something with this test setup:
    I publish 10,000 messages in a for loop at maximum speed (no wait), but the receiving side does not get all 10k messages; some of them are missing. If I introduce a very small delay, no messages are missing. Does anyone have an idea why this is happening? Thank you in advance!
    -Imre

     
  • Martijn Jasperse

    My guess is that without the delay, the messages queue up faster than they are removed, so the number of queued messages exceeds the high-water mark (HWM) and some messages are silently dropped. By default the HWM is 1000 messages, but you can increase it with the setsockopt VI. It would be worth considering whether this is a realistic scenario though - do you really expect to have several thousand messages in the queue at once? I would suggest that's evidence of needing to change your protocol as opposed to raising the HWM.

    It may even simply be a question of thread switching in your stress test, where putting the timer in allows the receiver to execute, regardless of how small the delay actually is.
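
The high-water-mark explanation above can be illustrated with a toy model. This is only a sketch of a single bounded queue that drops on overflow, not ZMQ itself; the function name `simulate` and its tick-based parameters are made up for the illustration. Real sockets run sender and receiver concurrently and have separate send-side and receive-side queues, so actual loss counts will differ.

```python
from collections import deque


def simulate(total, hwm, sends_per_tick, recvs_per_tick):
    """Toy model: a bounded queue that silently drops messages once it holds `hwm` items.

    Returns (delivered, dropped). All parameters are hypothetical knobs for
    this illustration, not ZMQ options.
    """
    if recvs_per_tick <= 0:
        raise ValueError("recvs_per_tick must be positive")
    queue = deque()
    sent = delivered = dropped = 0
    while sent < total or queue:
        # Sender phase: try to enqueue up to sends_per_tick messages.
        for _ in range(min(sends_per_tick, total - sent)):
            if len(queue) >= hwm:
                dropped += 1          # queue is at the high-water mark
            else:
                queue.append(sent)
            sent += 1
        # Receiver phase: dequeue up to recvs_per_tick messages.
        for _ in range(min(recvs_per_tick, len(queue))):
            queue.popleft()
            delivered += 1
    return delivered, dropped


# Burst sender: everything arrives before the receiver gets a turn.
print(simulate(10000, hwm=1000, sends_per_tick=10000, recvs_per_tick=100))
# Paced sender: the receiver keeps up, so the queue never fills.
print(simulate(10000, hwm=1000, sends_per_tick=100, recvs_per_tick=100))
```

In this model the burst case loses everything beyond the first 1000 messages, while the paced case loses nothing, which is the same qualitative effect as adding a tiny delay in the sender loop.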

     
  • Imre Szebelledi

    Imre Szebelledi - 2020-04-19

    Dear Martin,

    thank you very much for the quick answer! I would like to apologize up front because I am still just learning ZMQ, so there is quite a lot that I don't understand yet. The stress test may not be a realistic situation; I was just concerned because there is a possibility that I am missing something fundamental. Following your suggestions, I tried some modifications, like increasing the HWM and forcing the receiver to execute after the sender has finished sending, even forcing it onto a different thread by putting it inside a timed loop, but there are still missing messages. :(

     

    Last edit: Imre Szebelledi 2020-04-19
  • Martijn Jasperse

    The SNDHWM and RCVHWM are different - make sure to set the RCVHWM on the receiver. I also recommend setting any socket options before bind or connect to ensure they propagate correctly.

    I suspect the timed loop on the receiver will make it worse because 1 kHz is very slow. An extremely short delay in the sender is a better approach - your earlier example used a delay of 1 µs, which is very short but sufficient to do the job.
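
For readers not using LabVIEW, the advice above (set SNDHWM on the publisher and RCVHWM on the subscriber, both before bind or connect) might look like this in Python with pyzmq. This is a minimal sketch, not the LabVIEW binding's API; the helper name `make_pub_sub`, the `inproc://hwm-demo` endpoint and the HWM value of 20000 are assumptions chosen for the example.

```python
import zmq


def make_pub_sub(endpoint="inproc://hwm-demo", hwm=20000):
    """Create a PUB/SUB pair with raised high-water marks (hypothetical helper)."""
    ctx = zmq.Context()

    pub = ctx.socket(zmq.PUB)
    pub.setsockopt(zmq.SNDHWM, hwm)      # send-side HWM, set before bind
    pub.bind(endpoint)

    sub = ctx.socket(zmq.SUB)
    sub.setsockopt(zmq.RCVHWM, hwm)      # receive-side HWM, set before connect
    sub.setsockopt(zmq.SUBSCRIBE, b"")   # subscribe to every message
    sub.connect(endpoint)

    return ctx, pub, sub
```

The values can be read back with `pub.getsockopt(zmq.SNDHWM)` and `sub.getsockopt(zmq.RCVHWM)` to confirm they were applied to the intended socket.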

     
    • Imre Szebelledi

      Imre Szebelledi - 2020-04-19

      I just want to let you know that setting the socket option before bind did solve the problem. Thank you very much for your help!

       
  • Imre Szebelledi

    Imre Szebelledi - 2020-04-19

    Yeah, I have noticed the error I made, but unfortunately it does not make a difference. I set the linger time to a long value so that the slow timed loop would have time to finish receiving the messages. My idea was to finish sending the 10k messages quickly, then receive them slowly, and that's it. Unfortunately the situation is the same: a few hundred messages are lost. Despite this, I think it is an awesome library and I plan to use it with our RT systems as well, as I have seen that some people here have successfully built it for an RT OS.

     
  • Martijn Jasperse

    Thanks for letting me know it works. The API documentation suggests it should be possible to set the HWMs after bind/connect, but in my experience it is best to set them beforehand so that the changes take effect immediately.

     
