Calling AbstractSnmpContext.destroy() while a Transmitter is sending a Pdu can cause a deadlock:
Thread 1:
Name: 192.168.70.164_161_null_v0_Trans0
State: BLOCKED on uk.co.westhawk.snmp.stack.SnmpContext@259478 owned by: SnmpCollector
Total blocked: 1 Total waited: 0
Stack trace:
uk.co.westhawk.snmp.stack.AbstractSnmpContext.sendPacket(AbstractSnmpContext.java:612)
uk.co.westhawk.snmp.stack.Pdu.sendme(Pdu.java:256)
uk.co.westhawk.snmp.stack.Pdu.transmit(Pdu.java:681)
uk.co.westhawk.snmp.stack.Pdu.transmit(Pdu.java:669)
uk.co.westhawk.snmp.stack.Transmitter.run(Transmitter.java:80)
- locked uk.co.westhawk.snmp.stack.Transmitter@ff9a49
java.lang.Thread.run(Unknown Source)
Thread 2:
Name: SnmpCollector
State: BLOCKED on uk.co.westhawk.snmp.stack.Transmitter@ff9a49 owned by: 192.168.70.164_161_null_v0_Trans0
Total blocked: 25.307 Total waited: 24.732
Stack trace:
uk.co.westhawk.snmp.stack.Transmitter.stand(Transmitter.java:144)
uk.co.westhawk.snmp.stack.Transmitter.destroy(Transmitter.java:164)
uk.co.westhawk.snmp.stack.AbstractSnmpContext.freeTransmitters(AbstractSnmpContext.java:922)
uk.co.westhawk.snmp.stack.AbstractSnmpContext.destroy(AbstractSnmpContext.java:457)
- locked uk.co.westhawk.snmp.stack.SnmpContext@259478
If someone interested in the fix I have made one for myself (sorry I can't make the proper patch) and I'm still testing it.
There are changes in 2 methods in class:
uk.co.westhawk.snmp.stack.Transmitter
Last edit: Zur13 2018-12-04
Thanks for this, if you could supply a diff, that would be great.
Tim.
Related
Bugs: #1
I'm still running the tests but while this fixes the deadlock there is the hidden issue with the lib architecture in there related to this deadlock. This deadlock happens in the case when the context was closed before the PDU was sent so if the deadlock will be fixed there will still be the issue: in some cases some PDUs will not be sent to devices due to race condition because the required context is destroyed before the actual data send happens.
Last edit: Zur13 2018-12-04
I suppose you could delay the destruction of the context until all pending PDUs are sent.
That however might leave you with a problem on messages that expect an ack - you could be
waiting a while for destroy to happen.
You could have a destroyWhenIdle() type method - which fires a callback when complete
T.
Related
Bugs: #1
Ok, here is possible patch but I did't test it at all. Transmitter is changed so it calls pdu.transmit() outside of synchronize block so the transmit sequence don't capture the Transmitter instance monitor. The AbstractSnmpContext.destroy() is changed so it uses new flag which prevents new PDUs to be added tothe context after the destroy is called and awaits for known PDUs to get answers.