Merged code from V5-8-patches (Nov 2018) running on Linux Debian Jessie based system is seeing
regular core dumps.
The problems appears to be a double free in snmplib/snmpusm.c: 1936 usm_free_usmStateReference(secStateRef)
Would an acceptable fix be to just check this pointer before freeing? I don't see how this is
can be triggered (since I can't recreate the problem).
syslogs and full backtrace is attached.
For some reason, I couldn't paste into chrome: Here's the syslog:
and the backtrace:
This looks very similar to a bug reported on RedHat systems:
https://bugzilla.redhat.com/show_bug.cgi?id=1663027
Hi Sam! Does requesting a small messageMaxSize cause this more often, like,
snmpbulkget --sendMessageMaxSize=%d -Cn%d -Cr%d ...
We had some problems in this area and I forget if this was the cause. I don't remember the details of when the crash happened for us, the regression test uses
for i in [ None, 65530, 32768, 32767, 9000, 1600, 1472 ] + range( 1400, 40, -40):
and I have a vague memory of it starting at 1472.
Bill
Yes, that helps a lot. I can crash it at will now. An snmpbulkget like this
will crash a recent (V5-8-patches as of dc3194eaecb4 Dec 28, 2019) debug build.
snmpbulkget -v3 -Cn1 -Cr1472 -l authPriv -u testuser -a SHA -A testshashasha -x AES -X testaesaesaes localhost 1.3.6.1.2.1.1.5 1.3.6.1.2.1.1.7
A full backtrace looks like the following:
It looks like the pdu->securityStateRef is never set correctly in this case.
At least all the up to frame 18:
deleted...this will never work.
Last edit: Sam Tannous 2019-02-15
Valgrind shows that memory is freed twice:
I don't quite understand how this can happen. It looks like the calloc address is slightly different then the two freed addresses.
Just noticed this " * FIX Memory leaks if secStateRef ..." I guess we need to fix this ;-)
Our analysis ended up being that this was an API flaw and the code added in
5.8 could never have worked - when rebuilding the pdu forward, we need the
info that was freed while trying to build the pdu backward. I think the
right fix is to move the functionality into a new function whose API is to
not free on failure, and use that when building if you think you might try
to rebuild the pdu forward.
Bill
On Fri, Feb 15, 2019 at 3:46 PM Sam Tannous stannous-cn@users.sourceforge.net wrote:
Related
Bugs:
#2923(v3doublefree2.patch is slightly better in that it creates an snmp_api function
snmp_free_securityStateRef() that handles only the securityStateRef freeing.)
This patch avoids the free on failures. It also avoids the double free problem if
free_securityStateRef() is done in snmp_free_pdu().
Instead, the free for pdu->securityStateRef is done in free_agent_snmp_session() just before
the pdu(s) are freed. As a bonus, it does not appear to leak memory ;-) I had to remove the
"static" from free_securityStateRef in snmp_api so I could call it from snmp_agent.c.
I'm not sure if this is the right thing to do here so comments are welcome.
--Sam Tannous
Last edit: Sam Tannous 2019-02-25
Version 3 of this patch has been applied on the v5.8 and master branches. Thanks for the patch!
Hi Bart / Sam,
There is a minor bug in this version 3 patch.
The variable usr_auth_protocol_length should be usr_priv_protocol_length.
Last edit: Ming Chen 2019-06-05
This patch has been checked in on the v5.8 and master branches. Thanks for the patch!
Thanks. Do you think we need to keep the condition style consistent with the original patch? Or update the orginal to yours. Please see the incline comment as below. But I'm OK if you don't want to make another change.
Last edit: Ming Chen 2019-06-06
It seems like I misread your patch. I have replaced the patch that had been pushed out by a version that uses usr_priv_protocol_length instead of usr_priv_protocol.
Thanks for your patience , Bart.
Also, there is another change for formatting the code, as part of the original patch doesn't comply with the net-snmp spacing style. Please see the attched patch file.
That patch has been applied. Thanks for the patch.
Sam, it would help a lot if a reproducer for this issue would be added to the Net-SNMP source tree in the form of a regression test. That will help to prevent that this issue gets reintroduced. Do you perhaps know how to reproduce this issue easily?
With a hint from Bill, see 4th comment above, I could reliably reproduce it with something like this:
My attempts so far to reproduce the double free with the following test were unsuccessful: