Thanks for not giving up, and for finally verifying that it works as expected! Even though it has not yet been tested with an older version of nvidia-smi, the changes to the code are so trivial that I expect to release version 1.3 later today. Thanks again!
Interesting! And the snmp daemon has been restarted since nvgpu-snmp.so was installed, as you rebooted. However, you wrote something about uninstalling and reinstalling net-snmp; could it be that the running snmpd is using modules from some directory other than /usr/x86_64-linux-gnu/snmp/dlmod? In what way did the compilation of nvgpu-snmp interfere with your snmp installation? During compilation, netsnmp-config is used to figure out where to install nvgpu-snmp.so, maybe that directory changes if...
I now realize that during "make install", the "install" command is called with the "-s" switch when installing nvgpu-snmp.so, which strips some symbols from the binary, making it smaller. However, there is still one way to check that the new version is installed: strings /usr/x86_64-linux-gnu/snmp/dlmod/nvgpu-snmp.so | grep power The above command should show the new fields average_power_draw and instant_power_draw; if not, the installed nvgpu-snmp.so is some older version. If those fields do not...
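The check above can be wrapped in a small function so that it gives a yes/no answer instead of output to read by eye. This is only a sketch: the function name has_power_fields is made up for illustration, and it uses grep -a on the binary rather than strings, on the assumption that a grep supporting -a (GNU or BSD) is available.

```shell
#!/bin/sh
# has_power_fields MODULE
# Hypothetical helper: succeeds only if MODULE contains both of the
# field names mentioned above as plain byte sequences.
has_power_fields() {
    module=$1
    for field in average_power_draw instant_power_draw; do
        # -a: treat the binary as text, -F: fixed string, -q: exit status only.
        # LC_ALL=C avoids locale-dependent handling of non-text bytes.
        LC_ALL=C grep -aFq "$field" "$module" || return 1
    done
}
```

It could then be called as: has_power_fields /usr/x86_64-linux-gnu/snmp/dlmod/nvgpu-snmp.so && echo "new fields present"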
Thanks! With that line in snmpd.conf you should not have nvgpu_agentxd running as a separate process. Instead, please check that your /usr/x86_64-linux-gnu/snmp/dlmod/nvgpu-snmp.so has a size that matches your newly compiled nvgpu-snmp.so. Also, just to make sure, check that the timestamp of /usr/x86_64-linux-gnu/snmp/dlmod/nvgpu-snmp.so is the same as or newer than that of your newly compiled file.
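The timestamp part of that check could be scripted roughly as follows. This is a sketch only: installed_is_current is a made-up name, and it relies on the -ot file test, which bash, dash and ksh provide but POSIX does not require. It deliberately does not compare sizes.

```shell
#!/bin/sh
# installed_is_current INSTALLED BUILT
# Hypothetical helper. Returns 0 if INSTALLED is the same age as or newer
# than BUILT, 1 if it is older, and 2 if either file is missing.
installed_is_current() {
    installed=$1
    built=$2
    [ -f "$installed" ] && [ -f "$built" ] || return 2
    # -ot is true when the left-hand file is older than the right-hand one
    if [ "$installed" -ot "$built" ]; then
        return 1
    fi
    return 0
}
```

For example, from the build directory (the ./nvgpu-snmp.so path is an assumption about where the compiled file ends up): installed_is_current /usr/x86_64-linux-gnu/snmp/dlmod/nvgpu-snmp.so ./nvgpu-snmp.so || echo "installed module looks stale"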
Lacking nvidia-smi, I made a quick hack to my code to read the XML data you provided instead of calling nvidia-smi. I then got the following output: tuxedo:~> snmpwalk -c public -v2c localhost 1.3.6.1.4.1.2021.13.42.2 UCD-SNMP-MIB::ucdExperimental.42.2.1.1.0 = INTEGER: 0 UCD-SNMP-MIB::ucdExperimental.42.2.1.1.1 = INTEGER: 1 UCD-SNMP-MIB::ucdExperimental.42.2.1.2.0 = STRING: "NVIDIA GeForce RTX 2060" UCD-SNMP-MIB::ucdExperimental.42.2.1.2.1 = STRING: "NVIDIA GeForce RTX 4060 Ti" UCD-SNMP-MIB::ucdExperimental.42.2.1.3.0...
Thanks for your attempt! I was hoping that my changes would help and use average_power_draw and instant_power_draw for your version of nvidia-smi. Just to double-check that the right version is running: ps -ef | grep nvgpu_agentxd | grep -v grep (just to make sure that only one such process is running), and then: ls -alH /proc/`ps -ef | grep nvgpu_agentxd | grep -v grep | awk '{print $2}'`/exe And make sure that the size and timestamp match those of your newly compiled binary. If size and timestamp...
Fine! I have pushed my changes now, but have not been able to test any more than to confirm that the code still compiles. I have also fixed another bug report about the code no longer compiling with newer versions of net-snmp. Please let me know if this version works better for you.
Thanks a lot for the XML data! Unfortunately, I am not sure where to find any matching XML data in the file. Was your card used rather heavily when you produced the file? If so, maybe the field "instant_power_draw" is the one supposed to be used now instead. That field says 144.88 W for one of your cards and 156.89 W for the other card. However, those values are far higher than the 12 W in your first example output from nvidia-smi and 10 W in the example snmpwalk output. It seems as if average_power_draw...