From: John W. <jl...@sc...> - 2011-07-14 22:39:57

I am continuing with the port of open-vm-tools for OpenServer 6.0 (using the 2011.04.24-402641 release for the moment), configured with:

    --disable-unity

The vmtoolsd & guestInfo plugin were providing the FQDN, IP address, INFO_OS_NAME="openServer6" and NicInfo V2 or V3 to the vSphere 4.x hosts. I was unable to get a heartbeat, however, so I took a closer look at the rpcIn side of the RpcChannel and discovered what looks like a severe configuration and build problem.

lib/rpcIn/rpcin.c & rpcin.h have 2 forms of the functions:

    RpcIn_Construct()
    RpcIn_Start()

with a different number of parameters and different parameter types. The selection is controlled by the define VMTOOLS_USE_GLIB. The configure script adds this define to VMTOOLS_CPPFLAGS and subsequently to PLUGIN_CPPFLAGS.

lib/rpcChannel/bdoorChannel.c specifically uses the VMTOOLS_USE_GLIB form of the 2 functions listed above, and the Makefile.am in that directory specifies that VMTOOLS_CPPFLAGS are to be added to the library-specific CPPFLAGS.

lib/rpcIn contains no similar specification in its Makefile.am and, as a result, rpcin.c is compiled WITHOUT the VMTOOLS_USE_GLIB define and thus the wrong form of the functions is compiled.

Once the define was added to the Makefile, the heartbeat was seen by the ESX host and marked with status "green" in /var/log/vmware/hostd.log.

What I cannot fathom is that the build problem has existed in the open-vm-tools releases for a long, long time:

- exists in the current release, 2011.06.27-437995
- existed in the 2010.03.20-243334 release
- appears to have existed in the 2009.09.18-193784 release

So how has this worked in any build, on any platform??

- UNLESS all builds have been forcing the VMTOOLS_USE_GLIB define on CFLAGS or CPPFLAGS prior to running the configure command.
- OR is there another mechanism to establish a VM heartbeat that I have yet to activate?

Obviously I am somewhat baffled.

As to a source tree fix, I am assuming that structuring the lib/rpcIn/Makefile.am like lib/rpcChannel/Makefile.in would be the fix. I have not tried that since I currently have automake 1.10 on my system. For the time being I have added VMTOOLS_CPPFLAGS to the compile commands in the Makefile.in.

--
John Wolfe
UnXis, Inc.
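
To make the mechanism John describes concrete, here is a minimal C sketch of one function name with two parameter lists selected by a preprocessor define. It is not the real rpcin.h: `DEMO_USE_GLIB`, `Demo_Start`, `DemoCallback` and `Demo_CompiledForm` are hypothetical stand-ins, and the parameter lists are invented for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Which form of the API this translation unit was compiled with. */
enum DemoForm { DEMO_FORM_PLAIN = 0, DEMO_FORM_GLIB = 1 };

/* DEMO_USE_GLIB is a hypothetical stand-in for VMTOOLS_USE_GLIB. */
#ifdef DEMO_USE_GLIB

/* "glib" form: a context pointer plus a callback and its data. */
typedef int (*DemoCallback)(void *clientData);

int Demo_Start(void *mainCtx, DemoCallback cb, void *clientData)
{
   (void)mainCtx;            /* unused in this sketch */
   assert(cb != NULL);
   return cb(clientData);
}
#define DEMO_FORM DEMO_FORM_GLIB

#else

/* "plain" form: same function name, different parameter list. */
int Demo_Start(unsigned int delay)
{
   return (int)delay;
}
#define DEMO_FORM DEMO_FORM_PLAIN

#endif

/* Reports which form the compiler actually saw. */
enum DemoForm Demo_CompiledForm(void)
{
   return DEMO_FORM;
}
```

Built as shown, the file yields the plain form; adding `-DDEMO_USE_GLIB` to the compile command yields the other. A caller compiled with the define, linked against an object compiled without it, would pass arguments the callee does not expect, which is the kind of mismatch suspected above.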
From: Marcelo V. <mv...@vm...> - 2011-07-14 23:03:45

Hi John,

On 07/14/2011 03:36 PM, John Wolfe wrote:
> lib/rpcIn/rpcin.c & rpcin.h have 2 forms of the functions:
>
> RpcIn_Construct()
> RpcIn_Start()
>
> with a different number of parameters and parameter type. The
> selection is controlled by the define VMTOOLS_USE_GLIB.

> lib/rpcIn contains no similar specification in the Makefile.am
> and as a result, rpcin.c is compiled WITHOUT the VMTOOLS_USE_GLIB
> define and thus the wrong form of the functions are compiled.

rpcin.c is currently compiled twice in open-vm-tools: once when compiling lib/rpcIn, the second time when compiling libvmtools (it's listed in the source files list in libvmtools/Makefile.am). The second time, VMTOOLS_USE_GLIB is defined.

Since there's no more code that uses the non-glib version in open-vm-tools, I have already changed this in our internal source tree so it's only done once. But even in the current released sources for open-vm-tools, the makefiles should be doing the right thing.

As for why this seems to influence heartbeats in your case, I don't know. The heartbeat is not an explicit message, it's recorded implicitly by just polling the backdoor, which is what the rpcIn library does. So if vmtoolsd is running and running the rpcIn loop, the heartbeat should be updated.

--
- Marcelo
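
As a toy model of the "implicit heartbeat" Marcelo describes, the C sketch below has the host side note the time of each poll as a side effect, and reports green only while polls keep arriving. Everything here is an assumption made for illustration: `DemoHost`, `Demo_PollBackdoor`, `Demo_HeartbeatGreen`, the tick units and the timeout are hypothetical, and this is not how the backdoor or hostd is actually implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical host-side bookkeeping for one VM. */
typedef struct {
   long lastPollTick;   /* tick of the most recent poll seen */
   bool everPolled;     /* false until the first poll arrives */
} DemoHost;

/*
 * The guest polls for pending RPCs. No heartbeat message is sent;
 * the host simply remembers when the poll happened.
 * Returns false because this sketch never has an RPC pending.
 */
bool Demo_PollBackdoor(DemoHost *host, long nowTick)
{
   assert(host != NULL);
   host->lastPollTick = nowTick;
   host->everPolled = true;
   return false;
}

/* Green while the last poll is recent enough (timeout is an assumption). */
bool Demo_HeartbeatGreen(const DemoHost *host, long nowTick, long timeoutTicks)
{
   assert(host != NULL);
   return host->everPolled &&
          (nowTick - host->lastPollTick) <= timeoutTicks;
}
```

In this model, killing the polling process is enough to turn the status red once the timeout elapses, which matches the green-then-red sequence reported later in the thread.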
From: John W. <jl...@sc...> - 2011-07-18 17:31:42

Marcelo,

Ahhh, I missed the rpcin.c recompilation in the libvmtools Makefile. Thanks for setting me straight.

Given that libvmtools should be and is built correctly, I reviewed the /var/log/vmware/hostd.log file containing the history of my initial testing. The log does show the heartbeat with status "green" and then "red" following termination of vmtoolsd.

In the past, with our OpenServer 5.0.7 VMware VM (using the Mar 2010 open-vm-tools), the ESX host HA reset would occur, apparently on the loss of a heartbeat, when:

- the kernel panicked
- the kernel debugger was started, suspending all processes
- the vmtoolsd daemon was killed

For my OpenServer 6 vmtools port, I was testing for the heartbeat or, more specifically, the loss of a heartbeat by killing vmtoolsd, expecting HA to reset the VM. The reset was not occurring and I assumed that the heartbeat had never been seen.

On further testing with OSR 6, the HA reset is happening after the heartbeat changes status from green to red (killing vmtoolsd) and the system comes to a more quiescent state, such as "init 0" but with no power-down.

That suggests that there are criteria beyond "loss of heartbeat" necessary to trigger an HA reset/reboot. Can you shed any light on this or point me to a VMware KB article, whitepaper or document that would provide further insight?

Thanks,

--
John Wolfe
UnXis, Inc.
From: Marcelo V. <mv...@vm...> - 2011-07-18 21:19:32

Hi John,

Unfortunately I'm not very familiar with our HA feature. I've forwarded your e-mail internally to the appropriate team, let's see if they have any information to share.

On 07/18/2011 10:30 AM, John Wolfe wrote:
> That suggests that there is some criteria beyond "loss of heartbeat"
> necessary to trigger an HA reset/reboot. Can you shed any information
> on this or point me to a VMware KB article, whitepaper or document
> that would provide further insight?

--
- Marcelo
From: Marcelo V. <mv...@vm...> - 2011-07-18 21:30:21

Hi John,

The word I got is that if the VM is doing any I/O (disk or network), then HA will not reset it. So just the Tools heartbeat going red is not enough for HA to kick in.

Hope this helps,

On 07/18/2011 10:30 AM, John Wolfe wrote:
> That suggests that there is some criteria beyond "loss of heartbeat"
> necessary to trigger an HA reset/reboot. Can you shed any information
> on this or point me to a VMware KB article, whitepaper or document
> that would provide further insight?

--
- Marcelo
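
The rule Marcelo relays can be written as a small predicate. The C sketch below encodes only what is stated above: a red heartbeat alone is not enough, and any disk or network I/O blocks the reset. The function name `Demo_ShouldReset`, the idea of counting I/O operations over some observation window, and the window itself are assumptions for illustration, not documented HA behavior.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical decision: reset the VM only when the Tools heartbeat
 * is red AND no disk or network I/O was observed in the (assumed)
 * observation window.
 */
bool Demo_ShouldReset(bool heartbeatGreen,
                      unsigned long diskIoOps,
                      unsigned long netIoOps)
{
   if (heartbeatGreen) {
      return false;          /* heartbeat present: never reset */
   }
   /* Heartbeat is red: still hold off while the VM does any I/O. */
   return diskIoOps == 0 && netIoOps == 0;
}
```

Under this reading, killing vmtoolsd on a busy system would leave the VM running, while the same system after "init 0" with no power-down, and therefore no I/O, would be reset, consistent with what John observed on OSR 6.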