#255 Broker does not reconnect to restarted VM node

v.1.2.7
open
None
fc-brokerd
None
2013-06-24
2013-04-03
No

After a restart of the secondary master node, fc-brokerd was unable to start dynamic VMs on that node.
The logs showed that fc-brokerd was unable to get the list of available/running VMs on that node.
A restart of fc-brokerd solved the problem.

Discussion

  • Christian Wittkowski

    Please attach a log file (preferably start the daemon with loglevel 7)

     
  • Christian Wittkowski

    Can it lead to problems if virConnectIsAlive() is called repeatedly to check if the connection still exists?

     
  • Tiziano Müller

    Tiziano Müller - 2013-04-05

    virConnectIsAlive() seems to be a simple ping function, so you can most probably use it at will.

    But can't you simply invalidate the connection handle in the "catch(VirtException &e)..." block to force a reconnect on the next run?
    Getting the number of domains running is a pretty basic command. Failing in that context most probably means that you lost the connection to the remote libvirt server and the easiest way to recover from that is to reconnect.

    You can then improve on that by following http://libvirt.org/guide/pdf/Application_Development_Guide.pdf and doing proper error handling in your catch block: virGetLastError() returns a pointer to the current virError struct, whose "code" member is a virErrorNumber that can be deciphered according to http://libvirt.org/html/libvirt-virterror.html
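
    A minimal sketch of the "invalidate the handle and reconnect on the next run" idea, not the actual fc-brokerd code. The class ReconnectingNode, its poll() method and the stand-in VirtException type are hypothetical names made up for illustration. The open/close/count callbacks are placeholders where the broker would presumably call virConnectOpen(), virConnectClose() and virConnectNumOfDomains().

```cpp
#include <functional>
#include <optional>
#include <stdexcept>
#include <utility>

// Hypothetical stand-in for the broker's exception type named in this thread.
struct VirtException : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical wrapper around one node connection. After any failure the
// handle is dropped, so the next polling run opens a fresh connection.
template <typename Handle>
class ReconnectingNode {
public:
    ReconnectingNode(std::function<Handle()> open,
                     std::function<void(Handle&)> close,
                     std::function<int(Handle&)> countDomains)
        : open_(std::move(open)),
          close_(std::move(close)),
          count_(std::move(countDomains)) {}

    // One polling run. Returns the number of active domains, or std::nullopt
    // if the node could not be queried in this run.
    std::optional<int> poll() {
        try {
            if (!handle_) {
                handle_ = open_();      // assumed: wraps virConnectOpen(uri)
            }
            return count_(*handle_);    // assumed: wraps virConnectNumOfDomains(conn)
        } catch (const VirtException&) {
            invalidate();               // force a reconnect on the next run
            return std::nullopt;
        }
    }

    bool connected() const { return handle_.has_value(); }

private:
    // Close and forget the current handle, if there is one.
    void invalidate() {
        if (handle_) {
            close_(*handle_);           // assumed: wraps virConnectClose(conn)
            handle_.reset();
        }
    }

    std::function<Handle()> open_;
    std::function<void(Handle&)> close_;
    std::function<int(Handle&)> count_;
    std::optional<Handle> handle_;
};
```

    With something like this, the "try it in next loop" path would start from a fresh connection instead of reusing the dead one. A failed open leaves no handle behind, so it is simply retried on the following run.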

     
  • Tiziano Müller

    Tiziano Müller - 2013-06-24

    Problem is still present in fc-brokerd 1.2.11:

    Jun 24 14:42:07 vmsrv01 fc-brokerd[15953]: INFO -------------- caught VirtException ---------
    Jun 24 14:42:07 vmsrv01 fc-brokerd[15953]: INFO Error Failed to get number of active domains from qemu+tcp://10.1.130.14/system
    Jun 24 14:42:07 vmsrv01 fc-brokerd[15953]: INFO ------------ try it in next loop ------------
    
     
