From: Pascal B. <pas...@fr...> - 2012-07-18 17:55:29
Vlad,

I kept digging into my issue today and had the chance to look at the destination node's /var/log/messages file. For clarity during possible debugging sessions on my cluster nodes, I have set up an rsyslog config that redirects the various cluster logs from /var/log/messages into dedicated files: drbd.log, scst.log, ra.log, etc. The files I sent yesterday were extracted from scst.log, for instance. Normally, the only entries I used to see in /var/log/messages were minor ntpd ones. But today, when I happened to look into it, I discovered very weird things that obviously look directly linked to the symptom we're facing here. It means that what I sent you last night is incomplete, and probably hardly usable to you as is.

I've extracted the logs covering the same time frame as my 2 log files from yesterday. Please have a look and let me know what you think; I've never seen this before. I swear it's not the output of a random character generator, nor an extract from The Matrix. A hardware issue in the end? If not, that would be a very ugly software one!

The best thing, to my mind, would be to temporarily redirect the SCST logs back to /var/log/messages and retry a failing failover so that I can provide you with a really complete sequence of events. Unfortunately, the cluster is in production at the moment, and my only test volume is currently used to host files that will allow me to empty other volumes and transition them from block device files to regular XFS files. I may be able to do that tomorrow.

Thanks for your patience and your help anyhow!

Best regards,
Pascal.

-----Original Message-----
From: Vladislav Bolkhovitin [mailto:vs...@vl...]
Sent: Tuesday, July 17, 2012 23:08
To: Pascal BERTON
Cc: scs...@li...
Subject: Re: [Scst-devel] SCST backend device activation problems : scst_translate_lun:FLAG SUSPENDED set, skipping

Hi,

It seems your config has some weird circular dependency, like: the backend devices don't start working, never completing received requests, until the SCST config is done, and the SCST config can't finish because it is waiting for the devices to complete. Logs from the beginning of the failover can shed some light on this.

Vlad

Pascal BERTON, on 07/14/2012 08:26 AM wrote:
> Hi all!
>
> I'm currently facing weird problems with SCST, and after days of various experiments and observations, trying to isolate the problem as precisely as possible, I conclude that I now need a hand. Could somebody help me a bit with that?
>
> Basically, we're running a 2-node single-primary DRBD/Pacemaker cluster (kernel version 2.6.32-71.7.1., based on the Openfiler 2.99 distro) hosting 4 DRBD resources, each presented to 4 VMware hosts (ESXi 4.1) using two SCST (vdisk_fileio) and ISCSI-SCST targets (version 2.0.0.1 at first, now 2.2.0, but the problem persists) per resource. Resources are spread over the 2 nodes, 3 active TB per node overall. The DRBD replication link is a dual 10GbE link bonded in LACP (mode 4). Volumes are hardware RAID5 made up of 9 15krpm 146 or 300GB SAS drives (I mean, disk IO performance doesn't seem to be the cause).
>
> Basically, the issue is: the cluster starts resources, the 4 DRBD primaries go up, then the 4 pairs of virtual IPs, then the SCST services, and things run fine. Until you try to migrate resources back and forth. When you do that, it works once, twice, sometimes even 3 times, but then you can see DRBD promoted correctly, then the IPs wake up, but the SCST resource remains stuck down, running into a timeout after the configured 60s. At that moment, everything fails back to its former place, as it should. If you try again, same story.
> In the end, you obtain a cluster, but the resource is stuck on a node, unable to fail over either manually or, more embarrassingly, following a node crash (which we inevitably faced recently, thanks Mr Murphy).
>
> After digging through the various logs, what I see is:
>
> - DRBD does its job 100% correctly
>
> - Pacemaker seems to do its job, with the resources it has, in the state they are in. (I mean, the errors it mentions look like normal errors in the global failing context)
>
> - SCST starts its job, but hangs in the device handling section (BTW, my RA agent uses the sysfs interface and is based on Patrick Zwahlen's implementation, which I customized a bit, mostly to add friendlier tracing, and also to invert the order of activation: iSCSI target first, then the backend device, instead of the reverse, although I now doubt it has a real impact). Basically, all the iSCSI target setup stuff runs fine, but then:
>
>   o Either it hangs on backend device creation
>
>   o Or it hangs on LUN 0 assignment
>
> From that point on, it hangs until the configured start timeout, and then everybody goes back home. However, the backend device that refused to get created correctly has been created and remains, possibly along with the target directory and sometimes even the LUN0 directory in it too. From that point on, it turns into a good mess; in fact, problems start here. After that, any migration attempt is doomed to failure! If I reboot the node, it will accept a couple of migrations again, and then fail again in the same manner.
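
For readers following the thread, the activation order described in the quoted message (iSCSI target first, then the backend device, then LUN 0) could be sketched roughly as below. This is only an illustration, not the actual resource agent: the function names are hypothetical, and the sysfs root, paths and mgmt commands are assumptions based on the SCST 2.x sysfs layout that should be checked against the SCST README for the version in use. Logging before each write is meant to show which step a start attempt hangs on.

```python
# Hypothetical sketch of the start sequence described in the thread; this is
# NOT the resource agent discussed above. The sysfs paths and mgmt commands
# are assumptions to verify against the SCST documentation for your version.
from typing import Callable, List, Tuple

SCST_ROOT = "/sys/kernel/scst_tgt"  # assumed SCST sysfs root


def build_start_steps(iqn: str, device: str, filename: str,
                      lun: int = 0) -> List[Tuple[str, str]]:
    """Return (sysfs_file, command) pairs: target first, then device, then LUN."""
    target_dir = f"{SCST_ROOT}/targets/iscsi/{iqn}"
    return [
        # 1. Create the iSCSI target.
        (f"{SCST_ROOT}/targets/iscsi/mgmt", f"add_target {iqn}"),
        # 2. Create the vdisk_fileio backend device.
        (f"{SCST_ROOT}/handlers/vdisk_fileio/mgmt",
         f"add_device {device} filename={filename}"),
        # 3. Assign the device to the target as the given LUN.
        (f"{target_dir}/luns/mgmt", f"add {device} {lun}"),
    ]


def write_sysfs(path: str, command: str) -> None:
    """Write one command to a sysfs attribute (needs root and a loaded SCST)."""
    with open(path, "w") as attr:
        attr.write(command)


def start(steps: List[Tuple[str, str]],
          writer: Callable[[str, str], None] = write_sysfs,
          log: Callable[[str], None] = print) -> None:
    """Apply the steps in order, logging before each write."""
    for path, command in steps:
        log(f"writing {command!r} to {path}")
        writer(path, command)
```

Passing a recording function as `writer` gives a dry run that lists the planned writes without touching sysfs, which is the only mode that is safe to try on a production node.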