From: Bart V. A. <bva...@ac...> - 2011-05-25 16:51:13
On Wed, May 25, 2011 at 3:38 PM, piotr nowat <now...@ho...> wrote:

> I have been testing the DRBD + SCST + Other targets problem further and see
> results which seem to be isolated to SCST.
>
> Attached is a diagram which shows the basic setup and the targets used. In
> summary
>
> SCST in blockio mode does not seem to be compatible with DRBD (I don’t know
> who’s responsibly this is) but other targets (IET / LIO) seem to have no
> problem.
>
> It is not related to the concurrent threads which have been mentioned,
> there are NO drbd errors and I have even restricted traffic to single 512
> byte blocks. all relevant settings the same (queue depths, threads etc)
>
> The fileio modes do work – but will simply cause data corruption rendering
> them pointless in active-active
>

One has to be VERY careful when using vdisk_fileio in combination with multi-path I/O and DRBD in active-active mode. It is very easy to cause data corruption with such a setup. One has to make sure that SCST only confirms a write as finished to the multipath initiator after the data involved in the write operation has been passed on to DRBD on both nodes. Hence, DRBD protocol A would be a bad choice in this case; protocol C, as you have chosen, should be fine though. You will have to set both *write_through=1* and *nv_cache=0* in scst.conf (this combination of flags enables O_SYNC; I haven't yet tested this combination myself though). I'm not sure whether setting *o_direct=1* in scst.conf in addition to the already mentioned flags would help performance in this case.

> The problem is simply that DRBD will not sync SCST Blockio traffic without
> a disconnect/reconnect. Even then it will only sync the then outdated data.
> Any new writes which happen during this sync process will not be sync. So
> after the trigger sync has finished you will still be left with X outdated
> and have to disconnect / reconnect again.
>
> Tests on local level using dd and fio (with even the most aggressive tests
> / async) work fine.
>
> Tests on IET / LIO with high or low queue depths work fine
>
> The default system is SLES11 with DRBD version 8.3.10 (tried others).
>
> I tried SUSE 11.4 but although others seemed to work SCST CORE DUMPED the
> moment it wrote to SCST (reported)
>
> I initial presumed that this could be some configuration, but this would
> seem less likely now maybe?
>

I haven't been able to reproduce this behavior with kernel 2.6.38.7 and built-in DRBD version 8.3.9. I'll see whether I can reproduce this with one of the kernels you mentioned above.

Bart.
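
For reference, the O_SYNC behaviour that the write_through=1 / nv_cache=0 combination is said above to enable can be sketched from userspace. This is only an illustration of what O_SYNC means for a single write, not SCST's own code path; the function name `write_sync` and the file name `backing.img` are made up for the example, and a real setup would point at the DRBD device rather than a temporary file.

```python
import os
import tempfile


def write_sync(path: str, data: bytes, offset: int = 0) -> int:
    """Write `data` at `offset` and return the number of bytes written.

    The file is opened with O_SYNC, so the write call returns only after
    the kernel reports the data as written to the underlying storage,
    rather than merely copied into the page cache.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_SYNC, 0o600)
    try:
        return os.pwrite(fd, data, offset)
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Hypothetical stand-in for the backing device: a temporary file.
    with tempfile.TemporaryDirectory() as tmpdir:
        backing = os.path.join(tmpdir, "backing.img")
        written = write_sync(backing, b"\0" * 512)  # one 512-byte block
        print("bytes written:", written)
```

The point of the sketch is the ordering: the caller learns that the write finished only after it has gone past the local cache, which is the property needed before acknowledging the write to a multipath initiator.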