I am confused as to why my write performance falls off a cliff when I
change nothing but the "maximum disk size" value.
I'm running iometer 2006.07.27, with the Windows front end (obviously)
talking to a Dynamo on openSUSE 10.3. I've got a couple of big XFS
file systems. The one I'm testing is a 3TB file system backed by a
RAID0 array on an Areca ARC-1280 RAID controller card. The array is
made up of six 500GB disks with a 64KB stripe size. The RAID card's
1GB cache is enabled in write-back mode for this test. The server is a
multi-core box with 32GB of memory, most of which is unused during
this test.
# df -hT
Filesystem  Type  Size  Used  Avail  Use%  Mounted on
/dev/sdb    xfs   2.8T  4.1G   2.8T    1%  /data/b
I could go through all the settings I have, but I doubt most of them
are relevant to the question. Basically, I am testing ONLY sequential
writes: a single access specification is assigned, performing 128KB
sequential writes.
The test setup was configured to run normal cycling for 30 seconds.
I kept the starting disk sector at 0, ran 16 outstanding I/Os per
target, and changed only the maximum disk size. As I said, I'm running
this test on XFS, so iometer uses its iobw.tst file, which was created
at a size of 4GB:
# ls -lh /data/b
total 4.0G
-rw------- 1 root root 4.0G 2007-12-14 11:37 iobw.tst
I ran `iostat -m -t -x 10 sdb` during each 30-second test and took a
representative snapshot from the middle of the run. Across tests I
changed ONLY the maximum disk size (MDS). Concentrate on the column
displaying wMB/s (write MB per second); this number agrees with the
one iometer displays. The results are at the end of this message.
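For what it's worth, each capture looked roughly like this (the log
file name is just illustrative):

# iostat -m -t -x 10 sdb | tee iostat_mds_100000.log

and I kept the 10-second interval from the middle of the 30-second run.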
Here are a few questions:
1) Notice the huge performance drop between an MDS of 10,000 and an
MDS of 100,000? What could cause this? I don't understand how iometer
uses the MDS value in a way that would produce this behavior. For
example, in the MDS = 100 case, does iometer simply write the first
409,600 B (100 * the 4KB file system block size) of the iobw.tst file
repeatedly, in 128KB chunks? Likewise, would an MDS of 10,000 cover
~40MB and an MDS of 100,000 cover ~400MB? I can't figure out why a
400MB sequential write should proceed at 1/5 the rate of a 40MB
sequential write. (See the arithmetic sketch after these questions.)
2) Are there utilities that will let me see the I/O locations (LBAs)
being read from or written to on the device? Something that would give
me an idea of the pattern of writes in this case? (See the blktrace
sketch below.)
3) I notice that the average request size (avgrq-sz) corresponds to
the request size I selected: 128KB, so the value of 256 sectors
(256 * 512 = 128KB) is correct. However, the average queue depth
(avgqu-sz) does not correspond to the number of outstanding I/Os I
selected (16). Any idea why? (See the sanity check below.)
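To make the arithmetic in question 1 concrete, assuming MDS counts 4KB
file system blocks (which is exactly the part I'm unsure of):

# echo $((100 * 4096)) $((10000 * 4096)) $((100000 * 4096))
409600 40960000 409600000

i.e. ~400KB, ~40MB, and ~400MB respectively. If iometer instead counts
MDS in 512-byte sectors, divide those numbers by 8; the jump from
10,000 to 100,000 is a factor of 10 either way.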
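On question 2, would something like blktrace do it? A minimal sketch
of what I'm imagining, assuming blktrace and blkparse are installed
and the kernel supports block tracing:

# blktrace -d /dev/sdb -o - | blkparse -i -

Each event line from blkparse includes the starting sector and length
(e.g. "W 123456 + 256"), which looks like the kind of write-pattern
view I'm after.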
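On question 3, a sanity check on the numbers: multiplying w/s by await
(converted to seconds) reproduces avgqu-sz almost exactly, e.g. for
the MDS = 100 run:

# echo "3935.46 * 0.19 / 1000" | bc -l
.74773740000000000000

which matches the 0.73 iostat reported. So iostat appears internally
consistent; the device simply never sees anything close to 16 requests
queued.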
Thanks for any tips,
Brett
The iostat results follow (iostat uses long lines, so I also attached
a file with all of this data to preserve the formatting):
MDS = 100:
avg-cpu: %user %nice %system %iowait %steal %idle
0.26 0.00 2.35 4.11 0.00 93.27
Device:  rrqm/s  wrqm/s   r/s      w/s  rMB/s   wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sdb        0.00    0.00  0.00  3935.46   0.00  491.93    256.00      0.73   0.19   0.18  72.69
MDS = 1,000:
avg-cpu: %user %nice %system %iowait %steal %idle
0.22 0.00 2.22 4.23 0.00 93.32
Device:  rrqm/s  wrqm/s   r/s      w/s  rMB/s   wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sdb        0.00    0.00  0.00  3991.91   0.00  498.99    256.00      0.73   0.18   0.18  72.73
MDS = 10,000:
avg-cpu: %user %nice %system %iowait %steal %idle
0.27 0.00 2.27 4.17 0.00 93.28
Device:  rrqm/s  wrqm/s   r/s      w/s  rMB/s   wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sdb        0.00    0.00  0.00  3958.40   0.00  494.80    256.00      0.72   0.18   0.18  72.40
MDS = 100,000:
avg-cpu: %user %nice %system %iowait %steal %idle
0.06 0.00 0.59 5.75 0.00 93.61
Device:  rrqm/s  wrqm/s   r/s      w/s  rMB/s   wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sdb        0.00    0.00  0.00   841.96   0.00  105.24    256.00      0.95   1.09   1.12  94.55
MDS = 1,000,000:
avg-cpu: %user %nice %system %iowait %steal %idle
0.07 0.00 0.61 5.78 0.00 93.54
Device:  rrqm/s  wrqm/s   r/s      w/s  rMB/s   wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sdb        0.00    0.00  0.00   852.50   0.00  106.56    256.00      0.95   1.11   1.11  94.88