I've observed severe performance degradation when writing large files: jfsCommit reaches nearly 100% CPU and the write rate slows to a crawl. I've done a lot of digging, and I think I know the cause. In summary, I believe txUpdateMap is spending a *lot* of extra time rewriting the persistent block maps when the last xad in the file's xtree is extended. xtLog logs the *entire last xad* for update, and when appending to a large file on a sparsely allocated volume this xad can cover a lot of blocks. As a result, each commit re-marks millions of blocks that are already allocated. Because the cost of each commit is proportional to the current length of the xad, the total number of updates grows quadratically with the length of the xad, causing a huge performance loss.
The reason I investigated this was that a 320 GB file write slowed to about 7-10 MB/s, when the underlying RAID volume can usually sustain 400 MB/s. jfsCommit CPU usage was 99.4% at the same time. For the record I'm running Linux 18.104.22.168-83.fc14.x86_64.
I confirmed the file was not fragmented by dumping out the xtree for the inode in question. (I used some hacky Python code that I wrote for exploring JFS volumes, and I used sync and drop_caches to ensure the metadata was flushed out to disk.)
Here's the xtree (4 KiB allocation size). Each tuple is (offset, address, length), all in blocks:
(0, 336166839, 1)
(1, 992337163, 2)
(3, 992398076, 1)
(4, 992471283, 22)
(26, 992771329, 13861631)
(13861657, 1026074340, 14113052)
(27974709, 1059503188, 14238636)
(42213345, 1093863162, 2)
(42213347, 1094345396, 175)
(42213522, 1094345577, 91)
(42213613, 1094588258, 12707998)
(54921611, 1115498095, 1)
(54921612, 1115498097, 2)
(54921614, 1115787154, 1)
(54921615, 1116292353, 102)
(54921717, 1116650670, 39713)
(54961430, 1116695739, 16777215)
(71738645, 1133472954, 6404161)
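For anyone who wants to map these extents back to file sizes, here is a minimal sketch that converts the tuples above to decimal GB. The extent values and the 4 KiB block size are copied from the dump; the helper name `summarize` is just made up for this illustration. It only checks that the logical file offsets have no holes. It says nothing about how JFS lays things out internally.

```python
BLOCK_SIZE = 4096  # 4 KiB allocation size, as stated above

# (offset, address, length) in blocks, copied from the xtree dump
XADS = [
    (0, 336166839, 1),
    (1, 992337163, 2),
    (3, 992398076, 1),
    (4, 992471283, 22),
    (26, 992771329, 13861631),
    (13861657, 1026074340, 14113052),
    (27974709, 1059503188, 14238636),
    (42213345, 1093863162, 2),
    (42213347, 1094345396, 175),
    (42213522, 1094345577, 91),
    (42213613, 1094588258, 12707998),
    (54921611, 1115498095, 1),
    (54921612, 1115498097, 2),
    (54921614, 1115787154, 1),
    (54921615, 1116292353, 102),
    (54921717, 1116650670, 39713),
    (54961430, 1116695739, 16777215),
    (71738645, 1133472954, 6404161),
]


def summarize(xads, block_size=BLOCK_SIZE):
    """Return (rows, contiguous, total_blocks).

    rows holds (start_gb, length_gb) per extent, in decimal GB.
    contiguous is True when every extent starts at the logical offset
    where the previous one ended (no holes in the file).
    """
    rows = []
    contiguous = True
    next_offset = 0
    for offset, _address, length in xads:
        if offset != next_offset:
            contiguous = False
        rows.append((offset * block_size / 1e9, length * block_size / 1e9))
        next_offset = offset + length
    return rows, contiguous, next_offset


rows, contiguous, total_blocks = summarize(XADS)
for start_gb, length_gb in rows:
    print(f"starts at {start_gb:8.2f} GB, covers {length_gb:7.2f} GB")
print("logical offsets contiguous:", contiguous)
print("total size in GB:", total_blocks * BLOCK_SIZE / 1e9)
```

With these numbers the second-to-last xad starts at roughly 225 GB into the file and the last one at roughly 294 GB, and the total comes out at about 320 GB.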
The performance issue occurred while writing the last 50 GB or so, which is consistent with the last two xads. When the file was around 266 GB long, the last xad at that point (at offset 54961430) would have covered around 10M blocks. Every commit would have updated around 10M blocks, until the xad overflowed at 16777215 blocks (2^24 - 1, which looks like the maximum xad length) and the new xad at offset 71738645 was created. And sure enough, when the file got to be around 293 GB the performance went up again, then slowly fell again.
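To put rough numbers on this, here is a toy model of the behaviour I'm hypothesizing. It is not taken from the JFS source. It assumes every commit appends a fixed number of blocks and then re-marks the whole last xad. The function names and the `blocks_per_commit` value are made up for illustration; I don't know how many blocks a real commit covers.

```python
def blocks_rewritten(final_len, blocks_per_commit):
    """Total block-map updates needed to grow one xad to final_len blocks,
    assuming each commit re-marks every block the xad covers so far."""
    total = 0
    length = 0
    while length < final_len:
        # one commit: the xad grows, then the whole xad is updated
        length = min(length + blocks_per_commit, final_len)
        total += length
    return total


def amplification(final_len, blocks_per_commit):
    """Updates performed per block actually appended (1.0 would be ideal)."""
    return blocks_rewritten(final_len, blocks_per_commit) / final_len


# Hypothetical example: a 10M-block xad grown 256 blocks (1 MiB) per commit.
print(amplification(10_000_000, 256))
```

Under this model, doubling the final xad length roughly quadruples the total work, so the total grows with the square of the xad length, and the cost resets whenever a new xad is started.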
This should be reproducible by writing a large file on a large empty volume. As each xad fills, the number of blocks updated per commit will increase and the total number of updates will grow quadratically with the xad length, causing the write to slow down. Once the xad overflows and a new one is started, performance should go up again.
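A minimal sketch of a test writer for reproducing this, assuming that plain sequential appends are enough to trigger it. The function name, chunk size and reporting interval are arbitrary choices of mine. The fsync at the end of each window is only there so the throughput figure reflects the disk and not the page cache; it may change commit timing compared with my original workload.

```python
import os
import time


def timed_append(path, total_bytes, chunk_bytes=1 << 20, report_every=64):
    """Write total_bytes of zeros to path sequentially.

    Returns a list of (bytes_written_so_far, mb_per_second) samples,
    one per window of report_every chunks (plus a final partial window).
    """
    chunk = b"\0" * chunk_bytes
    samples = []
    written = 0
    window_bytes = 0
    window_start = time.monotonic()
    with open(path, "wb") as f:
        while written < total_bytes:
            # the last chunk may be shorter than chunk_bytes
            n = f.write(chunk[: total_bytes - written])
            written += n
            window_bytes += n
            if window_bytes >= report_every * chunk_bytes or written >= total_bytes:
                f.flush()
                os.fsync(f.fileno())
                elapsed = max(time.monotonic() - window_start, 1e-9)
                samples.append((written, window_bytes / 1e6 / elapsed))
                window_bytes = 0
                window_start = time.monotonic()
    return samples
```

Pointing this at a file on a large, empty JFS volume with a total size of a few hundred GB and plotting the samples should show the rate falling as each xad grows and recovering when a new xad starts, if my theory is right. Watching jfsCommit in top at the same time would help confirm it.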