Hello,

Corrupt JFS root nodes continue to plague my Debian 5 based servers running the 2.6.26 and 2.6.32 kernels. This is a big problem for us, so I spent last week learning enough about JFS to write a test script that reproduces the problem. I'm hoping this is enough for the folks on this list to fix the bug :-) I've emailed this list about this problem before. For the old thread, see:

    Corrupt JFS root nodes on volumes with 500+ top level directories

My test platform is a fully patched, clean install of Debian 5.0.8 on a VMware host. To reproduce the problem:

1) Format a volume (/dev/sdb) with JFS (the volume size does not matter)
2) In a loop, create one permanent file, then create and delete a temporary file (tmp) 10 times
3) Exit the loop when jfs_fsck reports an error

Here's a Bash script that does the above and corrupts a JFS file system in just a few minutes:

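#!/bin/bash
# WARNING: this reformats $device and destroys all data on it.
file_count=0
mount_dir="/mnt"
device="/dev/sdb"

jfs_mkfs -f "$device"

# Keep going for as long as a read-only fsck finds the file system clean.
while jfs_fsck -n "$device" > /dev/null 2>&1; do
    mount "$device" "$mount_dir"

    # One permanent file per iteration, named by a zero-padded counter.
    touch "$mount_dir/$(printf '%08d' "$file_count")"
    let file_count+=1

    # Create and delete the same temporary file 10 times.
    for (( i=0; i<10; i+=1 )); do
        touch "$mount_dir/tmp"
        rm "$mount_dir/tmp"
    done

    umount "$device"
done

echo -e "\nfile_count=$file_count\n"
jfs_fsck -n -v "$device"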

Every time I run this script, it fails after creating exactly 984 files. Here's its output:

jfs_mkfs version 1.1.14, 06-Apr-2009

Format completed successfully.

20971520 kilobytes total disk space.

file_count=984

sb_ptr = 0x677880   agg_recptr = 0x672120   bmap_recptr = 0x677280
jfs_fsck version 1.1.14, 06-Apr-2009
processing started: 2/6/2011 22.33.49
The current device is:  /dev/sdb
Open(...READONLY...) returned rc = 0
Primary superblock is valid.
The type of file system for the device is JFS.
Block size in bytes:  4096
Filesystem size in blocks:  5242880
**Phase 1 - Check Blocks, Files/Directories, and  Directory Entries
bad obj size: fs ino: 2(t)  maxsize = 32768(t)  di_size = 90112(t)
Invalid data format detected in root directory.
CANNOT CONTINUE.
ERRORS HAVE BEEN DETECTED.  Run fsck with the -f parameter to repair.
processing terminated:  2/6/2011 22:33:49  with return code: 10062  exit code: 4.

The debug error message above ("bad obj size") comes from fsck/fsckmeta.c:

                        /* the data size (in bytes) must not exceed the total size
                         * of the blocks allocated for it.
                         * Blocks allocated for directory index table
                         * make minimum size checking inconclusive
                         */
                        if (agg_recptr->this_inode.data_size == 0) {
                                max_size = IDATASIZE;
                        } else {
                                /* blocks are allocated to data */
                                max_size = agg_recptr->this_inode.data_size;
                        }
                        if (inoptr->di_size > max_size) {
                                /*
                                 * object size (in bytes) is wrong.
                                 * tree must be bad.
                                 */
#ifdef _JFS_DEBUG
                                printf("bad obj size: fs ino: %ld(t)  "
                                       "maxsize = %lld(t)  di_size = %lld(t)\n",
                                       inoidx, max_size, inoptr->di_size);
#endif
                                bad_size = -1;
                        }
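
To make the failing comparison concrete, here is a minimal sketch that replays the check with the two numbers from the "bad obj size" line above. It is only an illustration in Bash, not the fsck code; the variable names are borrowed from the C excerpt, and the block size is the one jfs_fsck reported.

```shell
#!/bin/bash
# Values copied from the jfs_fsck output above; nothing is read from disk.
di_size=90112      # di_size reported for inode 2
max_size=32768     # maxsize reported for inode 2
block_size=4096    # "Block size in bytes" from jfs_fsck

echo "di_size  = $di_size bytes ($(( di_size / block_size )) blocks)"
echo "max_size = $max_size bytes ($(( max_size / block_size )) blocks)"

# Same comparison as in the C excerpt: the recorded size must not exceed
# the size of the blocks counted for the object.
bad_size=0
if (( di_size > max_size )); then
    bad_size=-1
    echo "di_size exceeds max_size, so the root directory is flagged"
fi
```

With these values, the root directory claims 22 blocks' worth of data while only 8 blocks are accounted for, which is why the check trips.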

Here is some output from jfs_debugfs. I'm not an expert in JFS internals, so I just explored the records that I thought were most relevant. I'm happy to provide more detail if someone lets me know what commands to run.

jfs_debugfs /dev/sdb
jfs_debugfs version 1.1.14, 06-Apr-2009
Aggregate Block Size: 4096

> su p
[1] s_magic:      'JFS1'                      [15] s_ait2.addr1:      0x00
[2] s_version:    1                           [16] s_ait2.addr2:      0x00000295
[3] s_size:       0x00000000027d7968               s_ait2.address:    661
[4] s_bsize:      4096                        [17] s_logdev:          0x00000810
[5] s_l2bsize:    12                          [18] s_logserial:       0x000003d8
[6] s_l2bfactor:  3                           [19] s_logpxd.len:      20480
[7] s_pbsize:     512                         [20] s_logpxd.addr1:    0x00
[8] s_l2pbsize:   9                           [21] s_logpxd.addr2:    0x004fb000
[9] pad:          Not Displayed                    s_logpxd.address:  5222400
[10] s_agsize:    0x00010000                  [22] s_fsckpxd.len:     211
[11] s_flag:      0x10200900                  [23] s_fsckpxd.addr1:   0x00
                  JFS_LINUX                   [24] s_fsckpxd.addr2:   0x004faf2d
                  JFS_COMMIT JFS_GROUPCOMMIT       s_fsckpxd.address: 5222189
                  JFS_INLINELOG               [25] s_time.tv_sec:     0x4d4f91a5
                                              [26] s_time.tv_nsec:    0x00000000
                                              [27] s_fpack:           ''
[12] s_state:     0x00000000
                  FM_CLEAN
[13] s_compress:  0
[14] s_ait2.len:  4

display_super: [m]odify or e[x]it: x

> i 16 a
Inode 16 at block 13, offset 0x0:

[1] di_inostamp:        0x4d4f91a5          [19] di_mtime.tv_nsec: 0x00000000
[2] di_fileset:         1                   [20] di_otime.tv_sec:  0x4d4f91a5
[3] di_number:          16                  [21] di_otime.tv_nsec: 0x00000000
[4] di_gen:             1                   [22] di_acl.flag:      0x00
[5] di_ixpxd.len:       4                   [23] di_acl.rsrvd:     Not Displayed
[6] di_ixpxd.addr1:     0x00                [24] di_acl.size:      0x00000000
[7] di_ixpxd.addr2:     0x0000000b          [25] di_acl.len:       0
    di_ixpxd.address:   11                  [26] di_acl.addr1:     0x00
[8] di_size:            0x0000000000002000  [27] di_acl.addr2:     0x00000000
[9] di_nblocks:         0x0000000000000002       di_acl.address:   0
[10] di_nlink:          1                   [28] di_ea.flag:       0x00
[11] di_uid:            0                   [29] di_ea.rsrvd:      Not Displayed
[12] di_gid:            0                   [30] di_ea.size:       0x00000000
[13] di_mode:           0x00018000          [31] di_ea.len:        0
                        0100000       ----  [32] di_ea.addr1:      0x00
[14] di_atime.tv_sec:   0x4d4f91a5          [33] di_ea.addr2:      0x00000000
[15] di_atime.tv_nsec:  0x00000000               di_ea.address:    0
[16] di_ctime.tv_sec:   0x4d4f91a5          [34] di_next_index:    2
[17] di_ctime.tv_nsec:  0x00000000          [35] di_acltype:       0x00000000
[18] di_mtime.tv_sec:   0x4d4f91a5

> i 2
Inode 2 at block 665, offset 0x400:

[1] di_inostamp:        0x4d4f91a5          [19] di_mtime.tv_nsec: 0x24ab1db7
[2] di_fileset:         16                  [20] di_otime.tv_sec:  0x4d4f91a5
[3] di_number:          2                   [21] di_otime.tv_nsec: 0x00000000
[4] di_gen:             1                   [22] di_acl.flag:      0x00
[5] di_ixpxd.len:       4                   [23] di_acl.rsrvd:     Not Displayed
[6] di_ixpxd.addr1:     0x00                [24] di_acl.size:      0x00000000
[7] di_ixpxd.addr2:     0x00000299          [25] di_acl.len:       0
    di_ixpxd.address:   665                 [26] di_acl.addr1:     0x00
[8] di_size:            0x0000000000016000  [27] di_acl.addr2:     0x00000000
[9] di_nblocks:         0x0000000000000020       di_acl.address:   0
[10] di_nlink:          2                   [28] di_ea.flag:       0x00
[11] di_uid:            0                   [29] di_ea.rsrvd:      Not Displayed
[12] di_gid:            0                   [30] di_ea.size:       0x00000000
[13] di_mode:           0x000141ed          [31] di_ea.len:        0
                        0040755       drwx  [32] di_ea.addr1:      0x00
[14] di_atime.tv_sec:   0x4d4f91a5          [33] di_ea.addr2:      0x00000000
[15] di_atime.tv_nsec:  0x00000000               di_ea.address:    0
[16] di_ctime.tv_sec:   0x4d4f924d          [34] di_next_index:    10826
[17] di_ctime.tv_nsec:  0x24ab1db7          [35] di_acltype:       0x00000000
[18] di_mtime.tv_sec:   0x4d4f924d

> dir 2
idotdot = 2

4       00000000
5       00000001
6       00000002
7       00000003
8       00000004
9       00000005
10      00000006
11      00000007
.
.
.
29      00000025
30      00000026
31      00000027
1024    00000028
1025    00000029
1026    00000030
1027    00000031
.
.
.
1976    00000980
1977    00000981
1978    00000982
1979    00000983

> dt 2
Root D-Tree Node of inode 2

[1] DASDlimit   0
[2] DASDused    0
[3] thresh (%)  0
[4] delta (%)   0

[5] flag        0x85 BT_ROOT  BT_INTERNAL
[6] nextindex   1
[7] freecnt     7
[8] freelist    2
[9] idotdot     2
[10] stbl       {1,2,3,4,5,6,7,8}
dtree: Hit enter to see entries, [m]odify, or e[x]it:
stbl[0] = 1
[1] xd.len      0x000001        [4] next        -1
[2] xd.addr1    0x00            [5] namlen      0
[3] xd.addr2    0x00010020          xd.addr     65568
[6] name
addressPXD(xd)  65568
dtree: press enter for next or [u]p, [d]own or e[x]it > d
Internal D-tree node at block 65568
[1] flag        0x04 BT_INTERNAL
[2] nextindex   8
[3] freecnt     115             [7] rsrvd       NOT DISPLAYED
[4] freelist    13              [8] self.len    0x000001
[5] maxslot     128             [8] self.addr1  0x00
[6] stblindex   9               [9]             0x00010020
dtree: Hit enter to see entries, [u]p or e[x]it:
stbl[0] = 1
[1] xd.len      0x000001        [4] next        -1
[2] xd.addr1    0x00            [5] namlen      0
[3] xd.addr2    0x00010001          xd.addr     65537
[6] name
addressPXD(xd)  65537
dtree: press enter for next or [u]p, [d]own or e[x]it >
stbl[1] = 2
[1] xd.len      0x000001        [4] next        -1
[2] xd.addr1    0x00            [5] namlen      8
[3] xd.addr2    0x00010004          xd.addr     65540
[6] name        00000123
addressPXD(xd)  65540
dtree: press enter for next or [u]p, [d]own or e[x]it >
stbl[2] = 3
[1] xd.len      0x000001        [4] next        -1
[2] xd.addr1    0x00            [5] namlen      8
[3] xd.addr2    0x00010009          xd.addr     65545
[6] name        00000246
addressPXD(xd)  65545
dtree: press enter for next or [u]p, [d]own or e[x]it >
stbl[3] = 4
[1] xd.len      0x000001        [4] next        -1
[2] xd.addr1    0x00            [5] namlen      8
[3] xd.addr2    0x0001000c          xd.addr     65548
[6] name        00000369
addressPXD(xd)  65548
dtree: press enter for next or [u]p, [d]own or e[x]it >
stbl[4] = 5
[1] xd.len      0x000001        [4] next        -1
[2] xd.addr1    0x00            [5] namlen      8
[3] xd.addr2    0x00010010          xd.addr     65552
[6] name        00000492
addressPXD(xd)  65552
dtree: press enter for next or [u]p, [d]own or e[x]it >
stbl[5] = 6
[1] xd.len      0x000001        [4] next        -1
[2] xd.addr1    0x00            [5] namlen      8
[3] xd.addr2    0x00010014          xd.addr     65556
[6] name        00000615
addressPXD(xd)  65556
dtree: press enter for next or [u]p, [d]own or e[x]it >
stbl[6] = 7
[1] xd.len      0x000001        [4] next        -1
[2] xd.addr1    0x00            [5] namlen      8
[3] xd.addr2    0x00010017          xd.addr     65559
[6] name        00000738
addressPXD(xd)  65559
dtree: press enter for next or [u]p, [d]own or e[x]it >
stbl[7] = 8
[1] xd.len      0x000001        [4] next        -1
[2] xd.addr1    0x00            [5] namlen      8
[3] xd.addr2    0x0001001b          xd.addr     65563
[6] name        00000861
addressPXD(xd)  65563
dtree: press enter for next or [u]p, [d]own or e[x]it > d
Internal D-tree node at block 65563
[1] flag        0x02 BT_LEAF
[2] nextindex   123
[3] freecnt     0               [7] rsrvd       NOT DISPLAYED
[4] freelist    -1              [8] self.len    0x000001
[5] maxslot     128             [8] self.addr1  0x00
[6] stblindex   1               [9]             0x0001001b
dtree: Hit enter to see entries, [u]p or e[x]it:
stbl[0] = 5
[1] inumber     1857
[2] next        -1
[3] namlen      8
[4] name        00000861
[5] index       9473
dtree: Press enter for next, [m]odify, [u]p, or e[x]it >
stbl[1] = 6
[1] inumber     1858
[2] next        -1
[3] namlen      8
[4] name        00000862
[5] index       9484
dtree: Press enter for next, [m]odify, [u]p, or e[x]it >
stbl[2] = 7
[1] inumber     1859
[2] next        -1
[3] namlen      8
[4] name        00000863
[5] index       9495
dtree: Press enter for next, [m]odify, [u]p, or e[x]it > x
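
As a rough cross-check of the jfs_debugfs numbers above, the sketch below recomputes three of them from the test parameters. It rests on two assumptions that the output does not confirm: that the first usable directory index is 2, and that each directory index table slot takes 8 bytes. Treat it as arithmetic on the reported values, not as a statement about how the kernel allocates indexes.

```shell
#!/bin/bash
# Cross-check of the jfs_debugfs numbers. Two values are ASSUMPTIONS, not
# taken from the output: the first usable directory index (2) and the size
# of one index table slot (8 bytes).
files=984          # permanent files created before jfs_fsck failed
tmp_per_file=10    # create/delete cycles of "tmp" per permanent file
first_index=2      # assumed
slot_bytes=8       # assumed
block_size=4096    # aggregate block size reported by jfs_debugfs

# Creates per loop iteration: one permanent file plus the temporary ones.
per_iter=$(( 1 + tmp_per_file ))

# Index expected for permanent file 00000861 (the leaf entry shows 9473).
index_861=$(( first_index + 861 * per_iter ))

# Next free index after all 984 iterations (inode 2 shows 10826).
next_index=$(( first_index + files * per_iter ))

# Index table size rounded up to whole blocks (di_size shows 0x16000 = 90112).
table_bytes=$(( next_index * slot_bytes ))
table_size=$(( (table_bytes + block_size - 1) / block_size * block_size ))

echo "index of 00000861 = $index_861"
echo "di_next_index     = $next_index"
echo "index table size  = $table_size"
```

Under those assumptions the three results come out to 9473, 10826, and 90112, matching the leaf entry, di_next_index, and di_size shown above. If that reading is right, every create consumes a fresh index even when the file is deleted immediately, so the index table keeps growing with each create/delete cycle.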

Now that I know what triggers the corruption, I can mitigate the problem by limiting file deletions, but that's not a good long-term solution. I also have several corrupt volumes that need to be repaired. In the past I've had success running jfs_fsck, letting it dump the contents of the root node into lost+found, and recovering the content from there. But this is time-consuming, and I would love a fix that lets jfs_fsck repair the file system.

Again, I can reproduce this problem on demand and am happy to run additional tests.

Thanks,
Tim