Cooperative Linux / Bugs / #146 GPF trying to activate raid5 md array in 0.7.3

Arya - 2009-01-20

The 100% cpu crash seems reproducible if the cobd array is the first one I try to create that boot (i.e. the modules are being loaded for the first time?)

Also (though not verified), I get the 100% cpu crash when making an array based on loop block devices if the loop devices are created to point to files using relative paths rather than absolute?

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Henry N. - 2009-01-20

I assume, that the "pIII_sse" use a special register, we not have saved and restored in the passage page. That can crash the "math_state_restore".

One option would be to support these xmm-registers for "sse", but currently I don't have idea how.
The problem can be near the XMMS_SAVE/XMMS_RESTORE in top of include/asm-i386/xor.h:

#define XMMS_SAVE do { \ preempt_disable(); \ cr0 = read_cr0(); \ clts(); \

These operations are well known candidate for crash and for endless page faults. Same we know from function math_state_restore in arch/i386/traps.c. A "clts" while hardware interrupts are enabled can crash coLinux. maniputating the register cr0 is also a high risk.

The function "xor_block_pIII_sse" with special macros XMMS_SAVE/XMMS_RESTORE needs to check separately in a kernel test module, outside of the raid. Think, there needs something to do.

As temporally idea have disabled the usage of xmm- and mmx-registers under coLinux for the xor-raid-functions. Please check the next autobuild.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Henry N. - 2009-01-20

Loading and unloading a special version of module "raid456.ko" with only calling the function "calibrate_xor_block" does exactly your results: Crashing in math_state_restore, lots of page faults, and windows can not shutting down.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Henry N. - 2009-01-20

assigned_to: nobody --> henryn

status: open --> open-accepted
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Arya - 2009-01-21

Excellent, thanks for tracking that down, henryn. I'll test the daily tomorrow, with the temporary fix, if it will still help?

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Arya - 2009-01-21

Bad news,

md: md0 stopped.
md: bind<loop1>
md: bind<loop2>
md: bind<loop3>
md: bind<loop0>
md: md0 stopped.
md: unbind<loop0>
md: export_rdev(loop0)
md: unbind<loop3>
md: export_rdev(loop3)
md: unbind<loop2>
md: export_rdev(loop2)
md: unbind<loop1>
md: export_rdev(loop1)
md: bind<loop1>
md: bind<loop2>
md: bind<loop3>
md: bind<loop0>
raid5: measuring checksumming speed
8regs : 1292.800 MB/sec
8regs_prefetch: 1210.400 MB/sec
32regs : 1184.000 MB/sec
32regs_prefetch: 1428.800 MB/sec
raid5: using function: 32regs_prefetch (1428.800 MB/sec)
raid6: int32x1 605 MB/s
raid6: int32x2 563 MB/s
raid6: int32x4 444 MB/s
raid6: int32x8 424 MB/s
raid6: mmxx1 1355 MB/s
raid6: mmxx2 1579 MB/s
raid6: sse1x1 758 MB/s
raid6: sse1x2 1541 MB/s
raid6: sse2x1 1667 MB/s

colinux-console-nt only shows this much, which is missing the registers / EIP dump...

[<c0103c79>] show_stack_log_lvl+0xa9/0xd0
[<c01040eb>] show_registers+0x21b/0x3a0
[<c0104365>] die+0xf5/0x210
[<c010b6cc>] do_page_fault+0x38c/0x6e0
[<c03057fa>] error_code+0x6a/0x70
[<c010c5c8>] deactivate_task+0x18/0x30
[<c03035d7>] __sched_text_start+0x377/0x670
[<c0114814>] do_exit+0x7f4/0x960
[<c010447d>] die+0x20d/0x210
[<c010b6cc>] do_page_fault+0x38c/0x6e0
[<c03057fa>] error_code+0x6a/0x70
[<c010c5c8>] deactivate_task+0x18/0x30
[<c03035d7>] __sched_text_start+0x377/0x670
[<c0114814>] do_exit+0x7f4/0x960
[<c010447d>] die+0x20d/0x210
[<c010b6cc>] do_page_fault+0x38c/0x6e0
[<c03057fa>] error_code+0x6a/0x70
[<c010c5c8>] deactivate_task+0x18/0x30
[<c03035d7>] __sched_text_start+0x377/0x670
[<c0114814>] do_exit+0x7f4/0x960
[<c010447d>] die+0x20d/0x210
[<c010b6cc>] do_page_fault+0x38c/0x6e0
[<c03057fa>] error_code+0x6a/0x70
[<c010c5c8>] deactivate_task+0x18/0x30

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Arya - 2009-01-21

The previous post was with 0.8.0 daily. Here I tried again, creating a new array rather than reassembling like with the previous example. It seems to have worked?

md: bind<loop0>
md: bind<loop1>
md: bind<loop2>
md: bind<loop3>
raid5: measuring checksumming speed
8regs : 1577.200 MB/sec
8regs_prefetch: 1398.800 MB/sec
32regs : 788.000 MB/sec
32regs_prefetch: 861.600 MB/sec
raid5: using function: 8regs (1577.200 MB/sec)
raid6: int32x1 323 MB/s
raid6: int32x2 321 MB/s
raid6: int32x4 277 MB/s
raid6: int32x8 258 MB/s
raid6: mmxx1 739 MB/s
raid6: mmxx2 992 MB/s
raid6: sse1x1 462 MB/s
raid6: sse1x2 873 MB/s
raid6: sse2x1 1012 MB/s
raid6: sse2x2 1511 MB/s
raid6: using algorithm sse2x2 (1511 MB/s)
md: raid6 personality registered for level 6
md: raid5 personality registered for level 5
md: raid4 personality registered for level 4
raid5: device loop2 operational as raid disk 2
raid5: device loop1 operational as raid disk 1
raid5: device loop0 operational as raid disk 0
raid5: allocated 4196kB for md0
raid5: raid level 5 set md0 active with 3 out of 4 devices, algorithm 2
RAID5 conf printout:
--- rd:4 wd:3
disk 0, o:1, dev:loop0
disk 1, o:1, dev:loop1
disk 2, o:1, dev:loop2
RAID5 conf printout:
--- rd:4 wd:3
disk 0, o:1, dev:loop0
disk 1, o:1, dev:loop1
disk 2, o:1, dev:loop2
disk 3, o:1, dev:loop3
md: recovery of RAID array md0
md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
md: using 128k window, over a total of 10176 blocks.
md: md0: recovery done.
RAID5 conf printout:
--- rd:4 wd:4
disk 0, o:1, dev:loop0
disk 1, o:1, dev:loop1
disk 2, o:1, dev:loop2
disk 3, o:1, dev:loop3

Works ok with cobdX devices, I don't know why it crashed the firs time?

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Arya - 2009-01-21

Double-check same issue for raid6 (sse2x2?):

arya@co-calculon:~/raid-test$ sudo mdadm --create /dev/md0 --level 6 --raid-devices=4 /dev/cobd{5,6,7,8}

md: md0: raid array is not clean -- starting background reconstruction
raid5: device cobd8 operational as raid disk 3
raid5: device cobd7 operational as raid disk 2
raid5: device cobd6 operational as raid disk 1
raid5: device cobd5 operational as raid disk 0
raid5: allocated 4196kB for md0
raid5: raid level 6 set md0 active with 4 out of 4 devices, algorithm 2
RAID5 conf printout:
--- rd:4 wd:4
disk 0, o:1, dev:cobd5
disk 1, o:1, dev:cobd6
disk 2, o:1, dev:cobd7
disk 3, o:1, dev:cobd8
md: resync of RAID array md0
md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for resync.
md: using 128k window, over a total of 10176 blocks.
general protection fault: 0000 [#1]
PREEMPT
Modules linked in: raid456 xor md_mod ipv6 fuse
CPU: 0
<0>EIP: 0060:[<c0103d46>] Not tainted VLI
<0>EFLAGS: 00010002 (2.6.22.18-co-0.8.0 #1)
EIP is at math_state_restore+0x26/0x50
eax: 8005003b ebx: c15efab0 ecx: ffffffff edx: 00000000
esi: da812000 edi: 000003a0 ebp: da813db8 esp: da813db0
ds: 007b es: 007b fs: 0000 gs: 0000 ss: 0068
Process md0_raid5 (pid: 2795, ti=da812000 task=c15efab0 task.ti=da812000)
<0>Stack: da813e3c 000003a0 da813e24 c01038fe da813e3c ffffffff db686000 000003a0
<0> 000003a0 da813e24 db6863a0 0000007b 0000007b da810000 ffffffff e08c2d8b
<0> 00000060 00010206 000003a0 00001000 00000004 da813e44 da813e40 db687000
<0>Call Trace:
[<c0103bba>] show_trace_log_lvl+0x1a/0x30
[<c0103c79>] show_stack_log_lvl+0xa9/0xd0
[<c01040eb>] show_registers+0x21b/0x3a0
[<c0104365>] die+0xf5/0x210
[<c010534d>] do_general_protection+0x1ad/0x1f0
[<c03057fa>] error_code+0x6a/0x70
[<c01038fe>] device_not_available+0x2e/0x33
[<e08bdf8f>] compute_parity6+0x19f/0x330 [raid456]
[<e08bfa40>] handle_stripe+0x1550/0x16f0 [raid456]
[<e08c1007>] raid5d+0x2f7/0x450 [raid456]
[<e0846b00>] md_thread+0x30/0x100 [md_mod]
[<c0124462>] kthread+0x42/0x70
[<c01039c7>] kernel_thread_helper+0x7/0x10
=======================
Code: c3 8d 74 26 00 55 89 e5 83 ec 08 89 74 24 04 89 e6 89 1c 24 81 e6 00 e0 ff ff 8b 1e 0f 06 f6 43 0d 20 75 07 89 d8 e8 ea 38 00 00 <0f> ae
8b 10 02 00 00 83 4e 0c 01 fe 83 8d 01 00 00 8b 1c 24 8b
EIP: [<c0103d46>] math_state_restore+0x26/0x50 SS:ESP 0068:da813db0
note: md0_raid5[2795] exited with preempt_count 2

Related

Patches: #1

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Arya - 2009-01-21

Double-check same issue for raid6 (sse2x2?):

arya@co-calculon:~/raid-test$ sudo mdadm --create /dev/md0 --level 6 --raid-devices=4 /dev/cobd{5,6,7,8}

md: md0: raid array is not clean -- starting background reconstruction
raid5: device cobd8 operational as raid disk 3
raid5: device cobd7 operational as raid disk 2
raid5: device cobd6 operational as raid disk 1
raid5: device cobd5 operational as raid disk 0
raid5: allocated 4196kB for md0
raid5: raid level 6 set md0 active with 4 out of 4 devices, algorithm 2
RAID5 conf printout:
--- rd:4 wd:4
disk 0, o:1, dev:cobd5
disk 1, o:1, dev:cobd6
disk 2, o:1, dev:cobd7
disk 3, o:1, dev:cobd8
md: resync of RAID array md0
md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for resync.
md: using 128k window, over a total of 10176 blocks.
general protection fault: 0000 [#1]
PREEMPT
Modules linked in: raid456 xor md_mod ipv6 fuse
CPU: 0
<0>EIP: 0060:[<c0103d46>] Not tainted VLI
<0>EFLAGS: 00010002 (2.6.22.18-co-0.8.0 #1)
EIP is at math_state_restore+0x26/0x50
eax: 8005003b ebx: c15efab0 ecx: ffffffff edx: 00000000
esi: da812000 edi: 000003a0 ebp: da813db8 esp: da813db0
ds: 007b es: 007b fs: 0000 gs: 0000 ss: 0068
Process md0_raid5 (pid: 2795, ti=da812000 task=c15efab0 task.ti=da812000)
<0>Stack: da813e3c 000003a0 da813e24 c01038fe da813e3c ffffffff db686000 000003a0
<0> 000003a0 da813e24 db6863a0 0000007b 0000007b da810000 ffffffff e08c2d8b
<0> 00000060 00010206 000003a0 00001000 00000004 da813e44 da813e40 db687000
<0>Call Trace:
[<c0103bba>] show_trace_log_lvl+0x1a/0x30
[<c0103c79>] show_stack_log_lvl+0xa9/0xd0
[<c01040eb>] show_registers+0x21b/0x3a0
[<c0104365>] die+0xf5/0x210
[<c010534d>] do_general_protection+0x1ad/0x1f0
[<c03057fa>] error_code+0x6a/0x70
[<c01038fe>] device_not_available+0x2e/0x33
[<e08bdf8f>] compute_parity6+0x19f/0x330 [raid456]
[<e08bfa40>] handle_stripe+0x1550/0x16f0 [raid456]
[<e08c1007>] raid5d+0x2f7/0x450 [raid456]
[<e0846b00>] md_thread+0x30/0x100 [md_mod]
[<c0124462>] kthread+0x42/0x70
[<c01039c7>] kernel_thread_helper+0x7/0x10
=======================
Code: c3 8d 74 26 00 55 89 e5 83 ec 08 89 74 24 04 89 e6 89 1c 24 81 e6 00 e0 ff ff 8b 1e 0f 06 f6 43 0d 20 75 07 89 d8 e8 ea 38 00 00 <0f> ae
8b 10 02 00 00 83 4e 0c 01 fe 83 8d 01 00 00 8b 1c 24 8b
EIP: [<c0103d46>] math_state_restore+0x26/0x50 SS:ESP 0068:da813db0
note: md0_raid5[2795] exited with preempt_count 2

Related

Patches: #1

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Arya - 2009-01-21

Whoops, duplicate post.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Henry N. - 2009-01-21

> raid6: mmxx1 739 MB/s
> raid6: mmxx2 992 MB/s
> raid6: sse1x1 462 MB/s
> raid6: sse1x2 873 MB/s
> raid6: sse2x1 1012 MB/s
> raid6: sse2x2 1511 MB/s

Oh yea, for raid6 some more "sse" operations are hidden in other files.
I will fix it more generic, by disable all XMM/MMX features in the cpu caps. Please stand by and check the next autobuild.

> [...]
> Works ok with cobdX devices, I don't know why it crashed the firs time?

I'm afraid, the old module "xor.ko" was always loaded at the time you have extracted the modules to /lib/modules. You need to reboot after update the modules.

For the old raid5 I was testing very simple:
while true; do rmmod xor; insmod xor.ko; sleep 1; done
After 2...10 loops it's crashing.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Henry N. - 2009-01-22

Hello Arya,

it should be fixed generally now for all raid levels, and all other drivers.

You will find the build under http://www.colinux.org/snapshots/ and in autobuild.
Please remember to reboot after you have updated modules in your directory /lib/modules, and replace your kernel. In the kernel file "vmlinux" is the most important change.

A very simple test for the crash was:
while true; do rmmod raid456; rmmod xor; modprobe xor; modprobe raid456; sleep 1; done

PS: I don't know why SF has dropped my post from yesterday here. :-(

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Nobody/Anonymous - 2009-01-24

The temporary fix seems to be working fine for raid5 and raid6 under the cases I tested before. Thanks!

P.S. SSE2 flags is present in /proc/cpuinfo, but the RAID6 benchmark doesn't include sse2 anymore. Was that intentional?

Cheers

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Henry N. - 2009-01-24

status: open-accepted --> closed-accepted
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Henry N. - 2009-01-24

SSE2 is an extension of SSE. It is harmless, as long SSE is not enabled. SSE2 operations are not used without SSE support.

In the kernel SSE is the macro X86_FEATURE_XMM (Streaming SIMD Extensions). SSE2 is X86_FEATURE_XMM2.

The general problem is, that inside usage of MMX or SSE registers we can not allow an OS switch. These registers are not saved for OS switch. In native Linux kernel this is handled by disabling interrupts or by calling the preempt_disable() or kernel_fpu_begin(). But under coLinux we needs an OS switch for example for page allocations. Such will ends in an endless FPU fault loop.

I will close this bug now and write some about SSE and MMX into the ToDo list.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Henry N. - 2009-02-05

Problems from disabling MMX/XMM in CPU caps we have seen on Bug#2551241.

SSE-instructions needs to disable in kernel space only. SSE was used only for raid modules. Userland SSE-instractions should not disable. Eclipse/Java needs SIMD-Exceptions.

So, only Raid modules are disabled for such instructions now.
Committed as SVN revision r1212.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Henry N. - 2009-02-05

status: closed-accepted --> pending-accepted
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

SourceForge Robot - 2009-03-18

This Tracker item was closed automatically by the system. It was
previously set to a Pending status, and the original submitter
did not respond within 14 days (the time period specified by
the administrator of this Tracker).

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

SourceForge Robot - 2009-03-18

status: pending-accepted --> closed-accepted
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

GPF trying to activate raid5 md array in 0.7.3

Run Linux on Windows or other OSes, natively.

Group

Searches

Help

#146 GPF trying to activate raid5 md array in 0.7.3

Discussion

Related

Related