From: EpsiloN E. <ep...@gm...> - 2012-07-19 05:55:06
Hi,
In parallel I raised the same question on the QuakeNet #debian channel. After
some research into iscsi_trgt.ko, ix found:
16:19 <@ix> i checked on your iscsi problem and unfortunately it's not one of
the easily fixed ones | the BUG is caused by disk_check_ua
17:11 <@ix> hard to tell. disk_check_ua tries to fetch a unit attention
structure from a linked list and disk_check_ua does not expect that there
is no such structure. which is why it's using the fetched ua without
checking, hence NULL pointer deref when there is no such structure | i
don't know much about the scsi subsystem, but my best guess would be that
newer kernels also might not always deliver a ua | you could try to handle
the case that there is no ua
I hope this information helps in fixing the bug.
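
To make ix's description concrete, here is a minimal user-space sketch of
the pattern, written in Python for brevity. It is not the iscsi_trgt.ko
code (that is kernel C): UnitAttention, fetch_ua, check_ua_unguarded and
check_ua_guarded are hypothetical names invented for illustration, and a
plain list stands in for the linked list. It only shows the difference
between using a fetched entry unchecked and handling the case where there
is none.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class UnitAttention:
    """Hypothetical stand-in for a pending unit attention entry."""
    initiator_id: int
    asc: int   # additional sense code
    ascq: int  # additional sense code qualifier


def fetch_ua(pending: List[UnitAttention], initiator_id: int) -> Optional[UnitAttention]:
    """Remove and return the first entry for this initiator, or None if there is none."""
    for i, ua in enumerate(pending):
        if ua.initiator_id == initiator_id:
            return pending.pop(i)
    return None


def check_ua_unguarded(pending: List[UnitAttention], initiator_id: int) -> Tuple[int, int]:
    """Uses the fetched entry without checking it, as in the reported failure mode."""
    ua = fetch_ua(pending, initiator_id)
    return (ua.asc, ua.ascq)  # raises AttributeError when ua is None


def check_ua_guarded(pending: List[UnitAttention], initiator_id: int) -> Optional[Tuple[int, int]]:
    """Handles the 'no unit attention pending' case explicitly."""
    ua = fetch_ua(pending, initiator_id)
    if ua is None:
        return None  # nothing to report; the caller carries on normally
    return (ua.asc, ua.ascq)
```

With an empty list, check_ua_unguarded fails (the Python analogue of the
NULL pointer dereference), while check_ua_guarded returns None. Whether a
similar check is the right fix inside the real disk_check_ua is something
only the IET developers can confirm.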
Some time later I upgraded the kernel to 3.2.0-0.bpo.2-686-pae. That went
fine, without any problems, but the DKMS build of iscsitarget now fails:
DKMS make.log for iscsitarget-1.4.20.2 for kernel 3.2.0-0.bpo.2-686-pae (i686)
make: Entering directory `/usr/src/linux-headers-3.2.0-0.bpo.2-686-pae'
LD /var/lib/dkms/iscsitarget/1.4.20.2/build/built-in.o
LD /var/lib/dkms/iscsitarget/1.4.20.2/build/kernel/built-in.o
CC [M] /var/lib/dkms/iscsitarget/1.4.20.2/build/kernel/tio.o
CC [M] /var/lib/dkms/iscsitarget/1.4.20.2/build/kernel/iscsi.o
CC [M] /var/lib/dkms/iscsitarget/1.4.20.2/build/kernel/nthread.o
CC [M] /var/lib/dkms/iscsitarget/1.4.20.2/build/kernel/wthread.o
/var/lib/dkms/iscsitarget/1.4.20.2/build/kernel/wthread.c: In function ‘worker_thread’:
/var/lib/dkms/iscsitarget/1.4.20.2/build/kernel/wthread.c:75: error: implicit declaration of function ‘copy_io_context’
make[4]: *** [/var/lib/dkms/iscsitarget/1.4.20.2/build/kernel/wthread.o] Error 1
make[3]: *** [/var/lib/dkms/iscsitarget/1.4.20.2/build/kernel] Error 2
make[2]: *** [_module_/var/lib/dkms/iscsitarget/1.4.20.2/build] Error 2
make[1]: *** [sub-make] Error 2
make: *** [all] Error 2
make: Leaving directory `/usr/src/linux-headers-3.2.0-0.bpo.2-686-pae'
Any ideas on how to solve this issue? I look forward to your feedback, thanks.
Regards,
Igor.
On Mon, Jul 16, 2012 at 3:34 PM, EpsiloN EpsiloN <ep...@gm...> wrote:
> Hi,
>
> I measured IO for some time before and after I tuned the HP SmartArray
> (packet elevator disabled, extended surface scan delay, write cache 100%,
> device write cache enabled). During the measurements I did not find any IO
> problems, either before or after the tuning.
>
> Here are my results taken after the SmartArray tuning:
>
> # sar
> 12:00:01 AM   CPU     %usr    %nice     %sys  %iowait   %steal     %irq    %soft   %guest    %idle
> Average:      all     0.01     0.00     1.64     0.00     0.00     0.02     0.52     0.00    97.81
>
> 12:00:01 AM    pgpgin/s  pgpgout/s    fault/s   majflt/s   pgfree/s  pgscank/s  pgscand/s  pgsteal/s     %vmeff
> Average:          17.21    6512.16      72.98       0.00    1730.86       0.00       0.00       0.00       0.00
>
> 12:00:01 AM   kbmemfree  kbmemused   %memused  kbbuffers   kbcached   kbcommit    %commit
> Average:        2846544     258640       8.33     114553      77173     488793       13.6
>
> 12:00:01 AM  IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s
> Average:        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00
> Average:      eth2    112.56      0.14     35.88      0.01      0.00      0.00      0.00
> Average:      eth3   4840.63      0.14     31.34      0.01      0.00      0.00      0.00
> Average:      eth0     15.56   2509.20      6.05     55.30      0.00      0.00      0.02
> Average:      eth1     12.15      0.14      0.96      0.01      0.00      0.00      0.02
>
> 12:00:01 AM  IFACE   rxerr/s   txerr/s    coll/s  rxdrop/s  txdrop/s  txcarr/s  rxfram/s  rxfifo/s  txfifo/s
> Average:        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
> Average:      eth2      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
> Average:      eth3      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
> Average:      eth0      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
> Average:      eth1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
>
> After this tuning the server worked for a week, maybe a little longer, and
> then got stuck again:
>
> # messages
> Jul 14 11:23:46 ISCSI kernel: [951754.937998] BUG: unable to handle kernel
> NULL pointer dereference at 00000010
>
> Any idea what can be done? Should I upgrade to the newest kernel? As a
> reminder, right now I am using Debian 6.0.5 with the 2.6.32-5-686 kernel.
>
>
>
>
> Another interesting thing is that Debian changes the MAC addresses of the
> interfaces:
>
> # daemon.log
> Jul 14 01:16:56 ISCSI ntpd[2355]: 192.168.10.3 interface 192.168.10.12 ->
> 192.168.10.9
> Jul 14 01:26:56 ISCSI ntpd[2355]: 192.168.10.3 interface 192.168.10.9 ->
> 192.168.10.8
> Jul 14 01:36:56 ISCSI ntpd[2355]: 192.168.10.3 interface 192.168.10.8 ->
> 192.168.10.12
> Jul 14 01:46:56 ISCSI ntpd[2355]: 192.168.10.3 interface 192.168.10.12 ->
> 192.168.10.9
> Jul 14 03:26:56 ISCSI ntpd[2355]: 192.168.10.3 interface 192.168.10.9 ->
> 192.168.10.8
> Jul 14 03:36:56 ISCSI ntpd[2355]: 192.168.10.3 interface 192.168.10.8 ->
> 192.168.10.12
> Jul 14 03:46:56 ISCSI ntpd[2355]: 192.168.10.3 interface 192.168.10.12 ->
> 192.168.10.12
> Jul 14 03:56:56 ISCSI ntpd[2355]: 192.168.10.3 interface 192.168.10.12 ->
> 192.168.10.9
> Jul 14 04:06:56 ISCSI ntpd[2355]: 192.168.10.3 interface 192.168.10.9 ->
> 192.168.10.12
> Jul 14 04:16:56 ISCSI ntpd[2355]: 192.168.10.3 interface 192.168.10.12 ->
> 192.168.10.9
> Jul 14 04:26:56 ISCSI ntpd[2355]: 192.168.10.3 interface 192.168.10.9 ->
> 192.168.10.8
> Jul 14 04:36:56 ISCSI ntpd[2355]: 192.168.10.3 interface 192.168.10.8 ->
> 192.168.10.12
> Jul 14 05:36:56 ISCSI ntpd[2355]: 192.168.10.3 interface 192.168.10.12 ->
> 192.168.10.12
>
> # dmesg
> [ 8.700280] udev[591]: renamed network interface eth0 to eth0-eth2
> [ 8.700754] udev[592]: renamed network interface eth1 to eth1-eth3
> [ 8.708680] udev[562]: renamed network interface eth2 to eth0
> [ 8.709146] udev[565]: renamed network interface eth3 to eth1
> [ 8.750784] udev[591]: renamed network interface eth0-eth2 to eth2
> [ 8.751307] udev[592]: renamed network interface eth1-eth3 to eth3
>
> This looks like some kind of "Layer 2 fail-over". My server has 4 physical
> NICs and 4 IPs assigned to the ethX interfaces. I unplugged three cables and
> was still able to ping all four IP addresses. My ARP table showed that all
> of the server's IP addresses now resolve to the same MAC, the MAC of the
> remaining interface. Is this some kind of normal behavior?
>
> Regards,
> Igor.
>
>
>
> On Fri, Jun 29, 2012 at 2:45 PM, EpsiloN EpsiloN <ep...@gm...> wrote:
>
>> Hello,
>>
>> Thanks for the replies.
>> So I will measure the IO that the filesystem is requesting and the IO that
>> I get from the RAID controller.
>> I will also create as many targets as possible, and I will let you know
>> the results.
>>
>> Regards,
>> Igor.
>>
>>
>> On Fri, Jun 29, 2012 at 1:45 PM, Emmanuel Florac <ef...@in...> wrote:
>>
>>> On Thu, 28 Jun 2012 19:59:34 +0300
>>> Pasi Kärkkäinen <pa...@ik...> wrote:
>>>
>>> > IETD shouldn't crash anyway..
>>> >
>>>
>>> Sure. As I said, he triggered a bug. However, as he's using the standard
>>> Debian port, his best bet is a workaround rather than switching to
>>> trunk and recompiling (without any guarantee that this particular bug has
>>> been squashed).
>>>
>>> --
>>> ------------------------------------------------------------------------
>>> Emmanuel Florac | Direction technique
>>> | Intellique
>>> | <ef...@in...>
>>> | +33 1 78 94 84 02
>>> ------------------------------------------------------------------------
>>>
>>
>>
>