Thread: [Madwifi-devel] madwifi-0.9.4: warning after netlink communication
From: Gregory G. <gre...@gm...> - 2009-04-05 00:05:46
Hello,

My kernel version is 2.6.27-11-generic. I'm using the current madwifi-0.9.4 branch. I've added netlink communication to the driver, but when I start sending messages from the kernel to user space, a warning appears:

[ 1618.441861] WARNING: at /build/buildd/linux-2.6.27/kernel/softirq.c:136 local_bh_enable+0x8b/0xc0()
[ 1618.441866] Modules linked in: wlan_scan_sta ath_rate_sample ath_pci wlan ath_hal(P) af_packet binfmt_misc rfcomm bridge stp bnep sco l2cap bluetooth ipv6 vboxdrv ppdev powernow_k8 cpufreq_conservative cpufreq_powersave cpufreq_ondemand cpufreq_userspace cpufreq_stats freq_table pci_slot sbs sbshc wmi video output container battery iptable_filter ip_tables x_tables ac sbp2 lp pcmcia evdev psmouse serio_raw pcspkr k8temp yenta_socket rsrc_nonstatic pcmcia_core isp1760 shpchp pci_hotplug snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_dummy snd_seq_oss snd_seq_midi parport_pc parport snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device snd soundcore snd_page_alloc button i2c_nforce2 i2c_core ext3 jbd mbcache sd_mod crc_t10dif sr_mod cdrom sg sata_nv pata_acpi pata_amd ohci1394 ata_generic ieee1394 libata forcedeth ehci_hcd ohci_hcd scsi_mod dock usbcore thermal processor fan fbcon tileblit font bitblit softcursor fuse
[ 1618.441982] Pid: 5422, comm: Xorg Tainted: P 2.6.27-11-generic #1
[ 1618.441986]
[ 1618.441987] Call Trace:
[ 1618.441991] <IRQ> [<ffffffff8024e9c4>] warn_on_slowpath+0x64/0x90
[ 1618.442007] [<ffffffff80502f15>] ? account_scheduler_latency+0x15/0x3c0
[ 1618.442014] [<ffffffff8023c8da>] ? __wake_up_common+0x5a/0x90
[ 1618.442021] [<ffffffff80234069>] ? __phys_addr+0x9/0x50
[ 1618.442027] [<ffffffff8039cdb9>] ? cfq_queue_empty+0x9/0x20
[ 1618.442036] [<ffffffff8038eefa>] ? elv_queue_empty+0x3a/0x50
[ 1618.442042] [<ffffffff80502f15>] ? account_scheduler_latency+0x15/0x3c0
[ 1618.442048] [<ffffffff8023e833>] ? __enqueue_entity+0x93/0xa0
[ 1618.442055] [<ffffffff80254c3b>] local_bh_enable+0x8b/0xc0
[ 1618.442061] [<ffffffff804735f5>] sk_filter+0x95/0xd0
[ 1618.442069] [<ffffffff804851bd>] netlink_unicast+0x19d/0x2e0
[ 1618.442091] [<ffffffffa0665821>] measure+0x631/0x820 [ath_pci]
[ 1618.442105] [<ffffffffa06748e4>] ath_intr+0x44/0xd00 [ath_pci]
[ 1618.442112] [<ffffffff802475f0>] ? wake_up_process+0x10/0x20
[ 1618.442119] [<ffffffff80503576>] ? _spin_lock_irq+0x16/0x20
[ 1618.442125] [<ffffffff8025a0d0>] ? run_timer_softirq+0x220/0x260
[ 1618.442132] [<ffffffff802717c4>] ? clockevents_program_event+0x54/0xa0
[ 1618.442138] [<ffffffff8029d5a5>] handle_IRQ_event+0x45/0x80
[ 1618.442144] [<ffffffff8029f18e>] handle_fasteoi_irq+0x9e/0x110
[ 1618.442151] [<ffffffff8021417c>] ? call_softirq+0x1c/0x30
[ 1618.442156] [<ffffffff80215b16>] do_IRQ+0x86/0x100
[ 1618.442161] [<ffffffff80212f0e>] ret_from_intr+0x0/0x29
[ 1618.442165] <EOI>
[ 1618.442169] ---[ end trace 3dc99b00cfc284c0 ]---

A piece of code where I send a message from the module:

    NLskb=alloc_skb(NLMSG_SPACE(MAX_PAYLOAD),GFP_ATOMIC);
    size = NLMSG_SPACE(sizeof(msg));
    NLskb = alloc_skb(size, GFP_KERNEL);
    if (!NLskb) {
        printk(KERN_ERR "MMM: Unable to allocate buffor.\n");
    }
    NLnlh = (struct nlmsghdr *)skb_put(NLskb, size - sizeof(*NLnlh));
    memcpy((struct nlmuxhdr *) NLMSG_DATA(NLnlh), &msg, sizeof(msg));
    NETLINK_CB(NLskb).dst_group = 0;
    netlink_unicast(NLnl_sk_, NLskb, mmm.pid, GFP_ATOMIC);

Where is the problem? Can anyone help me?

Thanks in advance.
From: Pavel R. <pr...@gn...> - 2009-04-05 05:59:07
Hello Gregory,

Please don't cross-post. Discussions about the code clearly belong to madwifi-devel.

Quoting Gregory Gas <gre...@gm...>:

> Hello,
>
> My kernel version is 2.6.27-11-generic. I'm using the current madwifi-0.9.4
> branch. I've added a netlink communication to the driver but when I start
> send messages from kernel to user space warning appears:
>
> [ 1618.441861] WARNING: at /build/buildd/linux-2.6.27/kernel/softirq.c:136
> local_bh_enable+0x8b/0xc0()

Generally, it helps if you look at that line in your kernel and post the context. Different kernels have warnings at different lines, and there is no error message with the description of the problem in your post.

> [ 1618.442055] [<ffffffff80254c3b>] local_bh_enable+0x8b/0xc0
> [ 1618.442061] [<ffffffff804735f5>] sk_filter+0x95/0xd0
> [ 1618.442069] [<ffffffff804851bd>] netlink_unicast+0x19d/0x2e0
> [ 1618.442091] [<ffffffffa0665821>] measure+0x631/0x820 [ath_pci]
> [ 1618.442105] [<ffffffffa06748e4>] ath_intr+0x44/0xd00 [ath_pci]

It's wrong to enable softirqs in a hardware interrupt handler. Read the kernel documentation about locking if you don't understand it.

> A piece of code where I send a message from the module:

You didn't give the function name, which would be useful for reading the above stack trace.

> NLskb=alloc_skb(NLMSG_SPACE(MAX_PAYLOAD),GFP_ATOMIC);
> size = NLMSG_SPACE(sizeof(msg));
> NLskb = alloc_skb(size, GFP_KERNEL);

If you use GFP_KERNEL in an interrupt handler, that's a problem, because a GFP_KERNEL allocation may sleep. It's better to avoid doing complex operations in interrupt handlers, as they are time critical.

Try localizing errors by eliminating parts of the code or by adding printk statements between the lines. The latter may not work in irq or softirq context if the warning is reported in user context. Once you have one line that causes the problem, you have a better chance of understanding what's wrong.

> Where is the problem?

I don't know.

> Can anyone help me?

You can help yourself by reading this:
http://www.catb.org/~esr/faqs/smart-questions.html#prune

--
Regards,
Pavel Roskin
From: Gregory G. <gre...@gm...> - 2009-04-06 20:55:48
Hello,

I can't solve the problem with that warning. I've read the documentation but still don't know what's wrong.

>> [ 1618.441861] WARNING: at /build/buildd/linux-2.6.27/kernel/softirq.c:136
>> local_bh_enable+0x8b/0xc0()
>
> Generally, it helps if you look at that line in your kernel and post the
> context.

Here is that context:

    EXPORT_SYMBOL(_local_bh_enable);

    static inline void _local_bh_enable_ip(unsigned long ip)
    {
        WARN_ON_ONCE(in_irq() || irqs_disabled()); //<------ line 136
    #ifdef CONFIG_TRACE_IRQFLAGS
        local_irq_disable();
    #endif
        /*
         * Are softirqs going to be turned on now:
         */
        if (softirq_count() == SOFTIRQ_OFFSET)
            trace_softirqs_on(ip);
        /*
         * Keep preemption disabled until we are done with
         * softirq processing:
         */
        sub_preempt_count(SOFTIRQ_OFFSET - 1);

        if (unlikely(!in_interrupt() && local_softirq_pending()))
            do_softirq();

        dec_preempt_count();
    #ifdef CONFIG_TRACE_IRQFLAGS
        local_irq_enable();
    #endif
        preempt_check_resched();
    }

> You didn't give the function name, which could be useful to read the above
> stack trace.

The name of this function is measure.

> Try localizing errors by eliminating parts of the code or by adding printk
> statements between the lines.

My function 'measure' (called every time a frame is received from the network) collects some statistics and periodically sends them to user space via a netlink socket.

I've noticed that the warning occurs at this line:

    netlink_unicast(NLnl_sk_, NLskb, mmm.pid, GFP_ATOMIC);

after the function 'measure' has been executed several times. The first time, the function always executes properly and no warning occurs. When I remove this line from my code, the warning never appears.

Thanks in advance.
From: Pavel R. <pr...@gn...> - 2009-04-06 21:09:03
On Mon, 2009-04-06 at 22:55 +0200, Gregory Gas wrote:

> Hello,
>
> I can't solve the problem with that warning. I've read documentation
> but still don't know what's wrong.
>
> [ 1618.441861] WARNING:
> at /build/buildd/linux-2.6.27/kernel/softirq.c:136
> local_bh_enable+0x8b/0xc0()
>
> Generally, it helps if you look at that line in your kernel
> and post the context.
>
> Here is that context:
>
> EXPORT_SYMBOL(_local_bh_enable);
>
> static inline void _local_bh_enable_ip(unsigned long ip)
> {
> WARN_ON_ONCE(in_irq() || irqs_disabled()); //<------

That's what I expected. Enabling softirqs ("bh" means bottom half, which is a variety of softirq) within a hardware IRQ handler is a priority inversion. It allows execution of low priority code while the high priority code is blocked. It's a sure way to a deadlock.

> You didn't give the function name, which could be useful to
> read the above stack trace.
> The name of this function is measure.

OK, I saw it in the stack trace, so it's called from the IRQ handler. That should be avoided whenever possible.

> Try localizing errors by eliminating parts of the code or by
> adding printk statements between the lines.
> My function 'measure' (called every time when any frame received from
> the network) collects some statistics and send it periodically via
> netlink socket to the user space.

Sending data should be done by a separate thread that is not run in the IRQ context.

> I've noticed that warning occurs in line:
>
> netlink_unicast(NLnl_sk_, NLskb, mmm.pid, GFP_ATOMIC);
>
> after the function 'measure' is executed several times. At the first
> time the function is always executed properly and no warning occurs.
> When I eliminate this line from my code warning doesn't appear at any
> time.

It means you cannot do it in IRQ handlers. But you can schedule a task in the IRQ handler, and that task will send the message.

--
Regards,
Pavel Roskin
From: Gregory G. <gre...@gm...> - 2009-04-16 16:28:53
Hi,

The netlink communication in my module finally works properly. The solution to my problem:

> Sending data should be done by a separate thread that is not run in the
> IRQ context.

Thank you for your help.

Regards,
GG
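
The fix reported above, moving the send out of IRQ context, can be modeled in plain user-space C. This is only a sketch: it is not madwifi or kernel code and was not posted in the thread, and struct sample, struct sample_queue, irq_side_record, worker_side_flush and tally_send are hypothetical names. The interrupt-side function only copies a record into a fixed-size ring and never allocates, sleeps, or sends; a separately scheduled worker drains the ring and performs the send. In a driver, the worker role might be filled by a work item queued with schedule_work() and the send step by netlink_unicast(). The model is single-threaded, so it leaves out the locking or memory barriers a real producer/consumer pair would need.

```c
#include <assert.h>

/* Hypothetical statistics record; the driver's real msg layout is not shown in the thread. */
struct sample {
    unsigned long frames;
    unsigned long bytes;
};

#define QUEUE_SLOTS 8u /* power of two, so the unsigned counters may wrap safely */

struct sample_queue {
    struct sample slots[QUEUE_SLOTS];
    unsigned int head;     /* next slot to write ("IRQ" side) */
    unsigned int tail;     /* next slot to read (worker side) */
    unsigned long dropped; /* records discarded because the ring was full */
};

/*
 * "Interrupt handler" side: copy the record and return at once.
 * No allocation, no sleeping, no sending. Returns 1 if queued, 0 if dropped.
 */
static int irq_side_record(struct sample_queue *q, const struct sample *s)
{
    if (q->head - q->tail == QUEUE_SLOTS) {
        q->dropped++;
        return 0;
    }
    q->slots[q->head % QUEUE_SLOTS] = *s;
    q->head++;
    return 1;
}

/* Callback that performs the actual send; returns 0 on success. */
typedef int (*send_fn)(const struct sample *s, void *ctx);

/*
 * Worker side: runs outside the "interrupt", drains the ring and hands
 * each record to send(). Stops early if send() fails, leaving the rest
 * queued. Returns the number of records sent.
 */
static unsigned int worker_side_flush(struct sample_queue *q, send_fn send, void *ctx)
{
    unsigned int sent = 0;

    while (q->tail != q->head) {
        if (send(&q->slots[q->tail % QUEUE_SLOTS], ctx) != 0)
            break;
        q->tail++;
        sent++;
    }
    assert(q->head - q->tail <= QUEUE_SLOTS);
    return sent;
}

/* Stand-in for the real send step: it only totals what it is given. */
static int tally_send(const struct sample *s, void *ctx)
{
    struct sample *total = ctx;

    total->frames += s->frames;
    total->bytes += s->bytes;
    return 0;
}
```

Calling irq_side_record() once per received frame and worker_side_flush(&q, tally_send, &total) from the deferred context keeps the per-frame path short. If the worker falls behind, records are counted in dropped rather than blocking the producer.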