Thread: [Madwifi-devel] Oops on module unload
From: Timothy B. T. <tte...@vt...> - 2004-06-29 04:17:17
First, the gritty details:

Driver info:

Jun 28 19:55:48 mado ath_hal: 0.9.9.12
Jun 28 19:55:48 mado wlan: 0.7.3.2 BETA
Jun 28 19:55:48 mado ath_pci: 0.8.6.1 BETA
Jun 28 19:55:48 mado ath_pci: 0.8.6.1 BETA
Jun 28 19:55:49 mado Setup queue (0) for WME_AC_BK
Jun 28 19:55:49 mado Setup queue (1) for WME_AC_BE
Jun 28 19:55:49 mado Setup queue (2) for WME_AC_VI
Jun 28 19:55:49 mado Setup queue (3) for WME_AC_VO
Jun 28 19:55:49 mado divert: allocating divert_blk for ath0
Jun 28 19:55:49 mado ath0: mac 5.6 phy 4.1 5ghz radio 1.7 2ghz radio 2.3
Jun 28 19:55:49 mado ath0: 11a rates: 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
Jun 28 19:55:49 mado ath0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
Jun 28 19:55:49 mado ath0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
Jun 28 19:55:49 mado ath0: turbo rates: 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
Jun 28 19:55:49 mado ath0: 802.11 address: 00:0f:20:95:93:75
Jun 28 19:55:49 mado ath0: Atheros 5212: mem=0x90080000, irq=11

This is from the CVS HEAD branch, updated Sunday evening.

System info:

Linux mado 2.6.5-gentoo-r1 #4 Tue May 25 10:25:28 EDT 2004 i686 Intel(R) Pentium(R) M processor 1600MHz GenuineIntel GNU/Linux

And the oops:

Jun 28 19:58:10 mado divert: freeing divert_blk for ath0
Jun 28 19:58:10 mado rc-scripts: ath0: error fetching interface information: Device not found
Jun 28 19:58:10 mado rc-scripts: ath0 does not exist
Jun 28 19:58:10 mado ath_pci: driver unloaded
Jun 28 19:58:10 mado wlan: driver unloaded
Jun 28 19:58:10 mado ath_hal: driver unloaded
Jun 28 19:58:12 mado Unable to handle kernel paging request at virtual address 02000040
Jun 28 19:58:12 mado printing eip:
Jun 28 19:58:12 mado c029e586
Jun 28 19:58:12 mado *pde = 00000000
Jun 28 19:58:12 mado Oops: 0000 [#1]
Jun 28 19:58:12 mado PREEMPT
Jun 28 19:58:12 mado CPU: 0
Jun 28 19:58:12 mado EIP: 0060:[<c029e586>] Tainted: P
Jun 28 19:58:12 mado EFLAGS: 00010212 (2.6.5-gentoo-r1)
Jun 28 19:58:12 mado EIP is at rtnetlink_fill_ifinfo+0x2c6/0x4c0
Jun 28 19:58:12 mado eax: 02000034 ebx: 00000004 ecx: 00000f38 edx: e5e3e048
Jun 28 19:58:12 mado esi: f763f000 edi: 00000004 ebp: 00000000 esp: e67afdfc
Jun 28 19:58:12 mado ds: 007b es: 007b ss: 0068
Jun 28 19:58:12 mado Process dhcpcd (pid: 8161, threadinfo=e67ae000 task=f3f1f420)
Jun 28 19:58:12 mado Stack: f5af89c0 00000004 00000004 e67afe1c 00000f80 00000011 00000f38 e5e3e000
Jun 28 19:58:12 mado 000f0012 f5af89c0 00000000 00000011 f763f000 c029ea4a f5af89c0 f763f000
Jun 28 19:58:12 mado 00000011 00000000 00000000 ffffffff c0399420 f763f000 00000006 e67ae000
Jun 28 19:58:12 mado Call Trace:
Jun 28 19:58:12 mado [<c029ea4a>] rtmsg_ifinfo+0x5a/0xe0
Jun 28 19:58:12 mado [<c029ef45>] rtnetlink_event+0x35/0x68
Jun 28 19:58:12 mado [<c012899d>] notifier_call_chain+0x2d/0x50
Jun 28 19:58:12 mado [<c0296dd3>] netdev_wait_allrefs+0xe3/0x130
Jun 28 19:58:12 mado [<c0296f4b>] netdev_run_todo+0x12b/0x220
Jun 28 19:58:12 mado [<c02d8085>] devinet_ioctl+0x295/0x620
Jun 28 19:58:12 mado [<c02da893>] inet_ioctl+0xe3/0x130
Jun 28 19:58:12 mado [<c0308a61>] packet_ioctl+0x1d1/0x200
Jun 28 19:58:12 mado [<c028ccd5>] sock_ioctl+0x115/0x330
Jun 28 19:58:12 mado [<c016726c>] sys_ioctl+0x11c/0x2b0
Jun 28 19:58:12 mado [<c011f4fe>] sys_time+0x2e/0x70
Jun 28 19:58:12 mado [<c0107429>] sysenter_past_esp+0x52/0x71
Jun 28 19:58:12 mado
Jun 28 19:58:12 mado Code: 8b 40 0c 31 db ba ff ff ff ff 89 d1 83 c0 08 89 c7 89 44 24

And now my high-level interpretation:

In net/core/dev.c: When unregister_netdev() gets called, it calls the netdevice notifier chain, announcing the unload. This appears to work okay. However, the actual device deregistration must wait until the reference count on the device drops to zero. unregister_netdevice appears to call net_set_todo() to defer this to later. netdev_run_todo(), called later, does the wait with a call to netdev_wait_allrefs(). This waits for dev->refcnt to drop to 0, and if it does not after a certain amount of time, it calls down the notifier chain AGAIN.

This second call to the notifiers is where the oops occurs. Specifically, in net/core/rtnetlink.c, rtnetlink_fill_ifinfo is called to get information about the device. As near as I can tell, the oops occurs on line 188, where it attempts to access dev->qdisc_sleeping->ops->id.

I'm not entirely convinced that this is a madwifi problem and not a kernel problem, since I don't believe madwifi is doing anything to manage the Qdisc structures. But this is my first attempt at kernel debugging, so I could be wrong. The code that _would_ worry me is the call to dev->get_stats(), which looks as if it would reach madwifi code after the unregister_netdev() call has returned. But it appears to oops before that point.

The oops only happens sometimes. I attribute this to the fact that the second call to notifier_call_chain() only occurs if the reference count has not dropped to zero after a sufficient amount of time.
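
To make the "only sometimes" behaviour concrete, here is a small userspace model of the sequence described above: one notifier call at unregister time, then a wait loop that calls the notifiers again whenever the reference count has stayed non-zero for too long. This is only a sketch, not the kernel code. struct fake_dev, fake_notify_unregister(), fake_unregister() and the tick-based timing are all hypothetical names and simplifications invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net_device: only what the model needs. */
struct fake_dev {
    int refcnt;        /* outstanding references to the device */
    int notify_count;  /* how many times the notifier chain has run */
};

/* Stand-in for running the notifier chain: it only counts calls. */
static void fake_notify_unregister(struct fake_dev *dev)
{
    dev->notify_count++;
}

/*
 * Model of unregister followed by the deferred wait.
 *   rebroadcast_ticks: ticks without progress before notifying again
 *   release_tick:      tick at which the last reference holder lets go
 * Returns the total number of notifier calls made.
 */
static int fake_unregister(struct fake_dev *dev, int rebroadcast_ticks,
                           int release_tick)
{
    int since_notify = 0;
    int tick;

    assert(dev != NULL);
    assert(rebroadcast_ticks > 0);

    /* First pass: announce the unregister once, unconditionally. */
    fake_notify_unregister(dev);

    /* Deferred pass: wait for the reference count to reach zero. */
    for (tick = 0; dev->refcnt != 0; tick++) {
        if (tick >= release_tick) {
            dev->refcnt = 0;              /* last holder drops its reference */
            break;
        }
        if (++since_notify >= rebroadcast_ticks) {
            fake_notify_unregister(dev);  /* the second (or later) call */
            since_notify = 0;
        }
    }
    return dev->notify_count;
}
```

In this model a holder that releases quickly sees exactly one notifier call, while a slow holder triggers repeated calls. That matches the observation that the oops path is only reached when the reference count is slow to drop.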

The destructive operations that net/core/dev.c appears to call between the two notifier chain events are:

  dev_mc_discard(dev);
    (seems safe)
  if (dev->uninit) dev->uninit(dev);
    (seems to be NULL, as near as I can tell)
  free_divert_blk(dev);
    (frees dev->divert, which doesn't seem to be used by rtnetlink_fill_ifinfo)
  netdev_unregister_sysfs(dev);
    (frees dev->class_dev, which doesn't seem to be used by rtnetlink_fill_ifinfo)

So I'm not quite sure who's corrupting the qdisc memory.
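
As a cross-check on the oops, the faulting address can be reproduced from the register dump and the first bytes of the Code: line. The sketch below assumes the Code: bytes start at the faulting EIP. It decodes only the one instruction form seen here (opcode 0x8b with an 8-bit displacement) and is not a general disassembler. The helper names decode_mov_disp8() and effective_address() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded fields of a 32-bit `mov disp8(%base), %dst` (opcode 0x8b, mod=01). */
struct mov_disp8 {
    unsigned dst_reg;   /* ModRM.reg: destination register number (0 = eax) */
    unsigned base_reg;  /* ModRM.rm: base register number (0 = eax) */
    int8_t disp;        /* signed 8-bit displacement */
};

/*
 * Returns 1 and fills *out if code[] starts with 0x8b followed by a ModRM
 * byte with mod == 01 and no SIB byte; returns 0 for anything else.
 */
static int decode_mov_disp8(const uint8_t *code, struct mov_disp8 *out)
{
    uint8_t modrm;

    assert(code != NULL && out != NULL);

    if (code[0] != 0x8b)
        return 0;
    modrm = code[1];
    if ((modrm >> 6) != 1)   /* mod must be 01: base register + disp8 */
        return 0;
    if ((modrm & 7) == 4)    /* rm == 100 would mean a SIB byte follows */
        return 0;

    out->dst_reg = (modrm >> 3) & 7;
    out->base_reg = modrm & 7;
    out->disp = (int8_t)code[2];
    return 1;
}

/* Address the instruction reads: base register value plus displacement. */
static uint32_t effective_address(uint32_t base_value, int8_t disp)
{
    return base_value + (uint32_t)(int32_t)disp;
}
```

Fed the bytes 8b 40 0c and eax = 0x02000034 from the dump, this gives a load from 0x02000034 + 0x0c = 0x02000040, which is the reported faulting address. So the pointer held in eax was already bad before the load. Whether 0x0c is the offset of ops inside struct Qdisc on this kernel, which would make eax the value of dev->qdisc_sleeping, is an assumption that would need checking against the 2.6.5 headers.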
From: Sam L. <sa...@er...> - 2004-06-29 05:07:52
Please use the WPA branch.

Sam