Hi Derek,

Setting txpower kick-starts the queue immediately. Thanks a lot.

Now my question is what hangs the card. What does txqactive() do, and why does it keep returning false? My understanding is that the card stops a tx queue while it is sending a frame and restarts the queue once the transmission completes. txqactive() returns false when the tx queue is stopped. If txqactive() always returns false, that would mean the card is stuck on a frame. Is this correct?

I see that ath_tx_timeout() calls ath_reset(). Why is this function not called when the card hangs?


Ming.


On Sun, Sep 28, 2008 at 6:38 AM, Derek Smithies <derek@indranet.co.nz> wrote:
Hi,
 do an iwlist ath0 scan
 - does this clear the queue?
 - other iwlist/iwconfig type commands might fix it..
 Hmm,
  iwconfig ath0 txpower old_setting_for_power

 could fix it - this does a complete reset of the HAL. It does not
reset the slot time, but it is worth trying.

Using iwconfig to set the power is less invasive than the scan option.

Please report back if either of these commands help.

Derek.



On Sun, 28 Sep 2008, ming li wrote:

Hi,

We are experiencing a problem in the madwifi driver which makes it unable to
send any packets out of the interface. We are using an Atheros XXX card in
Adhoc-demo mode, and our packets belong to the Background TOS and Voice TOS.

We observe that under heavy load, or over long (or flaky) links, the outgoing
queue (as seen by tc -s qdisc) would be full and no packets would be going
out.

On twiddling with athdebug, we found that in the function
ath_tx_tasklet_q0123, ath_tx_processq was not being called (probably because
txqactive() always returns false).

What's more baffling is that, most of the time, this problem only affects
packets sent through the non-best-effort queues. That is, packets sent on,
say, the background queue (TOS 0x8) are not sent out of the interface, and
tc -s qdisc shows them as dropped, whereas packets sent on the best effort
queue (TOS of 0) go out without any problem -- and the debug message at the
start of ath_tx_processq is seen.

We have tried various workarounds, such as: a) disabling the timing out of
entries from ic->ic_sta, b) repeatedly sending background pings via the best
effort and background queues to 'force' packets out and get
ath_tx_tasklet_q0123 called, c) increasing the priority of the ksoftirqd
threads so that the queue can be drained quickly, etc.

All these approaches fail, leaving us with the only option of destroying and
recreating the interface, which of course clears up the queue.

For some reason, this happens more often over either bad links or when the
sender sees a heavy network load -- like when multiple transfers are taking
place in its vicinity.

We are using madwifi version 0.9.3.2 in 802.11b (11 Mbps), Adhoc-demo mode,
and setting WMM parameters. The kernel version is 2.6.21. wifi0 has a
pfifo_fast queue, whereas ath0 has queueing disabled. This is being tested on
an Intel Mac mini with an 802.11b/g Atheros card:

/* lspci output */

02:00.0 Ethernet controller: Atheros Communications, Inc. Unknown device 001c (rev 01)
 Subsystem: Apple Computer Inc. Unknown device 0086
 Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
 Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
 Latency: 0, Cache Line Size: 256 bytes
 Interrupt: pin A routed to IRQ 16
 Region 0: Memory at 90100000 (64-bit, non-prefetchable) [size=64K]
 Capabilities: [40] Power Management version 2
     Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
     Status: D0 PME-Enable- DSel=0 DScale=0 PME-
 Capabilities: [50] Message Signalled Interrupts: Mask- 64bit- Queue=0/0 Enable-
     Address: 00000000  Data: 0000
 Capabilities: [60] Express Legacy Endpoint IRQ 0
     Device: Supported: MaxPayload 128 bytes, PhantFunc 0, ExtTag-
     Device: Latency L0s <512ns, L1 <64us
     Device: AtnBtn- AtnInd- PwrInd-
     Device: Errors: Correctable- Non-Fatal- Fatal- Unsupported-
     Device: RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
     Device: MaxPayload 128 bytes, MaxReadReq 512 bytes
     Link: Supported Speed 2.5Gb/s, Width x1, ASPM L0s L1, Port 0
     Link: Latency L0s <512ns, L1 <64us
     Link: ASPM Disabled RCB 128 bytes CommClk+ ExtSynch-
     Link: Speed 2.5Gb/s, Width x1
 Capabilities: [90] MSI-X: Enable- Mask- TabSize=1
     Vector table: BAR=0 offset=00000000
     PBA: BAR=0 offset=00000000
 Capabilities: [100] Advanced Error Reporting
 Capabilities: [140] Virtual Channel

Has anyone seen something like this? What workaround could we try?

Under what conditions will txqactive() return false? It calls
ah_getTxIntrQueue() in the HAL, which is closed source.

PS:
i. We have verified that the same problem also happens with a 0.9.4 madwifi
release.
ii. Please feel free to ask for additional information.
iii. The associated madwifi ticket is at https://madwifi.org/ticket/2140

Thanks.


--
Derek Smithies Ph.D.
IndraNet Technologies Ltd.
Email: derek@indranet.co.nz
ph +64 3 365 6485
Web: http://www.indranet-technologies.com/