HostAP Stability (fixed??)

Joseph Chiu joseph at omnilux.net
Thu May 15 14:34:15 EDT 2003


I've been banging away on our Prism3 Bromax (Linksys) card with yesterday's
CVS.  After about 300 MB of sustained busybox-wget HTTP download at 11 Mbps
(about 5 Mbit/s throughput), the transfer stalls completely and the kernel
dmesg shows non-stop INFDROP events, followed eventually by a card reset.
After the reset, new IP traffic works, but the stalled HTTP client remains
stalled (packet loss somewhere?).

This is on a 266-MHz embedded MIPS-based system.

I'd be glad to run any experiments.


wlan0: INFDROP event
wlan0: INFDROP event
wlan0: INFDROP event
wlan0: INFDROP event
[repeats hundreds of times]
wlan0: INFDROP event
wlan0: INFDROP event
wlan0: INFDROP event
wlan0: INFDROP event
wlan0: hfa384x_setup_bap - timeout after
wlan0: prism2_tx - to BAP0 failed
wlan0: scheduled card reset
hostap_cs: wlan0: resetting card
prism2_pccard_cor_sreset: original COR 41
prism2_hw_init()
prism2_hw_config: initialized in 18587 iterations
wlan0: trying to read PDA from 0x007f0000: OK
wlan0: LinkStatus=1 (Connected)
wlan0: LinkStatus: BSSID=00:60:1d:f7:72:e0

-----Original Message-----
From: hostap-admin at shmoo.com [mailto:hostap-admin at shmoo.com]On Behalf Of
Jouni Malinen
Sent: Sunday, May 11, 2003 9:14 PM
To: hostap at shmoo.com
Subject: Re: HostAP Stability (fixed??)


On Sun, May 11, 2003 at 09:57:29PM -0500, Dave Hinkle wrote:

> The vast majority of our routers are soekris based, and we're having
> constant problems with locked up cards.  If you could make the work around
> patches available, we would really appreciate it.

OK. The workaround for recovering from these hangs is now available in
the CVS version. It should detect a hang and recover from it in about
five seconds.

Better yet, I think I found and fixed the real cause of the issue.
There was a race condition in reading and writing the event mask
(local->event_mask vs. the IntEn register). This enabled a scenario in
which the sw irq handler for BAP0 events (hostap_bap_tasklet) unmasked
those events and was immediately interrupted by the next event. This
interrupt could happen between the local->event_mask and IntEn writes.
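
As a rough illustration of the kind of race described above (a toy model
only, not the hostap driver code): the struct, the bit values, and all
function names below are hypothetical, and the real driver's write order
and failure mode may differ. The model keeps a software shadow next to a
stand-in for IntEn and injects an interrupt between the two writes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model; names and bit values are illustrative only. */
#define EV_BAP0  0x0001u              /* stands in for the BAP0 events */
#define EV_OTHER 0x0002u              /* any other event source */
#define EV_ALL   (EV_BAP0 | EV_OTHER)

struct model_dev {
    uint16_t event_mask;  /* software shadow (like local->event_mask) */
    uint16_t inten;       /* stands in for the IntEn register */
};

/* Interrupt handler: mask BAP0 events in the shadow and the register. */
static void model_irq(struct model_dev *d)
{
    assert(d != NULL);
    d->event_mask &= (uint16_t)~EV_BAP0;
    d->inten = d->event_mask;
}

/* Tasklet re-enabling BAP0 events with two separate writes.
 * irq_between != 0 injects an interrupt between those writes. */
static void model_tasklet_unmask(struct model_dev *d, int irq_between)
{
    uint16_t new_mask;

    assert(d != NULL);
    new_mask = d->event_mask | EV_BAP0;
    d->event_mask = new_mask;      /* write 1: shadow */
    if (irq_between)
        model_irq(d);              /* handler updates both copies */
    d->inten = new_mask;           /* write 2: stale value wins */
}

/* Returns 1 when the shadow and the register agree. */
static int model_consistent(const struct model_dev *d)
{
    assert(d != NULL);
    return d->event_mask == d->inten;
}

/* Shadow-free variant (an assumption about one possible fix): each
 * path stores a constant with a single write, so there is no window
 * between two writes. */
static void fixed_irq(uint16_t *inten)
{
    assert(inten != NULL);
    *inten = (uint16_t)(EV_ALL & ~EV_BAP0);
}

static void fixed_tasklet_unmask(uint16_t *inten)
{
    assert(inten != NULL);
    *inten = (uint16_t)EV_ALL;
}
```

In this model, calling model_tasklet_unmask() with the injected interrupt
leaves event_mask and inten disagreeing about whether BAP0 events are
enabled, while the shadow-free variant has no second copy to fall out of
sync.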

After fixing this race (by removing local->event_mask), I have been
unable to cause the hang anymore. So far, I have sent more than 1 GB of
data through without any problems. Of course this is not proof that
the problem is solved, but at least things look much better now. Btw,
a TCP flood alone did not seem to be enough to trigger the hang. I
needed to add some printk's to print something over a 9600 bps serial
console to be able to hang the system.

I did not remove the automatic recovery mechanism so that it will be
easier to get useful information about other potential issues. The
watchdog code will keep a counter of detected hangs. This is available
from /proc/net/hostap/wlan#/debug (sw_tick_stuck). Please let me know
if you see this counter being incremented. In addition, these events
will be reported in the kernel log.
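
For anyone who wants to poll the counter from a script or test harness,
here is a minimal sketch of pulling one value out of the debug file's
text. It assumes the file uses "name=value" entries separated by
whitespace; that layout is an assumption, so check the actual output on
your system. The helper name debug_counter() is hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Return the integer value of `key` in `text`, where `text` is assumed
 * to hold whitespace-separated "name=value" entries (an assumption
 * about the debug file layout). Returns -1 if the key is not found. */
static long debug_counter(const char *text, const char *key)
{
    size_t klen;
    const char *p;

    assert(text != NULL && key != NULL);
    klen = strlen(key);
    if (klen == 0)
        return -1;

    for (p = strstr(text, key); p != NULL; p = strstr(p + 1, key)) {
        /* Accept a match only at the start of an entry, so that a
         * longer name ending in `key` is not mistaken for it. */
        int at_start = (p == text) || isspace((unsigned char)p[-1]);

        if (at_start && p[klen] == '=')
            return strtol(p + klen + 1, NULL, 10);
    }
    return -1;
}
```

To use it, read the contents of /proc/net/hostap/wlan0/debug (or whichever
wlan# applies) into a NUL-terminated buffer and call
debug_counter(buf, "sw_tick_stuck"); a value that grows between two reads
means the watchdog detected another hang.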

--
Jouni Malinen                                            PGP id EFC895FA