More information (was Re: Event loop stall bug in hostapd-0.7.3)

Jouni Malinen j at w1.fi
Thu Jan 12 00:51:59 EST 2012


On Mon, Jan 09, 2012 at 10:59:17PM +0000, Bryan Phillippe wrote:
> Well, I was able to debug this problem more during a repro today.  I found a lot of information.  Basically, we're stuck in wpa_driver_nl80211_send_frame() from src/drivers/driver_nl80211.c here:

How easily can you reproduce this? What platform (CPU, etc.) do you
have on the AP? Would you be able to run hostapd under valgrind by any
chance?

> The sendmsg() is blocked on the monitor_sock, which is apparently blocking IO and unable to send for some reason.

I don't think that this is the real issue - the real issue is that
something got corrupted before the call:

> More information:
> 
> (gdb) p *(struct wpa_driver_nl80211_data *)0x100727d0
> $6 = {ctx = 0x10072700, netlink = 0x0, ioctl_sock = 13,
>   brname = "ath1", '\000' <repeats 11 times>, ifindex = 8388608,
>   if_removed = 21, capa = {key_mgmt = 16, enc = 13, auth = 18, flags = 16,
>     max_scan_ssids = 16, max_remain_on_chan = 16}, has_capability = 0,
>   operstate = 0, scan_complete_events = 0, nl_sock = 0x0, nl_sock_event = 0x0,
>   nl_cache = 0x0, nl_cache_event = 0x0, nl_cb = 0x0, nl80211 = 0x0,

It looks reasonably fine up to here, but the rest of the structure is
quite wrong:

>   auth_bssid = "\000\000\000\000\020\a", bssid = "'\364\000\000\000\020",
>   associated = 0,
>   ssid = '\000' <repeats 15 times>, "\021\020\a'\000\020\003M\020\020\003(@\000\000\001I", ssid_len = 268904872, nlmode = 268904872, ap_scan_as_station = 1,
>   assoc_freq = 13107200, monitor_sock = 2346, monitor_ifidx = 2346,
>   probe_req_report = 17432576, disable_11b_rates = 1,
>   pending_remain_on_chan = 0, added_bridge = 0, added_if_into_bridge = 0,
>   remain_on_chan_cookie = 0, send_action_cookie = 1154435205700780032,
>   filter_ssids = 0x0, num_filter_ssids = 0, first_bss = {drv = 0x0,
>     next = 0xffffffff, ifindex = 0,
>     ifname = '\000' <repeats 12 times>"\377, \377\377\377", beacon_set = 0},
>   eapol_sock = 0, default_if_indices = {0, 0, -1, 0, 0, 0, 0, -1, 0, 0, 0, 0,
>     -1, 0, 0, 0}, if_indices = 0x0, num_if_indices = -1, last_freq = 0,
>   last_freq_ht = 0}

I don't remember seeing this type of issue. Would you be able to test
the current development snapshot from hostap.git master branch? It would
be interesting to see whether this could have already been addressed.
valgrind might also be able to pinpoint the actual place where the
structure gets corrupted.
 
-- 
Jouni Malinen                                            PGP id EFC895FA

