Lockup in wpa supplicant

michaelm michael.melkonian at netcommwireless.com
Tue Jun 3 20:57:59 EDT 2014


On 30/05/14 17:19, Jouni Malinen wrote:
> On Fri, May 30, 2014 at 10:02:28AM +1000, michaelm wrote:
>> I am using wpa_supplicant 2.1 with linux wireless driver
>>
>> The problem I am having is that after a while, wpa_supplicant hangs in the function "radio_start_next_work" first giving the message:
>> "Delay radio work start until externally triggered scan completes"
> My first guess is that this is caused by a bug that was fixed after 2.1:
> http://w1.fi/cgit/hostap/commit/?id=1b5df9e591d9e97b69d7b2b69c295a5365f389c9
>
> How easily can you reproduce this issue? The log you included here did
> not have enough context to see what happened (the issue had happened
> before the start of that excerpt). I would suggest testing with the
> current hostap.git master branch snapshot since there have been a number
> of fixes to the radio work handling (in addition to the more specific
> fix that I mentioned above). If you can reproduce with the current
> snapshot, I would be interested in seeing a more complete debug log
> showing the issue.

The issue happened reliably and could be reproduced by leaving wpa_supplicant running on the device and disabling and then re-enabling the upstream AP.

I will try to send you more complete logs when I get a chance. However, your suggestion above appears to be correct:
after applying the fix from http://w1.fi/cgit/hostap/commit/?id=1b5df9e591d9e97b69d7b2b69c295a5365f389c9 I have not seen the problem recur.

Thank you for pointing this out.

>> Without full debugging, it seems that wpa_s->external_scan_running flag is set forever and hence wpa_supplicant never does any radio work.
>> I cannot really explain why there should be an externally triggered scan (as it would be the only condition to set the "external_scan_running" flag).
> That's not really true. The bug mentioned above resulted in a special
> scan case being triggered by wpa_supplicant as being assumed to be
> externally triggered.
>
>> I thought this may be to do with the fact that hostapd is running at the same time (as we use both STA and AP functionality simultaneously) but
>> this occurs even with AP function disabled and hostapd not running.
>>
>> If I remove the return, e.g.:
>> 	if (wpa_s && wpa_s->external_scan_running) {
>> 		/*
>> 		 * Log the message, but do not return; proceed with radio work.
>> 		 */
>> 		wpa_printf(MSG_DEBUG, "<<<<Delay radio work start until externally triggered scan completes>>>>");
>> 		/* return; */
>> 	}
>>
>> Everything seems to work fine. Furthermore, it doesn't seem that interrupting external scan if supplicant needs to do its own is such a bad idea.
> That's not the proper way to fix this. If there had actually been a real
> scan pending, cfg80211 would be blocking any new scan attempts.
>
I was not suggesting this as a proper fix, merely observing that the issue went away with the change.
Thank you for pointing out the correct fix. As mentioned above, it works.
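
To illustrate the failure mode discussed in this thread, here is a minimal sketch in C of a work queue gated by an "external scan running" flag. It is a simplified model, not the wpa_supplicant code: struct radio_model and the model_* functions are hypothetical names, and the way the flag gets set here is an assumption made for illustration. It only shows that if a scan is treated as external and the matching completion is never seen, queued radio work never starts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model; not the wpa_supplicant data structures. */
struct radio_model {
	bool external_scan_running; /* set while a scan is believed to be external */
	int pending_work;           /* queued radio work items */
	int started_work;           /* items that were allowed to start */
};

/*
 * Models the gate discussed in the thread: work is deferred for as long
 * as the flag is set. Returns true if one work item was started.
 */
static bool model_start_next_work(struct radio_model *r)
{
	assert(r != NULL);
	if (r->pending_work == 0)
		return false;
	if (r->external_scan_running)
		return false; /* delayed until the external scan completes */
	r->pending_work--;
	r->started_work++;
	return true;
}

/*
 * Scan-started event. own_scan says whether the model believes it
 * triggered the scan itself (assumption: a misclassified scan arrives
 * here with own_scan == false).
 */
static void model_scan_started(struct radio_model *r, bool own_scan)
{
	assert(r != NULL);
	if (!own_scan)
		r->external_scan_running = true;
}

/* Scan-completed event: clear the flag and retry deferred work. */
static void model_scan_completed(struct radio_model *r)
{
	assert(r != NULL);
	r->external_scan_running = false;
	(void) model_start_next_work(r);
}
```

In this model, calling model_scan_started(&r, false) without a later model_scan_completed(&r) leaves every subsequent model_start_next_work(&r) call returning false, which matches the symptom of radio work being delayed indefinitely. Commenting out the early return hides the symptom but leaves the stale flag in place.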

