More information (was Re: Event loop stall bug in hostapd-0.7.3)
Bryan.Phillippe at watchguard.com
Thu Jan 12 14:14:12 EST 2012
Oh, I forgot to post that:
[155186.946703] hostapd S 0fd16088 0 415 929 0x00000000
[155186.952994] Call Trace:
[155186.955541] [deab5ac0] [cdb0ada0] 0xcdb0ada0 (unreliable)
[155186.961050] [deab5b80] [c00089d0] __switch_to+0x9c/0xc4
[155186.966379] [deab5b90] [c03bfbb0] schedule+0x1e0/0x350
[155186.971621] [deab5be0] [c03c0250] schedule_timeout+0x158/0x19c
[155186.977567] [deab5c20] [c02cc184] sock_alloc_send_pskb+0x1a4/0x364
[155186.983857] [deab5c80] [c03aee7c] packet_sendmsg+0x7e4/0x9f4
[155186.989622] [deab5cf0] [c02c80cc] sock_sendmsg+0x90/0xc8
[155186.995039] [deab5dc0] [c02c8904] sys_sendmsg+0x234/0x2d8
[155187.000544] [deab5f10] [c02ca250] sys_socketcall+0x144/0x258
[155187.006309] [deab5f40] [c0011348] ret_from_syscall+0x0/0x3c
[155187.012008] --- Exception: c01 at 0xfd16088
[155187.012013] LR = 0x1004e430
So I guess sock_alloc_send_pskb() is where it's going to sleep. I'll keep digging into it.
As a test, I set the socket non-blocking with fcntl(drv->monitor_sock, F_SETFL, O_NONBLOCK) at creation time and made hostapd exit on any write error.
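For anyone who wants to try the same experiment, here is a minimal sketch of the idea, not the actual change made to hostapd. The helper names set_nonblocking() and try_send() are hypothetical, and it uses plain send() rather than the sendmsg() call hostapd makes. Unlike the one-line fcntl() above, it reads the current flags first so that O_NONBLOCK is added without clearing anything else.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: add O_NONBLOCK to fd, keeping its other status flags.
 * Returns 0 on success, -1 on error (errno set by fcntl). */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Hypothetical helper: attempt one send on a non-blocking socket.
 * Returns 0 if the kernel accepted the data, 1 if the call would have
 * blocked (no send buffer space), and -1 on any other error. */
static int try_send(int fd, const void *buf, size_t len)
{
    if (send(fd, buf, len, 0) >= 0)
        return 0;
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return 1;
    return -1;
}
```

With the socket in this mode, a send that would otherwise sleep waiting for buffer space should come back with EAGAIN, so the caller can log it or exit instead of stalling the event loop.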
On Jan 12, 2012, at 11:05 AM, Ben Greear wrote:
> On 01/12/2012 10:40 AM, Bryan Phillippe wrote:
>> I learned some things about this while debugging it with a non-optimized build of hostapd. I believe the private data structure only appeared corrupted in the debugger because of compiler optimization. That would explain why the sendmsg() is not immediately returning with EBADF on the monitor_sock.
>> I think what's actually happening is that the sendmsg() on the monitor_sock is indeed blocking. I guess that could be a problem in the nl80211 driver in the kernel instead of something wrong with hostapd? I'm going to start investigating that side of it now. If you have any advice on that, please let me know.
> I think a sysrq kernel stack trace would be interesting, and should show where the
> kernel is blocked. I can see reads blocking, but it seems a bit strange that
> writes should block forever...
> Ben Greear <greearb at candelatech.com>
> Candela Technologies Inc http://www.candelatech.com