mac80211 + hostapd: EAPOL frames rate selection

Mon Dec 17 12:26:23 EST 2012

On Fri, Jul 29, 2011 at 11:05 AM, Felix Fietkau <nbd at openwrt.org> wrote:
> On 2011-07-29 7:55 PM, Helmut Schaa wrote:
>>
>> On Fri, Jul 29, 2011 at 7:37 PM, Jouni Malinen<j at w1.fi>  wrote:
>>>
>>>  On Fri, Jul 29, 2011 at 11:14:20AM +0200, Helmut Schaa wrote:
>>>>
>>>>  I just noticed that EAPOL frames generated by hostapd during the 4-way
>>>>  handshake are sent out by mac80211 using a rate as selected by the rc
>>>>  algorithm for data frames. In my case minstrel_ht selects a MCS rate for
>>>>  11n clients which sometimes results in a 4-way handshake timeout under
>>>>  low signal conditions.
>>>
>>>
>>>  That sounds like an issue that should be fixed in the rate control
>>>  algorithm if it is indeed using unsuitable rate immediately after
>>>  association. Dropping data frames completely is not really a good thing
>>>  regardless of whether they are EAPOL packets or not..
>>
>>
>> True. Nevertheless other drivers like madwifi or the ralink legacy drivers also
>> force EAPOL frames to a low rate. That's why I had the idea in the first place.
>
> I think this mainly occurs on devices/drivers that use minstrel_ht, but lack
> proper multi-rate retry. minstrel_ht may pick a high rate for probing and on
> devices with a simple fallback table (e.g. rt2x00), it may give up on the
> frame before having tried a low enough rate.
>
> On ath9k this shouldn't be an issue with minstrel_ht, because it always
> keeps the max_prob_rate in a retry slot.

[reviving an old thread]

The problem is that with other algorithms (like the ath9k built-in
rate control algorithm) this may not be the case.  Even in the case
where a "probable" rate is left in the bottom retry slot, in a
congested network that may just mean you end up with one "reasonable
probability" shot -- not exactly foolproof.  Based on experiences
we've been gaining on various classroom deployments and home scenarios
in ChromeOS, we've found that EAP/EAPOL frames as well as DHCP are
particularly vulnerable to creating large disruptions in connectivity
-- either significantly lengthening or completely stymying connection
attempts.  DNS is somewhat vulnerable as well, but the penalty for
failure there, though user-visible, is more mitigable in the
application.  Due to the small number of outgoing retries in all of
these algorithms and the penalty for failure, it seems like we should
do a better job of giving such attempts a chance to succeed.

I'd like to gauge whether adding a flag to supplicant / cfg80211 to
explicitly curtail "optimism" on the part of the rate control
algorithm during this period of time would be acceptable upstream.
This flag would cause rate control to shift more of the retries into
more conservative bit-rates, and would be cleared by the connection
manager through wpa_supplicant in a similar manner to the "authorized"
flag used to curtail outbound traffic until authentication completes.
Depending on the underlying driver implementation the response could
be (a) do nothing (my rate control algorithm is awesome and all tx'ed
packets _always_ go through), (b) set a fixed retry rate sequence, (c)
change parameters in the rc algorithm to dive deeper (and stay deeper)
into the rate table on transmit failure.
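
To make the idea concrete, here is a rough sketch of what "shifting more
of the retries into more conservative bit-rates" could look like for a
multi-rate retry chain. It is a toy model in Python, not mac80211 code:
RetryStage, delivery_probability and make_conservative are hypothetical
names, the per-attempt success probabilities are made-up numbers, and
attempts are assumed to be independent (which a congested channel may
well violate).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RetryStage:
    rate: str         # label only, e.g. "MCS15" or "6M"
    tries: int        # transmit attempts at this rate
    p_success: float  # ASSUMED per-attempt delivery probability


def delivery_probability(chain: List[RetryStage]) -> float:
    """Chance that at least one attempt in the chain gets through,
    assuming every attempt is independent of the others."""
    p_all_fail = 1.0
    for stage in chain:
        p_all_fail *= (1.0 - stage.p_success) ** stage.tries
    return 1.0 - p_all_fail


def make_conservative(chain: List[RetryStage],
                      base: RetryStage) -> List[RetryStage]:
    """Keep one attempt at the first (optimistic) rate and move every
    remaining attempt to the conservative base rate. The total number
    of attempts stays the same."""
    total = sum(stage.tries for stage in chain)
    if total < 2:
        # Too few attempts to split: spend them all at the base rate.
        return [RetryStage(base.rate, total, base.p_success)]
    first = chain[0]
    return [
        RetryStage(first.rate, 1, first.p_success),
        RetryStage(base.rate, total - 1, base.p_success),
    ]


# Hypothetical chain as a rate control algorithm might build it for a
# weak link: two optimistic stages, one "probable" rate in the last slot.
normal = [
    RetryStage("MCS15", 2, 0.2),
    RetryStage("MCS7", 2, 0.4),
    RetryStage("MCS4", 1, 0.6),
]

# Hypothetical chain while the proposed flag is set.
cautious = make_conservative(normal, RetryStage("6M", 1, 0.9))

if __name__ == "__main__":
    print("normal  :", delivery_probability(normal))
    print("cautious:", delivery_probability(cautious))
```

With the same retry budget, the cautious chain trades throughput for a
much better chance of delivering the frame, which is the trade-off we
want for EAPOL and DHCP. How a real driver would map this onto its own
retry table is exactly the part left open in (a)-(c) above.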

--
Paul
>
> - Felix