[wpa_supplicant] essid with non-ascii characters
j.witteveen at gmail.com
Tue Jul 31 05:11:37 EDT 2012
On Mon, Jul 30, 2012 at 6:15 PM, Jouni Malinen <j at w1.fi> wrote:
> On Mon, Jul 30, 2012 at 02:10:51PM +0200, Jouke Witteveen wrote:
>> The network is supposedly named "Wifi Château d'Olonne", thus the
>> experiment shows that wpa_cli substitutes the 'â' with a '_'.
> If it happens to be encoded as a single character in the SSID.. It could
> also end up showing up as "__" if multi-byte encoding was used.
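A small sketch of that difference in Python. The masking rule here (every octet outside printable ASCII becomes '_') is an assumption about what wpa_cli does, not something taken from the code; mask_non_printable is a made-up name.

```python
def mask_non_printable(ssid: bytes) -> str:
    """Replace every octet outside printable ASCII (0x20..0x7e) with '_'.

    Assumed behaviour, used only to illustrate the point about encodings.
    """
    return "".join(chr(b) if 0x20 <= b <= 0x7E else "_" for b in ssid)


name = "Wifi Château d'Olonne"

# 'â' is one octet in Latin-1 and two octets in UTF-8, so the number of
# underscores depends on how the access point encoded the name.
print(mask_non_printable(name.encode("latin-1")))
print(mask_non_printable(name.encode("utf-8")))
```

So the single '_' in the experiment would suggest a single-octet encoding on that access point.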
>> If it
>> would just output whatever bytes are in the SSID, the result of the
>> printf would be usable in shell scripts to connect to the network.
> SSID is not a string and it cannot be printed as such. It could even
> include things like '\0' in it. If you want to get the raw SSID binary
> data, you can get it from the beginning of the "ie" line in the BSS
> ctrl_iface command output as a hexdump (starting with two octet header
> of IE id/len).
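For reference, a minimal sketch of that extraction in Python. It assumes the hex string after "ie=" has already been cut out of the BSS output and that the SSID element (element ID 0) comes first, as described above; ssid_from_ie_hex is a hypothetical helper, not part of wpa_supplicant.

```python
def ssid_from_ie_hex(ie_hex: str) -> bytes:
    """Return the raw SSID octets from a hexdump of information elements.

    Assumes the dump starts with the SSID element: one octet element ID (0),
    one octet length, then that many octets of SSID data.
    """
    ies = bytes.fromhex(ie_hex)
    if len(ies) < 2:
        raise ValueError("IE data is shorter than the two-octet header")
    element_id, length = ies[0], ies[1]
    if element_id != 0:
        raise ValueError("first element is not an SSID element")
    if length > 32:
        raise ValueError("SSID length exceeds 32 octets")
    if len(ies) < 2 + length:
        raise ValueError("IE data is truncated")
    return ies[2:2 + length]


if __name__ == "__main__":
    # Build a sample dump: SSID element followed by an unrelated element.
    ssid = "Wifi Château d'Olonne".encode("utf-8")
    sample = (bytes([0, len(ssid)]) + ssid + bytes([1, 1, 0x82])).hex()
    print(ssid_from_ie_hex(sample))
```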
This would be quite cumbersome, and it means that the ssid=... part of
the bss output cannot be used as the ssid=... part of a config
file. It would be convenient if the SSID reported by scan_results could
be copied to the config file in many cases. I don't really care about
SSIDs containing '\0': network maintainers who choose to have such
SSIDs deserve to face problems.
>> The only problems I see with outputting the SSID as-is are with '\n'
>> and '\t'. Both mess up the output of `wpa_cli scan_results`. One way
>> to solve this problem is to have ' ' match all three of them (spaces,
>> newlines and tabs), another is by introducing escaping.
> The proper way of handling the SSID is to copy the exact binary data
> as-is rather than try to pretend that it can be handled as text. As such, the
> scan_results output is not suitable for this purpose.
Using printf as in the experiment makes it possible to use
extraordinary text values:
printf "%q\n" "â"
I believe it works for non-printable characters too, so outputting
whatever octets make up the SSID (perhaps except for '\n', '\t', '\0')
makes sense to me.
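A rough sketch of the escaping variant in Python, to show it can be made reversible. The escape syntax (backslash followed by 0, t, n or another backslash) and the function names are my own assumptions, not an existing wpa_supplicant format; every other octet is passed through untouched.

```python
# Octets that would break the tab-separated, line-based scan_results output,
# plus the backslash itself so that the escaping stays reversible.
_ESCAPES = {0x00: b"\\0", 0x09: b"\\t", 0x0A: b"\\n", 0x5C: b"\\\\"}
_UNESCAPES = {ord("0"): 0x00, ord("t"): 0x09, ord("n"): 0x0A, ord("\\"): 0x5C}


def escape_ssid(ssid: bytes) -> bytes:
    """Escape NUL, tab, newline and backslash; copy all other octets as-is."""
    return b"".join(_ESCAPES.get(b, bytes([b])) for b in ssid)


def unescape_ssid(field: bytes) -> bytes:
    """Invert escape_ssid, recovering the original SSID octets."""
    out = bytearray()
    octets = iter(field)
    for b in octets:
        if b != 0x5C:
            out.append(b)
            continue
        try:
            code = next(octets)  # the octet following the backslash
        except StopIteration:
            raise ValueError("dangling backslash at end of field") from None
        if code not in _UNESCAPES:
            raise ValueError("unknown escape sequence")
        out.append(_UNESCAPES[code])
    return bytes(out)


if __name__ == "__main__":
    raw = "Wifi Château\td'Olonne".encode("utf-8")
    escaped = escape_ssid(raw)
    print(escaped)
    print(unescape_ssid(escaped) == raw)
```

With something like this, the common case (no special octets) stays copy-and-paste friendly, and the odd cases remain recoverable.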
I couldn't find the location in the code where octets become '_'.