Colibri iMX6ULL WiFi Access Point Issues

Summary

When hostapd is restarted, the uap0 interface MAC changes to the same MAC as mlan0, which renders WiFi unusable. See this post too.

Details

This is 100% reproducible by doing the following:

  1. Install the latest BSP (6.4 upstream) Reference Minimal Image on a Colibri iMX6ULL 512MB Wi-Fi / BT installed in an Iris carrier board.
  2. Based on the instructions here, start the Toradex-supplied example hostapd:
systemctl start hostapd-example
  3. Now restart the Toradex-supplied example hostapd:
systemctl restart hostapd-example
  4. Check the MAC addresses of the wireless interfaces: they are now identical, which leads to unusable WiFi.

See console log below:

root@colibri-imx6ull-14938994:~# ip -c a  | grep -A1 -e mlan0 -e uap0  
5: mlan0: <NO-CARRIER,BROADCAST,MULTICAST,DYNAMIC,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether e8:fb:1c:80:2e:af brd ff:ff:ff:ff:ff:ff
6: uap0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether e8:fb:1c:80:2c:af brd ff:ff:ff:ff:ff:ff

root@colibri-imx6ull-14938994:~# systemctl start hostapd-example
[  256.719760] IPv6: ADDRCONF(NETDEV_CHANGE): uap0: link becomes ready

root@colibri-imx6ull-14938994:~# ip -c a  | grep -A1 -e mlan0 -e uap0  
5: mlan0: <NO-CARRIER,BROADCAST,MULTICAST,DYNAMIC,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether e8:fb:1c:80:2e:af brd ff:ff:ff:ff:ff:ff
6: uap0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether e8:fb:1c:80:2c:af brd ff:ff:ff:ff:ff:ff
    inet 192.168.8.1/24 brd 192.168.8.255 scope global uap0
       valid_lft forever preferred_lft forever

root@colibri-imx6ull-14938994:~# systemctl stop hostapd-example

root@colibri-imx6ull-14938994:~# ip -c a  | grep -A1 -e mlan0 -e uap0  
5: mlan0: <NO-CARRIER,BROADCAST,MULTICAST,DYNAMIC,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether e8:fb:1c:80:2e:af brd ff:ff:ff:ff:ff:ff
6: uap0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether e8:fb:1c:80:2c:af brd ff:ff:ff:ff:ff:ff

root@colibri-imx6ull-14938994:~# systemctl start hostapd-example

root@colibri-imx6ull-14938994:~# ip -c a  | grep -A1 -e mlan0 -e uap0  
5: mlan0: <NO-CARRIER,BROADCAST,MULTICAST,DYNAMIC,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether e8:fb:1c:80:2e:af brd ff:ff:ff:ff:ff:ff
6: uap0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether e8:fb:1c:80:2e:af brd ff:ff:ff:ff:ff:ff permaddr e8:fb:1c:80:2c:af
    inet 192.168.8.1/24 brd 192.168.8.255 scope global uap0
       valid_lft forever preferred_lft forever

Hardware/Software Details:

HW Module: Colibri iMX6ULL 512MB Wi-Fi / Bluetooth IT (0040)
Carrier board: Iris V1.1B
SW: Reference Minimal Image, BSP 6.4 (UPSTREAM) installed via Toradex Easy Installer

Output of tdx-info:

root@colibri-imx6ull-14938994:~# tdx-info 

Software summary
------------------------------------------------------------
Bootloader:               U-Boot
Kernel version:           6.1.55-6.4.0+git.d23900f974e0 #1 SMP Sat Sep 23 09:11:13 UTC 2023
Kernel command line:      user_debug=30 ubi.mtd=ubi root=ubi0:rootfs rw rootfstype=ubifs ubi.fm_autoconvert=1 console=tty1 console=ttymxc0
Distro name:              NAME="TDX Wayland with XWayland Upstream"
Distro version:           VERSION_ID=6.4.0-build.8
Hostname:                 colibri-imx6ull-14938994
------------------------------------------------------------

Hardware info
------------------------------------------------------------
HW model:                 Toradex Colibri iMX6ULL 512MB on Colibri Evaluation Board V3
Toradex version:          0040 V1.1A
Serial number:            14938994
Processor arch:           armv7l
------------------------------------------------------------

Hi @The_Gman , sorry for the late response. I am looking into this issue and will get back to you soon.

Hi @The_Gman , I can confirm this issue; it can be reproduced on other SoMs as well. On a systemctl restart hostapd, hostapd.service did try to apply the correct MAC address (d0:c5:d3:34:25:17) to uap0.

root@colibri-imx6ull-06399358:~# systemctl status hostapd
● hostapd.service - Hostapd IEEE 802.11 AP, IEEE 802.1X/WPA/WPA2/EAP/RADIUS Authenticator
     Loaded: loaded (/lib/systemd/system/hostapd.service; enabled; vendor preset: disabled)
     Active: active (running) since Fri 2023-11-17 02:19:27 UTC; 6min ago
    Process: 501 ExecStart=/usr/sbin/hostapd /etc/hostapd.conf -P /run/hostapd.pid -B (code=exited, status=0/SUCCESS)
   Main PID: 502 (hostapd)
      Tasks: 1 (limit: 1018)
     Memory: 464.0K
     CGroup: /system.slice/hostapd.service
             └─502 /usr/sbin/hostapd /etc/hostapd.conf -P /run/hostapd.pid -B

Nov 17 02:19:27 colibri-imx6ull-06399358 systemd[1]: Starting Hostapd IEEE 802.11 AP, IEEE 802.1X/WPA/WPA2/EAP/RADIUS Authenticator...
Nov 17 02:19:27 colibri-imx6ull-06399358 hostapd[501]: Configuration file: /etc/hostapd.conf
Nov 17 02:19:27 colibri-imx6ull-06399358 hostapd[501]: Using interface uap0 with hwaddr d0:c5:d3:34:25:17 and ssid "my_ap"
Nov 17 02:19:27 colibri-imx6ull-06399358 hostapd[501]: random: Only 18/20 bytes of strong random data available
Nov 17 02:19:27 colibri-imx6ull-06399358 hostapd[501]: random: Not enough entropy pool available for secure operations
Nov 17 02:19:27 colibri-imx6ull-06399358 hostapd[501]: WPA: Not enough entropy in random pool for secure operations - update keys later when the first station connects
Nov 17 02:19:27 colibri-imx6ull-06399358 hostapd[501]: uap0: interface state UNINITIALIZED->ENABLED
Nov 17 02:19:27 colibri-imx6ull-06399358 hostapd[501]: uap0: AP-ENABLED
Nov 17 02:19:27 colibri-imx6ull-06399358 systemd[1]: hostapd.service: Can't open PID file /run/hostapd.pid (yet?) after start: Operation not permitted
Nov 17 02:19:27 colibri-imx6ull-06399358 systemd[1]: Started Hostapd IEEE 802.11 AP, IEEE 802.1X/WPA/WPA2/EAP/RADIUS Authenticator.

However, systemd-networkd, on the other hand, set a wrong MAC (d0:c5:d3:34:27:17) on uap0. This leaves mlan0 and uap0 with the same MAC address. We are actively working on this issue and will report back ASAP.

root@colibri-imx6ull-06399358:~# ip a 

4: mlan0: <NO-CARRIER,BROADCAST,MULTICAST,DYNAMIC,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether d0:c5:d3:34:27:17 brd ff:ff:ff:ff:ff:ff
5: uap0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether d0:c5:d3:34:27:17 brd ff:ff:ff:ff:ff:ff
    inet 192.168.8.1/24 brd 192.168.8.255 scope global uap0
       valid_lft forever preferred_lft forever
    inet6 fe80::d2c5:d3ff:fe34:2717/64 scope link 
       valid_lft forever preferred_lft forever
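
As a side note on the addresses themselves, the small Python sketch below compares the expected uap0 and mlan0 addresses seen on both modules in this thread. The helper name `differing_octets` is made up for illustration, and the result is only an observation from these logs, not a statement about how the driver derives the addresses.

```python
def mac_to_bytes(mac):
    """Convert "aa:bb:cc:dd:ee:ff" into a bytes object."""
    return bytes(int(octet, 16) for octet in mac.split(":"))


def differing_octets(mac_a, mac_b):
    """Return [(octet_index, xor_value)] for every octet that differs."""
    a, b = mac_to_bytes(mac_a), mac_to_bytes(mac_b)
    return [(i, x ^ y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Expected uap0 vs. mlan0 addresses from the two modules in this thread.
print(differing_octets("d0:c5:d3:34:25:17", "d0:c5:d3:34:27:17"))  # [(4, 2)]
print(differing_octets("e8:fb:1c:80:2c:af", "e8:fb:1c:80:2e:af"))  # [(4, 2)]
```

In both cases the two addresses differ only by the value 0x02 in the fifth octet, so once uap0 takes on the mlan0 value the two interfaces become indistinguishable by MAC.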

Thanks for your response @benjamin.tx

When you say:

hostapd.service did try to apply a correct MAC address(d0:c5:d3:34:25:17) on uap0

Are you referring to the following line in the hostapd journal?

Using interface uap0 with hwaddr d0:c5:d3:34:25:17 and ssid “my_ap”

I may be wrong, but I do not believe that hostapd is trying to explicitly set the MAC. I think this log line means it has found the configured uap0 interface, and that (at this point) the interface still has the correct MAC. In other words, I think the MAC address is being altered at some point after this log line, probably by some lower-level network tool - my suspicion is somewhere around the following journal entry:

Nov 17 02:19:27 colibri-imx6ull-06399358 hostapd[501]: uap0: interface state UNINITIALIZED->ENABLED

I’m unsure how systemd-networkd interacts with hostapd.
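
One way to narrow down when the address changes would be to poll the interface while restarting hostapd and compare the timestamps against the journal. Below is a minimal Python sketch of that idea. It is only an assumption about how one might debug this, not a Toradex tool: `watch_mac` and its parameters are hypothetical names, and it relies on the standard sysfs file /sys/class/net/<iface>/address.

```python
import time
from pathlib import Path


def read_mac(iface):
    """Read the current MAC of an interface from sysfs."""
    return Path(f"/sys/class/net/{iface}/address").read_text().strip()


def watch_mac(iface, reader=read_mac, interval=0.05, samples=200,
              sleep=time.sleep, clock=time.time):
    """Poll the MAC and return [(timestamp, old_mac, new_mac)] for each change."""
    changes = []
    last = reader(iface)
    for _ in range(samples):
        sleep(interval)
        current = reader(iface)
        if current != last:
            # Wall-clock time, so it can be compared with journal timestamps.
            changes.append((clock(), last, current))
            last = current
    return changes
```

Running `watch_mac("uap0")` in one shell while restarting hostapd-example in another should give a rough timestamp for the change. The polling interval limits the resolution, so this only brackets the moment rather than pinpointing which component made the change.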

Hi @The_Gman , thanks for your thoughts. You mentioned that the duplicated MAC address leads to unusable WiFi. Could you please elaborate on how it behaves on your device?

Hi @benjamin.tx,

You mentioned the duplicated MAC address leads to unusable WIFI, could you please elaborate on how it behaves on your device?

It’s a little hard to give you specific symptoms because they are not always 100% repeatable, however I will describe our application use case and give you some examples of instability below…

Some background about our application: we require the colibri-imx6ull to act as a WiFi station (i.e. connect to a WiFi network) and as an AP (allow others to connect to it). We do not use internet sharing at this point, but rather restrict the AP network to a small private network for clients to access the host. Hostapd and connman may need to be restarted from time to time, e.g. to accommodate configuration changes.

Some symptoms/issues which you might be able to reproduce:

  1. Very often (but not always), I am unable to connect to the AP. I’m 100% sure I have entered the password correctly. If I restart hostapd (systemctl restart hostapd), then I am usually able to connect. This is why I initially started restarting hostapd, before noticing the duplicate MAC address issue. Have you ever had issues connecting to hostapd?

  2. Our custom applications, and connman too, use the MAC address of the interface as part of the unique hash/identifier of the service (wifi_{MAC}_{SSID_HEX}_managed_psk), so having duplicates is not viable; e.g. some sort of problem is almost always triggered when restarting connman (systemctl restart connman) while there are duplicate MACs. Unfortunately the symptoms are not always the same, but below you can see a kernel NULL pointer exception.

  3. Another time after restarting connman a connmanctl scan wifi command returns immediately without error but connmanctl services command shows an empty list. The only way to recover is to reboot. One time the device did not even reboot - it just hung.
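
To make item 2 more concrete, here is a minimal Python sketch of how an identifier of the form wifi_{MAC}_{SSID_HEX}_managed_psk can be built. The function name `connman_service_id` is hypothetical and this is not connman's own code. The SSID "IMS-Dev" used in the example is simply what the hex string in the connman error message further down in this thread decodes to.

```python
def connman_service_id(mac, ssid, security="psk"):
    """Build an identifier of the form wifi_{MAC}_{SSID_HEX}_managed_{security}."""
    mac_part = mac.replace(":", "").lower()   # e8:fb:... -> e8fb...
    ssid_part = ssid.encode("utf-8").hex()    # SSID bytes as lowercase hex
    return f"wifi_{mac_part}_{ssid_part}_managed_{security}"


print(connman_service_id("e8:fb:1c:80:2e:af", "IMS-Dev"))
# wifi_e8fb1c802eaf_494d532d446576_managed_psk
```

Since the interface MAC is the only interface-specific part of the identifier, two interfaces reporting the same MAC would map the same SSID to the same identifier, which is why duplicates are a problem for anything keyed this way.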

Console output for case 2 where kernel had NULL pointer exception:

root@colibri-imx6ull-14938994:~# systemctl restart connman
[  700.287213] mwifiex_sdio mmc1:0001:1: info: successfully disconnected from a4:2b:b0:d6:16:18: reason code 3
root@colibri-imx6ull-14938994:~# [  700.886935] Micrel KSZ8041 20b4000.ethernet-1:02: attached PHY driver (mii_bus:phy_addr=20b4000.ethernet-1:)

root@colibri-imx6ull-14938994:~#
root@colibri-imx6ull-14938994:~# [  702.195590] mwifiex_sdio mmc1:0001:1: info: trying to associate to bssid a4:2b:b0:d6:16:18
[  702.228477] 8<--- cut here ---
[  702.231698] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[  702.240180] [00000000] *pgd=00000000
[  702.243932] Internal error: Oops: 5 [#1] SMP ARM
[  702.252300] Modules linked in: bnep mcp251x mwifiex_sdio mwifiex cfg80211 btmrvl_sdio btmrvl imx_sdma fuse
[  702.269396] CPU: 0 PID: 524 Comm: kworker/0:2 Not tainted 6.1.55-6.4.0+git.d23900f974e0 #1
[  702.285178] Hardware name: Freescale i.MX6 Ultralite (Device Tree)
[  702.295178] Workqueue: events sdio_irq_work
[  702.303122] PC is at mmiocpy+0x4c/0x334
[  702.310629] LR is at mwifiex_save_curr_bcn+0x5c/0x158 [mwifiex]
[  702.320497] pc : [<c0b73a6c>]    lr : [<bf0b1264>]    psr: 20030013
[  702.330422] sp : e0ce1e38  ip : 00000000  fp : e0ce1e48
[  702.339307] r10: 00000001  r9 : c2c2c108  r8 : c2fa5500
[  702.348141] r7 : c28aa000  r6 : 00000000  r5 : c2c2d000  r4 : c2c2c000
[  702.358241] r3 : 00000000  r2 : 00000166  r1 : 00000000  r0 : c317b400
[  702.368263] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[  702.378890] Control: 10c5387d  Table: 82af006a  DAC: 00000051
[  702.388082] Register r0 information: slab kmalloc-512 start c317b400 pointer offset 0 size 512
[  702.403505] Register r1 information: NULL pointer
[  702.411573] Register r2 information: non-paged memory
[  702.419902] Register r3 information: NULL pointer
[  702.427776] Register r4 information: non-slab/vmalloc memory
[  702.436547] Register r5 information: non-slab/vmalloc memory
[  702.445194] Register r6 information: NULL pointer
[  702.452782] Register r7 information: slab kmalloc-8k start c28aa000 pointer offset 0 size 8192
[  702.467131] Register r8 information: slab kmalloc-256 start c2fa5500 pointer offset 0 size 256
[  702.481583] Register r9 information: non-slab/vmalloc memory
[  702.490226] Register r10 information: non-paged memory
[  702.498350] Register r11 information: 2-page vmalloc region starting at 0xe0ce0000 allocated at kernel_clone+0x88/0x338
[  702.515031] Register r12 information: NULL pointer
[  702.522710] Process kworker/0:2 (pid: 524, stack limit = 0x824397f4)
[  702.531959] Stack: (0xe0ce1e38 to 0xe0ce2000)
[  702.539147] 1e20:                                                       c2c2d000 00000000
[  702.552897] 1e40: c2fa5500 c2c2c108 c317b400 c2c2c000 00000000 bf0b1264 c2c2c000 c2c2c22c
[  702.566763] 1e60: 00000000 bf0b290c 00000000 00000000 c28aa000 c2c2c000 c2c2c000 c28aa000
[  702.580793] 1e80: c2b39c84 00000012 00008012 c28aaac0 c28ab000 bf0a6780 00000000 00000076
[  702.595070] 1ea0: 000000e7 7fb1dae4 00000000 c28aa000 c28aa174 00000000 c28aaae8 c1203d40
[  702.609466] 1ec0: 00000000 c0f404d8 00000000 bf0a2434 00000001 00000000 c2583000 c2f8e900
[  702.623917] 1ee0: c25833c4 c0f404a8 c0f404d8 c0890c68 00000000 e0ce1efa 00020122 7fb1dae4
[  702.638550] 1f00: e0ce1f54 c24f9aec c24f9800 d3b769c0 d3b79c00 00000000 c2f8e900 d3b79c05
[  702.653554] 1f20: 00000000 c0890f3c c24f9aec c2a09780 d3b769c0 c013e6b0 d3b769c0 d3b769c0
[  702.668959] 1f40: d3b769dc c2a09780 d3b769c0 c2a09798 d3b769dc c1203d40 00000008 c2f8e900
[  702.684553] 1f60: d3b769c0 c013e8f8 00000000 c2c6f440 c2f8e900 c013e8bc c2a09780 c2ce10c0
[  702.700149] 1f80: e0881ebc 00000000 00000000 c01467a8 c2c6f440 c01466e0 00000000 00000000
[  702.715859] 1fa0: 00000000 00000000 00000000 c010016c 00000000 00000000 00000000 00000000
[  702.731736] 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[  702.747733] 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
[  702.763888]  mmiocpy from mwifiex_save_curr_bcn+0x5c/0x158 [mwifiex]
[  702.774639]  mwifiex_save_curr_bcn [mwifiex] from mwifiex_ret_802_11_associate+0x1c4/0x444 [mwifiex]
[  702.792306]  mwifiex_ret_802_11_associate [mwifiex] from mwifiex_process_cmdresp+0x2d0/0x3b8 [mwifiex]
[  702.810144]  mwifiex_process_cmdresp [mwifiex] from mwifiex_main_process+0x418/0xa18 [mwifiex]
[  702.827270]  mwifiex_main_process [mwifiex] from process_sdio_pending_irqs+0xf8/0x1f4
[  702.843447]  process_sdio_pending_irqs from sdio_irq_work+0x3c/0x64
[  702.853844]  sdio_irq_work from process_one_work+0x1dc/0x3e8
[  702.863599]  process_one_work from worker_thread+0x3c/0x538
[  702.873215]  worker_thread from kthread+0xc8/0xe8
[  702.881903]  kthread from ret_from_fork+0x14/0x28
[  702.890509] Exception stack(0xe0ce1fb0 to 0xe0ce1ff8)
[  702.899402] 1fa0:                                     00000000 00000000 00000000 00000000
[  702.915097] 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[  702.930730] 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000
[  702.941134] Code: ba000002 f5d1f03c f5d1f05c f5d1f07c (e8b15378)
[  702.951116] ---[ end trace 0000000000000000 ]---

root@colibri-imx6ull-14938994:~#

Another example of instability and weirdness (this time more repeatable):

  1. Reboot to start clean (after reboot MAC addresses are unique as expected)
  2. Connect to a WiFi network on mlan0 using connmanctl interactive interface as described in this link if this is the first time, otherwise it should autoconnect after reboot.
  3. Start hostapd (systemctl start hostapd-example)
  4. Connect to the AP with my phone - successfully connects.
  5. Restart hostapd (systemctl restart hostapd-example):
    • Triggers Duplicate MAC address bug
    • My phone disconnects
    • Trying to reconnect to AP on phone fails, every time!
    • WiFi connection via mlan0 still works (can ping hosts on network)
  6. Restart connman (systemctl restart connman):
    • Now the MAC addresses of mlan0 and uap0 are different (=GOOD), BUT not the same as after reboot! The mlan0 MAC address is always the same; it’s only ever uap0 that changes
    • Now both interfaces are DOWN and disconnected
  7. Try reconnect to WiFi using connmanctl interactive interface:
    • Connman error “Error /net/connman/service/wifi_e8fb1c802eaf_494d532d446576_managed_psk: In progress”
  8. Restart connman and hostapd a few more times:
    • eventually end up in a loop where mwifiex_sdio associates to WiFi and then immediately disconnects (reason code 2):
    [  423.465876] mwifiex_sdio mmc1:0001:1: info: trying to associate to bssid a4:2b:b0:d6:16:18
    [  423.506899] mwifiex_sdio mmc1:0001:1: info: associated to bssid a4:2b:b0:d6:16:18 successfully
    [  426.508964] mwifiex_sdio mmc1:0001:1: info: successfully disconnected from a4:2b:b0:d6:16:18: reason code 2
    [  433.502513] mwifiex_sdio mmc1:0001:1: info: trying to associate to bssid a4:2b:b0:d6:16:18
    [  433.545709] mwifiex_sdio mmc1:0001:1: info: associated to bssid a4:2b:b0:d6:16:18 successfully
    [  436.544606] mwifiex_sdio mmc1:0001:1: info: successfully disconnected from a4:2b:b0:d6:16:18: reason code 2

I’ve attached the full console output captured while doing the steps above; the WiFi instability should be clear from it.
wifi-issues-console-output-toradex-issue-20916.txt (60.9 KB)
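
To spot the associate/disconnect loop from step 8 in a long capture like the attached one, something along these lines could help. This is a minimal Python sketch of my own: the function name `short_lived_associations` and the 5 second threshold are arbitrary choices, and the patterns are written against the mwifiex_sdio message format shown in this thread only.

```python
import re

# "[  423.506899] ... info: associated to bssid <bssid> successfully"
ASSOC_RE = re.compile(
    r"\[\s*(\d+\.\d+)\].*info: associated to bssid (\S+) successfully"
)
# "[  426.508964] ... info: successfully disconnected from <bssid>: reason code N"
DISC_RE = re.compile(
    r"\[\s*(\d+\.\d+)\].*info: successfully disconnected from "
    r"([0-9a-f:]{17}): reason code (\d+)"
)


def short_lived_associations(log_text, max_seconds=5.0):
    """Return [(bssid, seconds_connected, reason_code)] for quick drops."""
    results, assoc_time = [], {}
    for line in log_text.splitlines():
        match = ASSOC_RE.search(line)
        if match:
            assoc_time[match.group(2)] = float(match.group(1))
            continue
        match = DISC_RE.search(line)
        if match and match.group(2) in assoc_time:
            lifetime = float(match.group(1)) - assoc_time.pop(match.group(2))
            if lifetime <= max_seconds:
                results.append(
                    (match.group(2), round(lifetime, 3), int(match.group(3)))
                )
    return results
```

On the six kernel lines quoted in step 8 this reports two associations to a4:2b:b0:d6:16:18, each lasting about three seconds and each ending with reason code 2.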

Retrying the procedure from the previous post, I got a kernel NULL pointer exception after step 6 (restart connman).

Hi @The_Gman , in addition to the image with the NXP proprietary wireless driver, there is also a patch that fixes this issue for the upstream kernel, if you would prefer the open source wireless driver.

Excellent news.

I will hopefully have some dedicated time to test this next week.

Will Toradex be backporting this to its LTS BSPs?


Yes, this patch will be merged into the next release.

Hi @benjamin.tx

Have you tested this patch and confirmed that it solves the original issue? I tried applying the patch to my setup and it did not make a difference.

Hi @The_Gman , please test with our latest nightly image Linux BSP v6.5.0 which has the patch integrated.
Here is the log from my test with Linux BSP 6.5.0-devel-20231229-build.483.

root@colibri-imx6ull-06399358:~# ip -c a s uap0 | grep ether
    link/ether d0:c5:d3:34:25:17 brd ff:ff:ff:ff:ff:ff
root@colibri-imx6ull-06399358:~# ip -c a s mlan0 | grep ether
    link/ether d0:c5:d3:34:27:17 brd ff:ff:ff:ff:ff:ff
root@colibri-imx6ull-06399358:~# 
root@colibri-imx6ull-06399358:~# 
root@colibri-imx6ull-06399358:~# systemctl restart hostapd-example
root@colibri-imx6ull-06399358:~# 
root@colibri-imx6ull-06399358:~# 
root@colibri-imx6ull-06399358:~# ip -c a s mlan0 | grep ether
    link/ether d0:c5:d3:34:27:17 brd ff:ff:ff:ff:ff:ff
root@colibri-imx6ull-06399358:~# ip -c a s uap0 | grep ether
    link/ether d0:c5:d3:34:25:17 brd ff:ff:ff:ff:ff:ff
root@colibri-imx6ull-06399358:~# 
root@colibri-imx6ull-06399358:~# 
root@colibri-imx6ull-06399358:~# systemctl restart hostapd-example
root@colibri-imx6ull-06399358:~# ip -c a s mlan0 | grep ether
    link/ether d0:c5:d3:34:27:17 brd ff:ff:ff:ff:ff:ff
root@colibri-imx6ull-06399358:~# ip -c a s uap0 | grep ether
    link/ether d0:c5:d3:34:25:17 brd ff:ff:ff:ff:ff:ff
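
The manual check in the log above could also be repeated automatically. The following Python sketch is hypothetical (not part of the BSP): `check_restarts`, its parameters and the two second settle time are assumptions for illustration. It restarts the example unit a number of times and records every round in which the MACs changed from their initial values or collided.

```python
import subprocess
import time
from pathlib import Path


def read_mac(iface):
    """Read the current MAC of an interface from sysfs."""
    return Path(f"/sys/class/net/{iface}/address").read_text().strip()


def restart_unit(unit="hostapd-example"):
    """Restart the systemd unit used throughout this thread."""
    subprocess.run(["systemctl", "restart", unit], check=True)


def check_restarts(rounds=5, ifaces=("mlan0", "uap0"), restart=restart_unit,
                   reader=read_mac, settle=2.0, sleep=time.sleep):
    """Return [(round, macs)] for rounds where MACs changed or collided."""
    baseline = {iface: reader(iface) for iface in ifaces}
    failures = []
    for n in range(1, rounds + 1):
        restart()
        sleep(settle)  # assumed delay to let the AP come back up
        current = {iface: reader(iface) for iface in ifaces}
        collided = len(set(current.values())) < len(ifaces)
        if current != baseline or collided:
            failures.append((n, current))
    return failures
```

An empty list from `check_restarts()` after a clean reboot would match the behaviour shown in the log above. It assumes the MACs are correct when the function starts, since the first reading is taken as the baseline.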