LIFX (and other IOT) connection issues

I am having issues with LIFX wifi bulbs and Cambium e400 WAPs.

To set up the bulbs, you need to use the LIFX mobile app. Setup takes multiple attempts, but I can usually get it to work eventually. Once set up, the bulbs usually work well for a while. But periodically they refuse to connect to the network, and I have extensive issues trying to get them to reconnect. There might be some kind of cache issue, since changing the wifi SSID to something new usually seems to help.

I've reviewed the error logs from the WAPs. It looks like the client repeatedly tries to connect, something happens, the connection fails, and the cycle repeats. The following seems to be a relevant extract (the MAC address is for a LIFX bulb, the VLAN should be 24, and the SSID is skyfiIOT__, with suffixes added only to change the SSID). I've confirmed that my router + WAP setup does work, just not reliably.

Jun 21 13:45:35: wifid : Client[D0-73-D5-12-7E-12] on ssid skyfiIOTau internal cache based vlan is 0 (apd.c:127)
Jun 21 13:45:35: wifid : Client[D0-73-D5-12-7E-12] on ssid skyfiIOTau assigned WLAN vlan 24 (apd.c:141)
Jun 21 13:45:35: wifid : client [D0-73-D5-12-7E-12] on ssid skyfiIOTau vlan 24 state sync (type:1) sent, len[128] (cache.c:1767)
2020-06-21 13:45:35 573 wifi.c:1307:set_log_level: syslog severity=7
2020-06-21 13:45:35 573 wifi.c:1949:event_rx_cb: Trap Data received, send to cnMaestro len=230 data [{"msgType": 699, "eId": "WIFI_CLIENT_CONNECTED",Jun 21 13:45:35: scmd : Jun 21 13:45:35 WIFI-6-CLIENT-CONNECTED Client [D0-73-D5-12-7E-12] connected to wireless lan [skyfiIOTau] (mai
Jun 21 13:45:35: snmpd : system trap has been sent (snmpd.c:1382)
Jun 21 13:45:35: wifid : client[D0-73-D5-12-7E-12] on vap_id[4] wlan_id[4] created in coplane (stats.c:227)
Jun 21 13:45:35: wifid : client[D0-73-D5-12-7E-12] ga[0]:ci[0] sta ssid[4][skyfiIOTau]:cache ssid[skyfiIOTau] (cache.c:2550)
Jun 21 13:45:35: wifid : 2020-06-21 13:45:47 573 log.c:207:start_cns_logging: Send log history (10 lines)
Jun 21 13:45:35: wifid : Client[D0-73-D5-12-7E-12] on ssid skyfiIOTau internal cache based vlan is 0 (apd.c:127)
Jun 21 13:45:35: wifid : radio_idx=0, bss_idx=4, APD_NUM_RADIOS=2, APD_NUM_BSS_PER_RADIO=16 (apd.c:539)
Jun 21 13:45:35: wifid : client[D0-73-D5-12-7E-12] hostapd vlan=24 and wlan vlan=24 (cache.c:1678)
Jun 21 13:45:35: wifid : client [D0-73-D5-12-7E-12] on ssid skyfiIOTau vlan 24 state sync (type:1) sent, len[128] (cache.c:1767)
Jun 21 13:45:35: wifid : client[D0-73-D5-12-7E-12] on vap_id[4] wlan_id[4] created in coplane (stats.c:227)
Jun 21 13:45:35: wifid : client[D0-73-D5-12-7E-12] ga[0]:ci[0] sta ssid[4][skyfiIOTau]:cache ssid[skyfiIOTau] (cache.c:2550)
Jun 21 13:45:35: wifid : client[D0-73-D5-12-7E-12] no hotspot session found (hotspot.c:977)
Jun 21 13:45:35: wifid : apd: Client[d0:73:d5:12:7e:12] sent 0 PMKID in (Re)association request (log.c:51)
Jun 21 13:45:35: wifid : apd: get vlan: 24 (log.c:51)
Jun 21 13:45:35: wifid : Client D0-73-D5-12-7E-12 moved to data ready state (stats.c:581)
2020-06-21 13:45:47 573 log.c:207:start_cns_logging: Send log history (10 lines)
Jun 21 13:45:35: wifid : client D0-73-D5-12-7E-12 ci_sess current state[NOT_IN_USE] moved to new state[IN_USE] (stats.c:505)
Jun 21 13:45:35: wifid : client D0-73-D5-12-7E-12 ci_sess current state[IN_USE] moved to new state[AUTHENTICATED] (stats.c:505)
Jun 21 13:45:35: wifid : radio_idx=0, bss_idx=4, APD_NUM_RADIOS=2, APD_NUM_BSS_PER_RADIO=16 (apd.c:539)
Jun 21 13:45:35: wifid : client[D0-73-D5-12-7E-12] hostapd vlan=24 and wlan vlan=24 (cache.c:1678)
Jun 21 13:45:35: wifid : radio_idx=0, bss_idx=4, APD_NUM_RADIOS=2, APD_NUM_BSS_PER_RADIO=16 (apd.c:539)
Jun 21 13:45:35: wifid : client [D0-73-D5-12-7E-12] on ssid (null) vlan 24 state sync (type:6) sent, len[672] (cache.c:1767)
2020-06-21 13:45:35 573 wifi.c:1307:set_log_level: syslog severity=7
2020-06-21 13:45:35 573 wifi.c:1949:event_rx_cb: Trap Data received, send to cnMaestro len=495 data [{"msgType": 699, "eId": "WIFI_CLIENT_DISCONNECTEJun 21 13:45:35: scmd : Jun 21 13:45:35 WIFI-6-CLIENT-DISCONNECTED Client [D0-73-D5-12-7E-12] disconnected from WLAN [sk2020-06-21 13:45:47 573 log.c:207:start_cns_logging: Send log history (10 lines)
Jun 21 13:45:35: wifid : client[D0-73-D5-12-7E-12] hotspot session updated ga_allowed[0] to coplane (hotspot.c:401)
Jun 21 13:45:35: wifid : radio_idx=0, bss_idx=4, APD_NUM_RADIOS=2, APD_NUM_BSS_PER_RADIO=16 (apd.c:539)
Jun 21 13:45:35: wifid : client[D0-73-D5-12-7E-12] hostapd vlan=24 and wlan vlan=24 (cache.c:1678)
Jun 21 13:45:35: wifid : client [D0-73-D5-12-7E-12] on ssid (null) vlan 24 state sync (type:6) sent, len[672] (cache.c:1767)
Jun 21 13:45:35: wifid : client[D0-73-D5-12-7E-12] on vap_id[4] wlan_id[4] deleted in coplane (stats.c:227)
Jun 21 13:45:35: wifid : Disconnect Detail: mac=D0-73-D5-12-7E-12 rssi=47 reason=client-sent-deauth-with-frame-8-rssi--64 (log.c:190)
Jun 21 13:45:35: wifid : station D0-73-D5-12-7E-12 disconnected, reason code=8:Station Left BSS (stats.c:2199)
Jun 21 13:45:35: snmpd : system trap has been sent (snmpd.c:1382)
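
For anyone sifting through a longer capture of the same kind, here is a minimal sketch that pulls the disconnect lines out of the log and counts them per client and reason. It assumes the two line formats seen in the extract above ("Disconnect Detail: ..." and "station ... disconnected, reason code=..."); the function name `summarize_disconnects` is made up for illustration, and other firmware versions may format these lines differently.

```python
import re
from collections import Counter

# Patterns are based only on the two disconnect lines in the extract above;
# they are an assumption about the format, not a documented log schema.
DETAIL_RE = re.compile(
    r"Disconnect Detail: mac=(?P<mac>[0-9A-Fa-f-]{17}) "
    r"rssi=(?P<rssi>-?\d+) reason=(?P<reason>\S+)"
)
REASON_RE = re.compile(
    r"station (?P<mac>[0-9A-Fa-f-]{17}) disconnected, "
    r"reason code=(?P<code>\d+):(?P<text>.*?) \("
)


def summarize_disconnects(lines):
    """Return (counts per (mac, code, text), list of (mac, rssi, reason))."""
    counts = Counter()
    details = []
    for line in lines:
        m = REASON_RE.search(line)
        if m:
            counts[(m["mac"], int(m["code"]), m["text"])] += 1
        d = DETAIL_RE.search(line)
        if d:
            details.append((d["mac"], int(d["rssi"]), d["reason"]))
    return counts, details


# Two lines copied from the extract above.
sample = [
    "Jun 21 13:45:35: wifid : Disconnect Detail: mac=D0-73-D5-12-7E-12 "
    "rssi=47 reason=client-sent-deauth-with-frame-8-rssi--64 (log.c:190)",
    "Jun 21 13:45:35: wifid : station D0-73-D5-12-7E-12 disconnected, "
    "reason code=8:Station Left BSS (stats.c:2199)",
]

counts, details = summarize_disconnects(sample)
for (mac, code, text), n in counts.items():
    print(f"{mac}: reason {code} ({text}) x{n}")
```

To run it against a saved log, pass the open file instead of `sample`, e.g. `summarize_disconnects(open("ap.log"))` (the file name is a placeholder).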

Googling the "reason code 8" error suggests that the issue might be redirection to another AP (I have 3). However, I have tried separate IOT SSIDs for each of my WAPs (skyFiIOTa, b, c etc.). That does seem to help with connecting, but doesn't improve long-term stability.
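
As a reference point for reading the reason code, here is a small lookup of the first few IEEE 802.11 deauthentication/disassociation reason codes. The descriptions are paraphrased from the standard rather than taken from Cambium documentation, and the helper name `describe_reason` is hypothetical. In the standard, code 8 means the station itself reported that it is leaving the BSS, which fits the "client-sent-deauth" wording in the log, though it does not say why the bulb chose to leave.

```python
# First few IEEE 802.11 reason codes, paraphrased from the standard.
# This is a partial table for illustration, not a complete list.
REASON_CODES = {
    1: "Unspecified reason",
    2: "Previous authentication no longer valid",
    3: "Deauthenticated because sending station is leaving the IBSS or ESS",
    4: "Disassociated due to inactivity",
    5: "Disassociated because AP cannot handle all associated stations",
    6: "Class 2 frame received from nonauthenticated station",
    7: "Class 3 frame received from nonassociated station",
    8: "Disassociated because sending station is leaving the BSS",
}


def describe_reason(code):
    """Return a human-readable description for an 802.11 reason code."""
    return REASON_CODES.get(code, f"Reason code {code} (not in this partial table)")


print(describe_reason(8))
```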

The LIFX bulbs connect fine to consumer/ISP supplied routers so I'm reasonably confident the issue is with my config somewhere.

I've also had some issues with other IOT-type devices (based on the ESP8266 platform, I suspect), but I've gotten rid of those.

Do you have band steering enabled on this WLAN? Try disabling it. LIFX bulbs don't seem to play well with that feature.

Thanks for the response. I already have band steering disabled and the radio set to 2.4GHz only for the IOT WLAN, and the problem is continuing.

Try disabling "Respond to ARP requests automatically on behalf of clients"

Hi,

In the SSID, can you disable "Unicast DHCP" (Convert DHCP-OFFER and DHCP-ACK to unicast before forwarding to clients) in the WLAN advanced settings and monitor the situation?

Can you also share the AP firmware version?

Thanks all for the help.

I _seem_ to have got everything working and stable (touch wood) but unfortunately not deliberately. I ended up having to replace my router and reconfigured all my wireless networks from scratch at the same time. That turned out to be a useful exercise, since it seems to have resolved the LIFX issue too. But it involved several changes all at once, so it was not exactly a scientific method of resolution, and I appreciate that is super unhelpful for anyone else who has a similar issue and comes across this thread.

Just from comparing old configs and new configs, I think the most likely resolution was one or both of the following in the radio config:

- Turning up minimum unicast rate to 5.5 Mbps (I don't know why I had it lower to start with. Previous experimenting? Default config?)

- Turning off enhanced roaming

But I don't really know what I'm doing, so it may well have been something else. I base this guess on the "reason code 8" error suggesting that the issue might be redirection to another AP. I did try separate IOT SSIDs for each of my WAPs at one stage and that didn't help.

From memory, I had previously tried toggling both of:

- Respond to ARP requests automatically on behalf of clients

- Unicast DHCP

But still had issues. Currently both are on and everything is working fine.

Other changes in the network rebuild include:

- Change EdgeRouter Lite to Cambium R201 - could well have been something odd in my ERL config.

- Getting rid of VLANs - I used these primarily for content control (adults, kids, IOT, guest) but am testing the RouterLimits option in the R201 (impressed so far). Could well have been something odd in my VLAN config, though that seems less likely.

- Generally following the recommended base config unless I actually knew why I would want to change something.

Thanks again for the help. 

Good to hear. With several config items changed at once, it is difficult to pinpoint which change solved it.

What is the average SNR shown in the AP dashboard for these clients?

Looks like the average SNR is around 20 dB! I've just been through the list and most of the bulbs don't seem to be connected to the closest WAP. They must be pretty sticky. The one bulb that is connected to the "right" WAP is more like 40 dB.