We have discovered a bug with NAT on the CPE. When the customer is behind NAT, their radio responds normally the entire time; however, from the customer's perspective, connections slow down, then stop altogether, and return to normal about 1 minute later. We see this repeat 5-10 times per day. Removing NAT and running the CPE in bridge mode resolves it.
We are using 2 VLANs, and the CPE has 2 IPs in our configuration: one for the NAT interface (public IP) and one for the management interface (internal IP). When the downtime happens, the private IP is still reachable, but the customer is unable to pass traffic. Again, this is happening on the 450 and 450b clients that we have tested, across ALL APs in our network. The simple fix is to run a bridge config, but I figured I would report the bug so the engineers could review it.
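To help pin down the "management IP reachable, customer traffic dead" windows, here is a rough Python sketch for picking them out of probe results. It assumes you already log, at regular intervals, whether the management IP answered and whether a probe through the customer path succeeded; how those probes are collected is not covered here, and the `Sample` type and `nat_only_outages` function are hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Sample:
    t: float            # seconds since the start of the capture
    mgmt_ok: bool       # management (private) IP answered the probe
    customer_ok: bool   # probe through the customer/NAT path succeeded


def nat_only_outages(samples: List[Sample]) -> List[Tuple[float, float]]:
    """Return (first_bad, last_bad) windows where the management IP
    answered but the customer path did not."""
    windows: List[Tuple[float, float]] = []
    start: Optional[float] = None
    last: Optional[float] = None
    for s in samples:
        bad = s.mgmt_ok and not s.customer_ok
        if bad:
            if start is None:
                start = s.t      # first failing sample of a new window
            last = s.t
        elif start is not None:
            windows.append((start, last))
            start = None
    if start is not None:        # capture ended while still failing
        windows.append((start, last))
    return windows
```

Samples where the management IP is also down are deliberately ignored, since those point at the RF link or the radio itself rather than at NAT.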
We have many thousands of customers on the exact same configuration (NAT with a separate management VLAN/IP) working fine. I'd investigate other possible causes, as I'm sure it's working with many SMs running 16.1 and 16.2.
What does the DHCP log show when your SM stops working in NAT mode? Does the SM have a valid lease?
What is your customer's "Translation Table Size"? We have had some customers whose connections stopped working because of a full NAT table (see the NAT logs), usually caused by misconfigured torrent clients (sigh) or malware/a virus on some device.
I monitored this for a customer reporting the issue, and the table never even came close to the max; this was also my first guess. But the issue is too widespread: it's happening to every single customer, and the tables I monitored never filled when customers reported the issue.
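For anyone wanting to run the same check, a minimal Python sketch of it follows. It assumes you have periodic readings of how many NAT translation entries are in use (collected however you like) plus the configured "Translation Table Size"; the function names and the 90% threshold are assumptions for illustration, not anything taken from the radio's software.

```python
from typing import Iterable, List, Tuple


def table_utilization(
    samples: Iterable[Tuple[float, int]], table_size: int
) -> List[Tuple[float, float]]:
    """Convert (timestamp, entries_in_use) readings into
    (timestamp, fraction_of_table_used)."""
    if table_size <= 0:
        raise ValueError("table_size must be positive")
    return [(t, used / table_size) for t, used in samples]


def near_full(
    samples: Iterable[Tuple[float, int]],
    table_size: int,
    threshold: float = 0.9,   # assumed cutoff for "close to full"
) -> List[float]:
    """Return timestamps where the table was at or above the threshold."""
    return [
        t
        for t, frac in table_utilization(samples, table_size)
        if frac >= threshold
    ]
```

If `near_full` comes back empty for the times customers reported an outage, a full translation table is unlikely to be the cause, which matches what I saw.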