Resolution for 300 SMs Not Passing DNS Requests

We are still seeing 300 SMs not passing DNS requests, even when we have hard-coded DNS servers into the SOHO routers. I do not see anything in the 4.4.3 release notes about this being resolved. The issue happens much less often than before, but it is still an issue.

The SMs are in bridge mode with management and data VLANs. Some have static management addresses, some use DHCP. The SMs can still ping addresses on the internet, but the SOHO router behind the SM cannot.

I just had this happen at my home. My home Tik has a public IP address and DNS servers hard-coded into it (8.8.8.8 and 1.1.1.1). When the internet went down, the SM could ping both 8.8.8.8 and 1.1.1.1, but the Tik could not. The Tik could ping 8.8.4.4. I replaced 8.8.8.8 with 8.8.4.4 and the internet immediately came back up on all devices. It has probably been a month since the last time this happened at my home, at least while I was present.
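A ping only shows that ICMP gets through. To check whether an actual DNS query is answered, something like the following could be run from a PC behind the router. This is only a minimal sketch: the function names `build_query` and `dns_answers` and the test hostname `example.com` are my own assumptions, not anything from Cambium or MikroTik.

```python
import socket
import struct


def build_query(hostname: str, txid: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record (hypothetical helper)."""
    # Header: ID, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # Question name: each label prefixed by its length, ended by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A record), QCLASS=1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)


def dns_answers(server: str, hostname: str = "example.com",
                timeout: float = 2.0, port: int = 53) -> bool:
    """Return True if `server` sends back a DNS response over UDP."""
    query = build_query(hostname)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, port))
            reply, _ = sock.recvfrom(512)
        except OSError:  # covers timeouts and unreachable errors
            return False
    # A valid reply echoes our transaction ID and has the response bit set.
    return len(reply) >= 12 and reply[:2] == query[:2] and bool(reply[2] & 0x80)
```

Calling `dns_answers("8.8.8.8")` and `dns_answers("8.8.4.4")` during an outage would show whether DNS itself is failing to one resolver while still working to the other.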

We have other customers with 300 SMs complaining that, even though we have hard-coded DNS servers for them as well, they are still losing internet at times. Sometimes they wait and it comes back up, sometimes they reboot the router and SM and it comes back up. The SMs are not losing connection and the units are not rebooting on their own (I have checked and confirmed if/when the customer rebooted the SM). It is also not interference. Most of those who complain are reaching MCS 8-9 almost 100% of the time. We have upload locked down to MCS 6-7, and they reach those values 90+% of the time. We have a doctor, and a lady who VPNs to work from home, who are not happy with this (both have signal in the 50s with almost no interference), even though it happens only a few times a month. The rest who have mentioned this to us have not been too concerned; it was more like letting us know. I am not saying all of these are DNS issues, but they do seem to fit what I see at my home.

I witnessed it at my parents' home about a month ago. The internet was working fine for at least an hour while I was there, then web pages started showing a "DNS probe finished, no internet" message. Their router does not have hard-coded DNS servers. In less than 5 minutes the internet was back up and running with no intervention. My parents said it hardly ever happens to them, but they mainly stream, and I have noticed that gaming and streaming are not affected if DNS requests stop passing after the stream or game has started. In fact, an Xbox One was playing Fortnite here at my home today when the internet went down, and it kept going while other devices stopped.

I am sure Cambium is aware of this still happening and working on a fix...are the rest of you seeing this, and do any of you have a fix until there is an official release?

I have forwarded this to our product development and support teams.

I would create a couple of layer 2 and layer 3 firewall rules on both interfaces (allowing port 53 and 8.8.8.8, with the Logging option turned on) on a couple of the radios that have the problem most frequently, just to make the radio log that traffic.

Then, when it stops passing DNS, you can look at the logs and see whether the DNS or 8.8.8.8 traffic is hitting the Ethernet interface. If it is, see whether it is also passing the wireless interface, or whether creating the rules fixes the problem.

Note: When I did this to track down radios doing something very similar with PPPoE discovery, I had to go to the System tab and select all the options under Syslog mask before it would actually log the firewall activity I told it to log.


Beautiful!! I am going to accept this as a solution. It may not fix the problem, but it will let us know where it breaks.

Fedor also asked for tech support files from the APs and SMs where we see this problem. The only places I have actually witnessed this happening for sure are my home and my parents' home. What the others have described seems to be the same as what I have witnessed, but I cannot say definitely that it is the same. I have asked customers to call when it happens, but they do not. They call after the internet is back up and working.

As an update to this, I left the ping tool open in Winbox for my Tik router at home. I would switch between pinging 8.8.4.4 and 1.1.1.1 from the Tik, and I caught 8.8.4.4 not passing from the router to the internet. The internet stayed up because 1.1.1.1 was still reachable. A little while later, 8.8.4.4 was reachable again.
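Switching pings by hand only catches an outage if someone is watching. A rough sketch of doing the same thing unattended from a PC behind the router is below. It is an assumption on my part rather than a tested tool: it checks reachability with a TCP connection to port 53 instead of ICMP ping, and the names `tcp53_reachable`, `monitor`, and `outage_windows` are made up for this example.

```python
import socket
import time
from datetime import datetime

# Resolvers mentioned in this thread.
RESOLVERS = ["8.8.8.8", "8.8.4.4", "1.1.1.1"]


def tcp53_reachable(server, timeout=2.0):
    """Assumed stand-in for ping: can we open TCP port 53 on the resolver?"""
    try:
        with socket.create_connection((server, 53), timeout=timeout):
            return True
    except OSError:
        return False


def monitor(probe=tcp53_reachable, interval=10.0, rounds=6):
    """Probe every resolver once per round and return timestamped samples."""
    samples = []
    for _ in range(rounds):
        for server in RESOLVERS:
            samples.append((datetime.now(), server, probe(server)))
        time.sleep(interval)
    return samples


def outage_windows(samples):
    """Turn (timestamp, server, ok) samples into per-server outage spans.

    Each span is (first_failed_timestamp, recovered_timestamp); the second
    value is None if the server was still unreachable at the last sample.
    """
    windows = {}
    open_since = {}
    for ts, server, ok in samples:
        windows.setdefault(server, [])
        if not ok and server not in open_since:
            open_since[server] = ts  # outage starts
        elif ok and server in open_since:
            windows[server].append((open_since.pop(server), ts))  # outage ends
    for server, start in open_since.items():
        windows[server].append((start, None))
    return windows
```

Running `outage_windows(monitor(interval=10, rounds=360))` would cover about an hour and give start and end times for each resolver that dropped out, which can then be lined up against the radio's firewall logs.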