PMP450b HG SMs on R22 are unable to connect to cnMaestro after 50+ days of uptime

I think we’ve found one, possibly two issues…

  1. After what we think is 50-53 days of uptime, PMP450b HG 3GHz and 5GHz SMs running firmware R22 in NAT mode are no longer able to connect to cnMaestro. The cnMaestro agent keeps retrying but never connects. The only way to resolve the issue is to reboot the SM. The issue is showing up on more and more SMs; we now have dozens of examples, and the disconnected SM count in cnMaestro increases every day.

  2. Some SMs appear not to be connected to cnMaestro, but after we log into the SM, it immediately connects and cnMaestro shows it as online again. This issue is much more difficult to track, but we believe it’s related to the first issue.

I’m going to log a ticket with Cambium as well, but I’m curious if anyone else is seeing this issue. Thanks!

:frowning: That’s not good. Any chance that it happens after 49.7 days? That would be a sign of a 32-bit millisecond counter wrapping.
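
For anyone wondering where 49.7 comes from, here is the arithmetic as a quick sketch. It only assumes an unsigned 32-bit counter that ticks once per millisecond; nothing here is taken from the actual firmware.

```python
from datetime import timedelta

# Assumption: an unsigned 32-bit counter incremented once per millisecond.
COUNTER_BITS = 32

wrap_ms = 2 ** COUNTER_BITS                    # ticks before the counter rolls over to 0
wrap_period = timedelta(milliseconds=wrap_ms)  # the same span as a duration
wrap_days = wrap_ms / (1000 * 60 * 60 * 24)    # milliseconds -> days

print(wrap_period)          # 49 days, 17:02:47.296000
print(round(wrap_days, 2))  # 49.71
```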

Thanks for finding it and posting @Eric_Ozrelic.

You might be exactly right, @Simon_King. If you’re right, then 450i/450b/MicroPoP will all be seeing this issue after 49.7 days of uptime running 22.0.
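
To illustrate the theory, here is a minimal simulation of one common way a wrapping millisecond counter can leave a retry timer stuck. This is only a sketch of the general failure pattern, not the firmware's actual code; `tick_ms`, `naive_due` and `wrap_safe_due` are hypothetical names made up for the example.

```python
MASK32 = 0xFFFFFFFF  # keeps values inside an unsigned 32-bit range


def tick_ms(real_ms: int) -> int:
    """Simulate a 32-bit millisecond uptime counter (hypothetical)."""
    return real_ms & MASK32


def naive_due(now: int, deadline: int) -> bool:
    # Compares against an absolute deadline; fails if the counter wraps first.
    return now >= deadline


def wrap_safe_due(now: int, start: int, interval: int) -> bool:
    # Elapsed time computed modulo 2**32 stays correct across one wrap.
    return ((now - start) & MASK32) >= interval


# A 5-second timer armed 1 second before the counter wraps.
start_real = 2 ** 32 - 1000
interval = 5000
start = tick_ms(start_real)
deadline = start + interval            # held in a wider variable, so it never wraps

now = tick_ms(start_real + 6000)       # 6 seconds later the counter has wrapped

print(naive_due(now, deadline))             # False: the timer never fires
print(wrap_safe_due(now, start, interval))  # True: 6000 ms elapsed >= 5000 ms
```

In the naive version the counter restarts from zero and would need another 49.7 days to reach the stored deadline, which would look a lot like a device that only recovers after a reboot.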

We are investigating and will for sure have a fix in the next release.

In the meantime, if you’re past or approaching that uptime, you can reboot your device and it will reconnect.
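
If you want to plan those reboots, a rough sketch like the one below can flag devices that are past or close to the 49.7-day mark. It assumes you already have each device's uptime in seconds from somewhere (an export, SNMP, etc.); the device names, the `uptimes` dictionary and the 3-day margin are all made up for illustration.

```python
from typing import Dict, List

WRAP_SECONDS = 2 ** 32 / 1000  # 32-bit millisecond counter wrap, in seconds
SECONDS_PER_DAY = 86400


def days_until_wrap(uptime_seconds: float) -> float:
    """Days left before the assumed wrap point; negative if already past it."""
    return (WRAP_SECONDS - uptime_seconds) / SECONDS_PER_DAY


def devices_near_wrap(uptimes: Dict[str, float], margin_days: float = 3.0) -> List[str]:
    """Names of devices past the wrap point or within margin_days of it."""
    return sorted(
        name for name, uptime in uptimes.items()
        if days_until_wrap(uptime) <= margin_days
    )


# Hypothetical sample data: device name -> uptime in seconds.
uptimes = {
    "sm-a": 10 * SECONDS_PER_DAY,
    "sm-b": 48 * SECONDS_PER_DAY,
    "sm-c": 53 * SECONDS_PER_DAY,
}

print(devices_near_wrap(uptimes))  # ['sm-b', 'sm-c']
```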

We noticed a similar issue on our network as well, but only on the SMs that have the field blank or filled in with cloud.cambiumnetworks.com. If you put https://cloud.cambiumnetworks.com in and then run a ping from the radio to cloud.cambiumnetworks.com, it generally shows back up online with no reboot. We just went through and fixed about 40 today that had this issue. We also noticed it on our deployed cnPilot R195, 200 and 201s, and it was anything onboarded over 45 days ago that seemed to be affected, not 49.7.

I can confirm we’re seeing a similar issue: after 53 days of uptime, the SMs show offline in cnMaestro, but we can log in to them via IP just fine. They’re all running NAT and were all fine before this time.

I would prefer not to open a ticket but can share tech files if needed to help resolve the problem. These are 450b 3 GHz. The connection status on the SM just keeps showing reconnecting with the 5-minute countdown. They all have internet access.

Thanks for posting. We understand the issue and have fixed it. The fix will be included in 22.1 BETA, which we are hoping to release soon.

EDIT: I was wrong and this did not get fully resolved until 22.1.1 official release.

Per the release notes, this was supposedly fixed in 22.0.1. We installed this version about 50 days ago and started to see this behavior for the first time over the last few days.

This seems to be resolved in 22.0.2.
22.0 and 22.0.1 both have the problem, though. It affects both the Access Points and the Subscriber Modules.

We continue to see this behavior. Right at 50 days of uptime, SMs start dropping offline according to cnMaestro. We have everything running 22.0.2 (updated a little over 50 days ago).

Simply browsing to the SM’s management interface restores the connection to cnMaestro, but only temporarily.

These issues have been resolved in R22.1, which was released today and should be available in cnMaestro cloud shortly.

https://support.cambiumnetworks.com/files/pmp450/

We have installed the new FW and still have issues with APs and SMs losing connection to cnMaestro. Anyone else still having issues?

The issue described above sounds different from what you’re describing (i.e. it took 50+ days to see the issue). This latest firmware (R22.1) was released yesterday.

Can you describe in more detail exactly what you’re seeing?
Have you installed R22.1?
Are you seeing both AP and SM devices losing connection to cnMaestro?
How does this manifest (is cnMaestro showing the devices as “down”)?
Do the devices recover? If so, is it just after a specific time period or are you performing an action (like resetting them) to recover them?

I would suggest opening a support ticket with the team to get more help on this.

Here we go again: version 22.1 was installed on July 19th. As of yesterday, we’re starting to see SMs reporting 50 days of uptime losing connectivity to cnMaestro.

I just checked and we have not seen any SM on R22.1 lose connection to cnMaestro, going on 52 days now. EDIT: I just checked again and I’m seeing both 450 and 450b SMs at 52+ days.

IIRC, this issue was with SMs in NAT mode. Are yours in NAT mode?

Are the SMs that are having issues 450s or 450bs?

Have you filed a support ticket yet?

52 days ago I updated 1 AP and all of its SMs. Since that time, 2 SMs (both 450b) have managed to stay powered up, while the others have been rebooted at some point. Both of those SMs are exhibiting the same behavior.

We are not using NAT.

In 3-5 days, the rest of our SM deployment will be coming up on 50 days since the 22.1 update. We’ll see how many more crop up then, as I’ll have a much larger sample of radios with 50 days of uptime.

I don’t feel a support ticket is in order just yet.

I am also seeing this on APs that were recently upgraded to 22.1.
cnMaestro reported three PMP450i APs as being offline this morning, but ping and the user interface are up. Logging into the APs immediately restores the cnMaestro connection.
We upgraded these three APs 50 days ago. These APs don’t have SMs, so they were our early test upgrades. The rest of our network was upgraded 45 days ago, so we will monitor to see whether we get a bunch of cnMaestro disconnections in 5 days.
We were experiencing similar issues prior to our 22.1 upgrade.
We’re not using NAT.
Thanks
Don

jbettigole & dworkman,

Could you please share engineering.cgi files from the APs/SMs that had this issue?
Please send them to my email, balaji.grandhi@cambiumnetworks.com.
Thanks for your help!

Files sent. Thanks Balaji!

jbettigole & dworkman,

Thank you very much for sharing the engineering files. We appreciate your help!
We were able to recreate the issue in our lab, and the dev engineering team is investigating it.
I will update this thread when I have more info.

We filed a ticket on this issue as well: #366685.
