Please upgrade ePMP AP/SM devices to 2.5.2-RC6


@rupamkhaitan wrote:

What is the current version running in your SM?

Is your SM connected to cnMaestro when you tried to upgrade it?


They were all running 2.5.1. 

If the upgrade failed, I logged into the radio and rebooted it. Usually the upgrade would then work. Sometimes it needed a second reboot. All upgraded now.

Edit: I also got a few timeouts. They needed several minutes to re-connect to cnMaestro. Some required another flash, some required a reboot and a flash. The process was not seamless, but I'm looking forward to the new stability.

Those of mine that have failed so far (all that I've tried have failed) have been running 2.5.1.  I just added another batch and am awaiting the outcome.  I'll reboot the radios manually if needed to see if that forces the upgrade.

I've chosen one SM to focus on.  It has now failed the update 3 times in as many manual reboots of the radio.  I have waited each time to initiate the update until it connects to cnMaestro and shows as online.

In the syslog of the radio I see this if it is helpful.  This is post-reboot and update attempt:

Sep  1 00:00:27 hc-***-c110 DEVICE-AGENT[1931]: get_stats_block: connect() failed errno=2
Sep  1 00:00:28 hc-***-c110 DEVICE-AGENT[1931]: get_stats_block: connect() failed errno=2
Sep  1 00:00:28 hc-***-c110 DEVICE-AGENT[1931]: get_stats_block: connect() failed errno=2
Sep  1 00:00:29 hc-***-c110 DEVICE-AGENT[1931]: get_stats_block: connect() failed errno=2
Sep  1 00:00:29 hc-***-c110 DEVICE-AGENT[1931]: get_stats_block: connect() failed errno=2
Sep  1 00:00:29 hc-***-c110 DEVICE-AGENT[1931]: get_stats_block: connect() failed errno=2
Sep  1 00:00:30 hc-***-c110 DEVICE-AGENT[1931]: get_stats_block: connect() failed errno=2
Sep  1 00:00:30 hc-***-c110 DEVICE-AGENT[1931]: get_stats_block: connect() failed errno=2
Sep  1 00:00:30 hc-***-c110 DEVICE-AGENT[1931]: get_stats_block: connect() failed errno=2
Sep  1 00:00:50 hc-***-c110 DEVICE-AGENT[1931]: EINPROGRESS in connect()
Sep  1 00:00:50 hc-***-c110 DEVICE-AGENT[1931]: SSL_ERROR_WANT_READ try again
Sep  1 00:00:50 hc-***-c110 DEVICE-AGENT[1931]: SSL_ERROR_WANT_READ try again
Sep  1 00:00:50 hc-***-c110 DEVICE-AGENT[1931]: Server certificate is verified and it is valid
Sep  1 00:00:50 hc-***-c110 DEVICE-AGENT[1931]: Received Headers : "HTTP/1.1 302 Moved Temporarily
Sep  1 00:00:55 hc-***-c110 DEVICE-AGENT[1931]: SMs pmac [00:04:56:C9:C7:D3]
Sep  1 00:00:57 hc-***-c110 DEVICE-AGENT[1931]: callback_websocket: LWS_CALLBACK_CLIENT_ESTABLISHED
Sep  1 00:00:57 hc-***-c110 DEVICE-AGENT[1931]: handle_cns_msg: MSG_REGISTER_SUCCESS received
Nov 13 23:19:01 hc-***-c110 DEVICE-AGENT[1931]: Not received PONG for the last ping
Nov 13 23:22:15 hc-***-c110 DEVICE-AGENT[1931]: platform_sw_update: ENTRY
Nov 13 23:22:41 hc-***-c110 DEVICE-AGENT[4675]: do_sw_update2: Exited while loop sw_upd state=0

Then it fails again. I've spent all the time on this I can for now. 

We found an internal issue on our server, which could be one reason for the upgrade failures, as devices were sometimes unable to connect.

We have fixed it, and the devices should now upgrade.

Please try again and do let us know.

Re-attempted using the original SM. New error this time, "Skipped the device to update as it went offline."

Rebooted the SM. Prior to reboot, shows 2d21h System Uptime, shows Connected to Maestro in the SM, and shows green and connected in Maestro.
After reboot, Green in Maestro, Connected in the SM, smooth double-digit pings to the SM. Started upgrade again. Pings ran for several minutes, then dropped just as the upgrade failed with the error "Device got timed out during update. Last known status was: Sent the software update command to the device."

==
Attempted to update an AP with a single SM. AP failed with "Device got timed out during update, last known status was: Sent the software update command to the device." AP dropped no pings during the update attempt. SM then showed the same error.

Not showing any improvements yet on this end.

Can you share your device MAC at rkh001@cambiumnetworks.com so I can check the detailed logs if there was any disconnect from our end?

To make sure everything works, we just tried updating a couple of our devices from 2.5.1 to 2.5.2-RC6. Everything went smoothly, and I don't see any timeouts or errors.

If we can do a screenshare or hangout session it would really help us debug your issue.

Please share your details with us at rkh001@cambiumnetworks.com so that I can get in touch with you.

I'm still seeing "Device got timed out during update" when attempting to update my radio at home (the only one I can do during business hours on this sector). Mac is  00:04:56:CE:75:8D.

Same results, except it doesn't kick it offline in Maestro.


@rupamkhaitan wrote:

To make sure everything works, we just tried updating a couple of our devices from 2.5.1 to 2.5.2-RC6. Everything went smoothly, and I don't see any timeouts or errors.

If we can do a screenshare or hangout session it would really help us debug your issue.

Please share your details with us at rkh001@cambiumnetworks.com so that I can get in touch with you.


Sent you an e-mail to get that set up.

The radio seems to be getting the message but isn't able to follow through:

Nov 15 21:30:59 hc-jturner-c-30 DEVICE-AGENT[10788]: handle_cns_msg: MSG_REGISTER_SUCCESS received
Nov 16 09:01:28 hc-jturner-c-30 DEVICE-AGENT[10788]: platform_sw_update: ENTRY
Nov 16 09:01:53 hc-jturner-c-30 DEVICE-AGENT[6463]: do_sw_update2: Exited while loop sw_upd state=0
Nov 16 09:25:55 hc-jturner-c-30 DEVICE-AGENT[10788]: platform_sw_update: ENTRY
Nov 16 09:26:20 hc-jturner-c-30 DEVICE-AGENT[14354]: do_sw_update2: Exited while loop sw_upd state=0
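For anyone comparing their own radios against the excerpt above, here is a minimal Python sketch that pulls the update attempts out of a saved syslog. It pairs each `platform_sw_update: ENTRY` line with the next `do_sw_update2: Exited while loop` line and records the reported `state` value. The function name `find_update_attempts` and the regular expression are my own assumptions based only on the line format shown in this thread. The thread does not document what `state=0` means, so the sketch just reports it.

```python
import re

# Matches lines shaped like the DEVICE-AGENT entries quoted in this thread.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"DEVICE-AGENT\[(?P<pid>\d+)\]:\s+(?P<msg>.*)$"
)


def find_update_attempts(lines):
    """Pair each update ENTRY line with the next do_sw_update2 exit line."""
    attempts = []
    pending = None  # timestamp of an ENTRY line that has no exit line yet
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # not a DEVICE-AGENT line
        msg = match.group("msg")
        if msg.startswith("platform_sw_update: ENTRY"):
            pending = match.group("ts")
        elif msg.startswith("do_sw_update2: Exited while loop") and pending:
            attempts.append({
                "host": match.group("host"),
                "started": pending,
                "exited": match.group("ts"),
                "state": msg.rsplit("state=", 1)[-1],
            })
            pending = None
    return attempts
```

Feed it the lines of a saved log, for example `find_update_attempts(open("radio.log"))`, where `radio.log` is a hypothetical file name.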

An update to the situation. We do not use the default SNMP Community String in our network. So, as a troubleshooting method, we changed the SNMP Read-Only and Read-Write strings to the defaults of public/private as seen in an ePMP 1000 on defaults. Tried to upgrade the same SM as previous and it was successful!

Then moved to another SM on the same tower site but connected to a different AP, also successful. Attempted another tower site with 2 SMs connected, AP and both SMs were able to upgrade after changing the SNMP strings back to default.

I have had to reboot one SM, but otherwise the other devices we have tried to upgrade while using the default strings appear to be upgrading successfully. Have successfully updated 4 SMs and 5 APs in this manner.

Change the Read-Only string to public, change the Read-Write string to private. Perform the update. Change the strings back to normal settings. We have been in discussion with Cambium about the issue and hope to reach a resolution that will allow Maestro to work with our existing Community String settings.
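The workaround above boils down to "set defaults, upgrade, restore", and the restore step is the easy one to forget. A minimal Python sketch of that pattern follows. Everything in it is hypothetical: `FakeRadio`, `default_snmp_strings`, and `upgrade` are illustrative stand-ins, not a cnMaestro or ePMP API, and the simulated failure only mimics the behavior reported in this thread.

```python
from contextlib import contextmanager

DEFAULT_RO = "public"   # default Read-Only community string
DEFAULT_RW = "private"  # default Read-Write community string


class FakeRadio:
    """Hypothetical stand-in for a radio's SNMP settings and firmware version."""

    def __init__(self, ro, rw, version="2.5.1"):
        self.ro = ro
        self.rw = rw
        self.version = version


@contextmanager
def default_snmp_strings(radio):
    """Temporarily switch to public/private, then always restore the originals."""
    saved = (radio.ro, radio.rw)
    radio.ro, radio.rw = DEFAULT_RO, DEFAULT_RW
    try:
        yield radio
    finally:
        # Runs even if the upgrade raises, so monitoring strings come back.
        radio.ro, radio.rw = saved


def upgrade(radio, target):
    """Hypothetical upgrade that fails with non-default strings, as reported here."""
    if (radio.ro, radio.rw) != (DEFAULT_RO, DEFAULT_RW):
        raise RuntimeError("Device got timed out during update")
    radio.version = target


radio = FakeRadio(ro="my-ro", rw="my-rw")
with default_snmp_strings(radio):
    upgrade(radio, "2.5.2-RC6")
```

Putting the restore in a `finally` block means the normal strings are put back even when the upgrade fails, which keeps any SNMP-based monitoring from being left pointed at the wrong community string.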

Hope this helps!

We have been able to reproduce the behavior described above by Miah.  

If the ePMP device is using the default SNMP community strings, then the software upgrade completes successfully.  If non-default community strings are set on the device, then the upgrade fails.

The team is working to resolve the issue and will report back quickly.

Thanks for the update Emilio,

I just confirmed on one of my units that this workaround gets the software update installed (and breaks my monitoring in the process).

Jacob, can you clarify what you mean when you state "and breaks my monitoring in the process" ?

Thanks.

The SNMP monitoring that we have set up.  Since cnMaestro still can't do email alerts, we still need to use our existing SNMP-based monitoring system.


@Jacob Turner wrote:

The SNMP monitoring that we have set up.  Since cnMaestro still can't do email alerts, we still need to use our existing SNMP-based monitoring system.


Did you re-enter your normal SNMP string after doing the upgrade? We only changed it on our end long enough to perform the upgrade.

Regarding the problem upgrading ePMP devices that have non-default SNMP community strings, we are working on a new ePMP release that will resolve this issue.  We hope to release it within a few days.

Upgrading to that release (using cnMaestro) will require that the devices use the default SNMP community strings (public/private).  Once upgraded to this load, non-default community strings can be used for future upgrades.

ePMP ver 2.5.2-RC9 is available via cnMaestro.

We believe this fixes the issue with non-default SNMP community strings, but please read the details in the following post.

http://community.cambiumnetworks.com/t5/cnMaestro/Please-upgrade-to-ePMP-ver-2-5-2-RC9-via-cnMaestro/td-p/46617

To validate that the issue has been resolved for your setup, you can do the following.

1) Set device SNMP community string to default (public/private).

2) Upgrade to 2.5.2-RC9

3) Set device SNMP community string to non-default values.

4) Downgrade to 2.5.2-RC6.

These steps should complete successfully, which validates that the SNMP fix is present in 2.5.2-RC9.

Unfortunately, now that the device has 2.5.2-RC6, you will need to set the default SNMP community strings on the device in order to upgrade back to 2.5.2-RC9.

We have released official 2.5.2 and it is available via cnMaestro now.

You can find the release notes via cnMaestro.