Recovering the Onboard E2E Network from a Failed/Unrecoverable PoP Node

This article explains how to recover the network from an available backup when the PoP Node has crashed, failed, or become unrecoverable.

NOTE: This feature is applicable only if the available backup was taken from a network with a PoP Node running 1.2.2 Software image version or higher.

What is not recommended?

The “Replace Node” function in cnMaestro (shown in the screenshot below) should NOT be used for this use case.

Note: Because an Onboard E2E network can support Multi-POP, the “Replace Node” function should still be available for a POP that is NOT hosting the Onboard E2E controller.

What is needed?

  1. The network should have been connected to cnMaestro. (It might be offline currently because the PoP node connected to cnMaestro crashed or failed.)
  2. A backup of the same network must be available, taken from cnMaestro while the network was online.
  3. A spare radio node of the exact same model (V5000, V3000, etc.) as the failed PoP node. It must not be another node from the same network topology as the failed node.
  4. The spare node must be running software image 1.2.2 or higher, matching the image of the crashed PoP node or as needed for the network configuration.
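
The image requirement in item 4 can be checked before the spare is sent to site. The following is only a minimal sketch: it assumes the image version is available as a purely numeric dotted string (for example "1.2.2"), and the function names are hypothetical, not part of any cnWave or cnMaestro tooling.

```python
# Minimal sketch: compare dotted version strings numerically.
# Assumption: versions are purely numeric, e.g. "1.2.2" or "1.2.2.1".

def parse_version(text):
    """Turn a dotted version string into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


MINIMUM_IMAGE = parse_version("1.2.2")  # minimum stated in this article


def meets_minimum(version):
    """Return True if the given image version is 1.2.2 or higher."""
    return parse_version(version) >= MINIMUM_IMAGE


if __name__ == "__main__":
    for candidate in ("1.2.1", "1.2.2", "1.10.0"):
        print(candidate, meets_minimum(candidate))
```

Comparing tuples of integers rather than raw strings avoids treating "1.10.0" as older than "1.2.2".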

How long should this take?

Approximately 5 to 10 minutes after the backup has been successfully restored.

Complete replacement steps:

  1. It is required that a current E2E controller backup file is available from the original/failed POP node which hosted the Onboard E2E controller. A cnWave Operator should make it a common operational practice to create a new E2E Controller backup file at a reasonable cadence, or any time a topology change is made to their cnWave network.

  2. Once it is determined that the original POP node hosting the Onboard E2E controller has suffered a HW failure and needs to be replaced, the local installers need to obtain a spare radio node of the exact same model (V5000, V3000, etc.) and note its MAC address and serial number. The local installer then physically replaces the failed node and attaches the Ethernet/PoE cable(s) to the replacement node.

  3. If original POP was configured with a management VLAN, then you will temporarily need to change the directly connected Switchport VLAN configuration to allow for remote access once the replacement node is installed. This is required due to the default config of the replacement node not having any Management VLAN configured, and therefore it will only be remotely accessible via untagged IPv6 address ([fe80::204:56ff:fexx:xxxx]) or untagged IPv4 address (169.254.1.1).

  4. Remotely access the replacement node via its web GUI. Enable the Onboard E2E controller and fill out ONLY the cnMaestro information in the bottom part of the pop-up window.

  5. Once the E2E controller is enabled, go to the Configuration/Node settings and configure the replacement node’s IPv4 address using the exact same IP settings that were being used by the original failed radio node. Also configure the DNS Server information in the Configuration/Network tab.

  6. Check the cnMaestro Connection status and ensure it is now showing “Waiting for Approval” status. You may need to toggle the “Remote Management” Enable/Disable button on the replacement radio node to restart/speed up the cnMaestro connection process.

  7. DO NOT APPROVE/ONBOARD THE NEWLY DETECTED REPLACEMENT E2E NETWORK INTO CNMAESTRO UNTIL AFTER PERFORMING THE FOLLOWING STEPS!! Failure to follow this will result in a failed E2E restoration.

  8. Delete the original E2E network from cnMaestro. First, click on the failed E2E network, then Inventory, then delete all the listed cnWave nodes. Second, delete the failed E2E Network itself.

  9. Onboard the new (replacement) E2E into cnMaestro. Configure the name of the new E2E Network the same as the original/failed E2E Network. (The original E2E Network name is not stored as part of the E2E backup configuration file).

  10. Restore the E2E backup configuration file from the original/failed node onto the newly onboarded replacement E2E Network (the POP node will take the configuration and reboot when you do this)

  11. While the new POP node is rebooting, restore the switch VLAN interface to its original configuration (since the POP node will have a configured Management VLAN once it reboots with the restored E2E configuration file).

  12. The full cnWave network should now be operational and visible via cnMaestro as it was before the HW failure occurred.
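
For step 3, the untagged IPv6 link-local address of the replacement node can be worked out from the MAC address noted in step 2. The sketch below assumes the node derives its link-local address from its MAC using modified EUI-64, which the pattern [fe80::204:56ff:fexx:xxxx] in step 3 suggests; the MAC address used here is a made-up example, and the function name is hypothetical.

```python
# Minimal sketch: derive an IPv6 link-local address from a MAC address
# using modified EUI-64 (assumed, based on the address pattern in step 3).
import ipaddress


def link_local_from_mac(mac):
    """Return the fe80::/64 address built from a 6-octet MAC address."""
    octets = [int(part, 16) for part in mac.replace("-", ":").split(":")]
    if len(octets) != 6:
        raise ValueError("expected a 6-octet MAC address")
    octets[0] ^= 0x02  # flip the universal/local bit
    # Insert ff:fe between the two halves of the MAC.
    eui64 = octets[:3] + [0xFF, 0xFE] + octets[3:]
    interface_id = int.from_bytes(bytes(eui64), "big")
    prefix = int(ipaddress.IPv6Address("fe80::"))
    return ipaddress.IPv6Address(prefix | interface_id)


if __name__ == "__main__":
    # Hypothetical MAC address, for illustration only.
    print(link_local_from_mac("00:04:56:ab:cd:ef"))
```

Because the result is a link-local address, the management PC usually also has to specify which of its own interfaces to use when connecting to it.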

Hello,

I just replaced our V5000 E2E following this guide. Overall, everything went pretty well… only two issues were a bit annoying:

  1. In the replacement steps, a point should be added to check the firmware of the replacement unit and update it if necessary.
  2. There was a problem with onboarding the V5000 in cnMaestro because the V5000 was shipped with its date set in 2021. This resulted in a failed adoption because of a certificate failure, which could only be identified by carefully reading the log files. The error message was: {"file":"conn.go:104","func":"agent.(Agent).routerConnect","level":"error","msg":"Router connect error: Get \"https://cloud.cambiumnetworks.com/cns-onboarding/********\": x509: certificate has expired or is not yet valid: current time 2021-05-05T12:35:17Z is before 2022-08-11T00:00:00Z","name":"agent","time":"2021-05-05T12:35:17Z"}.
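
A certificate error of this kind can be spotted more quickly by searching the log for the two timestamps it contains. This is only a rough sketch based on the single message quoted above: the regular expression and function names are assumptions, and other firmware versions may word the error differently.

```python
# Minimal sketch: detect a "clock is behind the certificate" error in a
# log line and report how far the node's clock lags.
import re
from datetime import datetime, timezone

STAMP = r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z"
CLOCK_ERROR = re.compile(
    rf"current time (?P<now>{STAMP}) is before (?P<not_before>{STAMP})"
)


def parse_utc(stamp):
    """Parse a timestamp such as 2021-05-05T12:35:17Z as UTC."""
    parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def clock_lag(log_line):
    """Return how far the node clock is behind the certificate start
    date, or None if the line does not contain this error."""
    match = CLOCK_ERROR.search(log_line)
    if match is None:
        return None
    return parse_utc(match["not_before"]) - parse_utc(match["now"])


if __name__ == "__main__":
    line = ("x509: certificate has expired or is not yet valid: current time "
            "2021-05-05T12:35:17Z is before 2022-08-11T00:00:00Z")
    print(clock_lag(line))
```

A positive result means the node's clock is earlier than the certificate's validity start, so the date on the replacement unit should be corrected before retrying the onboarding.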

Hi @Andreas_Schnederle-W

This document already includes the firmware update recommendation in the “What is needed?” section, before the steps begin.

I appreciate your comments and will revise the steps as needed.
