A cnWave network is managed by a central controller named ‘E2E Controller’, and each cnWave node runs a lightweight client named ‘minion’ that connects to the central controller. The controller can run on any host with a route to the cnWave network, including a cnWave node itself. When the controller runs on a cnWave PoP node, it is referred to as the ‘Onboard E2E controller’. Otherwise, it is referred to as the ‘External Controller’.
The E2E Controller handles important management functions such as link bring-up, software upgrades, and configuration management.
The primary features of E2E are listed below. In general, each logical management feature resides in its own software module, which is referred to as an “app”. Both the controller and minion share this design.
Topology Management
TopologyApp holds and manages the network topology, a structure containing all details about nodes and links within the network. The app thoroughly validates all requested topology changes (such as adding, editing, or deleting network elements), allocates prefixes for nodes, and contains several algorithms to automatically assign node and link parameters such as polarity, Golay codes, channel, and control superframes.
In addition, TopologyApp records dynamic topology properties, such as node and link liveness and nodes’ routing adjacencies. Liveness is determined from the presence or absence of periodic status reports from nodes, handled by StatusApp on both the controller and minion.
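The liveness rule described above can be illustrated with a minimal sketch. It assumes that a node counts as alive while its most recent status report is no older than a timeout. The function name, node names, and the 30-second timeout are hypothetical and are not taken from the cnWave software.

```python
from typing import Dict, Set


def live_nodes(last_report: Dict[str, float], now: float, timeout_s: float) -> Set[str]:
    """Return the nodes whose latest status report arrived within timeout_s of now."""
    return {node for node, ts in last_report.items() if now - ts <= timeout_s}


# Hypothetical timestamps (in seconds) of the last status report seen per node.
last_report = {"pop-1": 100.0, "dn-2": 95.0, "cn-3": 60.0}

# Assumed timeout of 30 seconds; cn-3 last reported 45 seconds ago, so it is treated as down.
alive = live_nodes(last_report, now=105.0, timeout_s=30.0)
```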
Network Ignition
On the controller, IgnitionApp is responsible for bringing up (or “igniting”) links in the network. Ignition involves forming a link from an “initiator” node, which is already connected to the controller, to a “responder” node. Under the default “auto-ignition” configuration, the app will automatically ignite links during network startup and whenever nodes or links subsequently fail. It applies an algorithm on the current topology state to determine ignition order; multiple links can be ignited in parallel.
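The exact ordering algorithm used by IgnitionApp is not described here. As a rough illustration only, the sketch below assumes a breadth-first scheme: in each round, every node that is already reachable acts as an initiator for its unreached neighbors, and all links selected in a round are ignited in parallel. It selects only the first link that reaches each node. The function name and the topology are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def ignition_rounds(
    links: Dict[str, List[str]], connected: Set[str]
) -> List[List[Tuple[str, str]]]:
    """Group (initiator, responder) pairs into rounds that can run in parallel."""
    reachable = set(connected)
    rounds: List[List[Tuple[str, str]]] = []
    while True:
        batch: List[Tuple[str, str]] = []
        claimed: Set[str] = set()  # responders already chosen in this round
        for initiator in sorted(reachable):
            for responder in sorted(links.get(initiator, [])):
                if responder not in reachable and responder not in claimed:
                    batch.append((initiator, responder))
                    claimed.add(responder)
        if not batch:
            return rounds
        rounds.append(batch)
        reachable |= claimed


# Hypothetical topology: one PoP, two nodes behind it, and one node behind both.
links = {
    "pop": ["a", "b"],
    "a": ["pop", "c"],
    "b": ["pop", "c"],
    "c": ["a", "b"],
}
rounds = ignition_rounds(links, connected={"pop"})
```

In this example, the links to a and b are ignited in the first round, and node c is reached through a in the second round.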
Software Upgrade
The controller manages in-band software upgrades through UpgradeApp. Upgrades consist of two phases: “prepare” and “commit”. In the “prepare” phase, the controller distributes the new software image to nodes; upon completion, the nodes flash the new image onto a disk partition. The “commit” command simply instructs nodes to reboot to the newly written partition.
The main complication for in-band upgrades is that node reboots will bring down all links to and from a node, which can affect reachability to the rest of the network. The controller’s UpgradeApp includes a scheduling algorithm that parallelizes commits (in “batches” of nodes) while minimizing network isolation, along with a retry mechanism to handle failures during any upgrade step. The minion’s UpgradeApp is responsible for obtaining, validating, and flashing the new software images, and reporting the node’s current upgrade status to the controller.
Configuration Management
cnWave utilizes a centralized node configuration manager and a layered configuration model. Initially, nodes start with a version-dependent “base configuration”, which holds all default config values based on the node’s software version; this is static and bundled with the software image. The “network-wide overrides” layer is applied above the base configuration, and contains any config values that should be overridden uniformly across the network. The topmost layer, the “node-specific overrides”, applies to individual nodes.
To keep config in sync, nodes send a hash of their local config to the controller in their periodic status reports, and the controller will overwrite a node’s config upon receiving a mismatch (unless the config is marked as “unmanaged”).
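The layering and hash check described above can be sketched as follows. This is an illustration under stated assumptions, not the controller's implementation: the config keys are hypothetical, configs are modeled as nested dictionaries, and SHA-256 over canonical JSON stands in for whatever hash the software actually uses.

```python
import copy
import hashlib
import json


def merge_layers(base: dict, *overrides: dict) -> dict:
    """Apply override layers on top of the base config; later layers win."""

    def merge(dst: dict, src: dict) -> None:
        for key, value in src.items():
            if isinstance(value, dict) and isinstance(dst.get(key), dict):
                merge(dst[key], value)  # descend into nested sections
            else:
                dst[key] = copy.deepcopy(value)

    result = copy.deepcopy(base)
    for layer in overrides:
        merge(result, layer)
    return result


def config_hash(config: dict) -> str:
    """Hash a canonical JSON form so that key order does not affect the result."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def needs_overwrite(expected: dict, reported_hash: str, unmanaged: bool = False) -> bool:
    """Overwrite on a hash mismatch unless the config is marked as unmanaged."""
    return not unmanaged and config_hash(expected) != reported_hash


# Hypothetical layers: base configuration, network-wide overrides, node-specific overrides.
base = {"radio": {"channel": 2, "txPower": 10}, "log": {"level": "info"}}
network_overrides = {"radio": {"channel": 3}}
node_overrides = {"log": {"level": "debug"}}

expected = merge_layers(base, network_overrides, node_overrides)
stale = needs_overwrite(expected, config_hash(base))  # node still runs the base config
```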
Scans
ScanApp is responsible for initiating scans on nodes and collecting the measurement results. There are several scan types. For instance, “Periodic Beamforming” (PBF) scans identify independent RF paths between pairs of nodes; these scans are uni-directional, and run between nodes with L1/L2 connectivity. “Interference Measurement” (IM) scans measure interference between links, and involve a single transmitter and multiple receivers.
Scans are scheduled by the controller to run periodically and in parallel, using a graph coloring algorithm in ScanScheduler and a slot scheduling mechanism in SchedulerApp.
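The specific coloring algorithm used by ScanScheduler is not described here. The sketch below shows one common approach, greedy graph coloring, where each scan is a vertex, an edge joins two scans that must not run at the same time, and each color is a time slot. The scan names and the conflict graph are hypothetical.

```python
from typing import Dict, Set


def color_scans(conflicts: Dict[str, Set[str]]) -> Dict[str, int]:
    """Assign each scan the lowest slot not already used by a conflicting scan."""
    slots: Dict[str, int] = {}
    # Visit the most constrained scans first; break ties by name for determinism.
    for scan in sorted(conflicts, key=lambda s: (-len(conflicts[s]), s)):
        used = {slots[other] for other in conflicts[scan] if other in slots}
        slot = 0
        while slot in used:
            slot += 1
        slots[scan] = slot
    return slots


# Hypothetical conflict graph: scan A conflicts with B and C; D conflicts with nothing.
conflicts = {"A": {"B", "C"}, "B": {"A"}, "C": {"A"}, "D": set()}
slots = color_scans(conflicts)
```

Scans that share a slot number can run in parallel. In this example, two slots are enough for all four scans.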
Prefix Allocation
cnWave nodes can be allocated IPv6 prefixes in two different ways. Centralized prefix allocation (CPA) is a scheme where the controller allocates prefixes to all the nodes. This scheme linearly scans through the prefix range and assigns unallocated prefixes to nodes. CPA serves mostly as a stepping stone for more advanced allocation schemes such as Deterministic prefix allocation (DPA). DPA involves segmenting the network into prefix zones, which are assigned subnet prefixes of the network seed prefix. Nodes will be allocated prefixes from their zone’s prefixes, allowing the POPs to advertise these subnets to their BGP peers and load-balance ingress traffic.
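The linear scan used by CPA can be sketched with Python's standard ipaddress module. The seed prefix, allocation length, node names, and function name below are hypothetical and are chosen only for illustration.

```python
import ipaddress
from typing import Dict, List, Optional


def allocate_cpa(
    seed_prefix: str,
    alloc_len: int,
    nodes: List[str],
    allocated: Optional[Dict[str, ipaddress.IPv6Network]] = None,
) -> Dict[str, ipaddress.IPv6Network]:
    """Walk the seed prefix in order and give each new node the next unused subnet."""
    result = dict(allocated or {})
    in_use = set(result.values())
    seed = ipaddress.ip_network(seed_prefix)
    # Lazily yields subnets in ascending order, skipping those already assigned.
    free = (p for p in seed.subnets(new_prefix=alloc_len) if p not in in_use)
    for node in nodes:
        if node in result:
            continue  # keep existing allocations unchanged
        prefix = next(free, None)
        if prefix is None:
            raise ValueError("prefix range exhausted")
        result[node] = prefix
    return result


# Hypothetical example: a /56 seed prefix split into /64 node prefixes,
# with one node that already holds an allocation.
existing = {"dn-2": ipaddress.ip_network("2001:db8:0:1::/64")}
prefixes = allocate_cpa("2001:db8::/56", 64, ["pop-1", "dn-2", "cn-3"], existing)
```

A DPA-style scheme would first divide the seed prefix into per-zone subnets and then run a similar allocation within each zone.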
External E2E vs onboard E2E
When the controller runs on one of the POP nodes in the mesh, it is referred to as onboard E2E. If E2E runs external to the mesh, typically in a Docker container, it is referred to as external E2E.
A frequently asked question is which E2E to choose.
External E2E has the following advantages:
- Supports up to 500 nodes in a mesh. Onboard E2E supports a maximum of 31* nodes, including the POP node.
- Upcoming Network Analyzer and Network Optimizer (NANO) features require external E2E.
- A POP is an outdoor device, whereas the Docker host running E2E can be located in a data center, so physical reliability is higher.
- In multi-POP deployments, the network remains manageable if any POP goes down. With an onboard controller, if the POP hosting E2E goes down, the network is no longer manageable.
The onboard controller has these advantages:
- There is no need for another machine to host E2E. For a PTP network (for example, V3K to V3K) or a small network, this is a good choice.
- There is no need to worry about IPv6 routing between E2E and the mesh nodes. With external E2E, it is necessary to make sure that IPv6 routes are properly set so that E2E can communicate with all nodes.
*From 1.2.1 onward, the limit for onboard E2E is 31 nodes; before that, it was 21.