SNMP Response Times on 3.2.2

Larry_Weidig · February 2, 2017, 2:06pm

We are automating much of the deployment of these radios in our network. One issue we keep running into is how long the SNMP daemon takes to start responding. For example, when we upgrade a radio we see the following pattern during boot:

Upgrade completes, radio remains responsive for 3 seconds.
Radio does not respond to ICMP / SNMP for approximately 45s.
It then responds to ICMP (NO SNMP) for about 10s, followed by an 8s pause - my guess is Ethernet negotiation or something else taking down the interface.
After this ICMP is 100% responsive, but the SNMP daemon on the radio does not respond until 120s after the reboot, which is 75s after the initial ICMP response.

We have mapped similar patterns when simply setting changes on the device via SNMP and then applying them, though the times are much shorter. This also seems to generate the error:

Error in packet
Reason: (noSuchName) There is no such variable name in this MIB.

A few seconds later, though, polling the same OID works just fine. I assume we are hitting the daemon during a reload of some sort.
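One way to code around the transient noSuchName error is to retry the same GET a few times before treating it as a real failure. The following is only a minimal sketch: `snmp_get` stands in for whatever SNMP GET function your tooling already has, and `TransientSnmpError`, the attempt count and the delay are all assumptions, not anything provided by the radio or by a specific SNMP library.

```python
import time


class TransientSnmpError(Exception):
    """Hypothetical error your SNMP client raises for noSuchName."""


def get_with_retry(snmp_get, oid, attempts=5, delay_s=2.0, sleep=time.sleep):
    """Call snmp_get(oid), retrying while it raises TransientSnmpError.

    snmp_get is a placeholder for your own SNMP GET function; it is
    not a call from any particular library.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return snmp_get(oid)
        except TransientSnmpError as err:
            last_error = err
            if attempt < attempts - 1:
                sleep(delay_s)  # give the daemon time to finish reloading
    # Every attempt failed, so the error is probably not transient.
    raise last_error
```

If the OID still fails after the last attempt, the original error is re-raised so a genuinely missing OID is not hidden.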

We could really use an SNMP daemon on the radios that becomes active nearly simultaneously with ICMP responsiveness in all cases. We would even prefer it take longer to respond to ICMP if that is needed to get this working.

Fedor · February 3, 2017, 10:01am

Hi Larry,

What you have described is the ePMP boot-up process, in which the different subsystems are loaded in series.

The Ethernet driver is loaded before the system is ready to respond to SNMP requests.

For now we don't have plans to sync these two modules.

I'd suggest using time-out timers in your automation tool after the reboot command has been sent to the ePMP device.

Please note that different device types have different boot-up times. This applies especially to Elevate devices, where the boot-up time is much longer.

Thank you.

Larry_Weidig · February 3, 2017, 1:20pm

Yes, that is what we have coded around. A radio has to be not only pingable but also answer an SNMP query to be considered "alive" by our tools. Thanks!

Douglas_Generous · February 6, 2017, 11:05pm

Is there a way of having the ePMP units (Elevate or real) send an SNMP message telling the backend system that the unit is alive and ready to handle packets? This would mean that we could code for a specific response. It would also eliminate the boot-time coding and having to R&D the boot time on each device after each software update to ensure our timeouts are still valid. It would further eliminate the effective difference of using Elevate devices, as they could send the same message once they are ready to handle packets again.

I realize that SNMP is traditionally a polling protocol, but there are provisions for this type of active response system.

I know SNMP messages are small, but if you have 1000 devices to poll and you have to continuously poll one device for a long time with no response to decide if it's dead or alive, this can add up to a lot of bandwidth used for management and wasted resources. This gets worse with more devices, and I don't want to hear "nobody has that many devices on a network".

Fedor · February 7, 2017, 10:43am

Thank you for your idea.

We will discuss it and add it to the feature short-list for upcoming releases.

Thank you.

Larry_Weidig · February 7, 2017, 2:10pm

What you are describing would be an SNMP trap and devices can typically send them out at restart. However, that typically is directed to your NMS and would not help in the scenarios we are facing. You would somehow have to first tell the unit what IP to send this to and then wait for it.
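To illustrate the "tell the unit where to send it, then wait" part, here is a rough sketch of the waiting side only. It assumes the radio could be configured to send a restart trap (for example coldStart) to the machine running the automation, which is not confirmed for these devices. The function name is made up for this example, and the payload is not decoded at all: any UDP datagram from the expected IP counts as "the unit is back".

```python
import socket
import time


def wait_for_datagram_from(sock, expected_ip, timeout_s):
    """Return True once any UDP datagram arrives from expected_ip.

    The payload is not parsed. A real receiver would decode the SNMP
    trap PDU and check that it is a restart notification.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        sock.settimeout(remaining)
        try:
            _payload, (src_ip, _src_port) = sock.recvfrom(65535)
        except socket.timeout:
            return False
        if src_ip == expected_ip:
            return True
        # Datagram came from some other device: keep waiting.
```

In use, you would bind a UDP socket to the trap port (162 is the standard one and normally needs elevated privileges) before sending the reboot command, then call the function with the radio's IP and a generous timeout.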

We have 1000s of devices, but in our case you would typically only be waiting for a few to come back online. The network impact is minimal as we do most of the waiting with NO network activity. The way we have it coded works for ALL units regardless of the timing. It is basically as follows:

Send SNMP command to reboot

Wait 45s

Test for ICMP

If good (we use 3 packets sent / 3 received as "up")

Wait 20s (gets us past Ethernet reset)

Poll SNMP GET for sysDescr

If valid response -> Alive

Wait 5s

Loop Poll SNMP

Wait 5s

Loop Test ICMP
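The steps above can be sketched roughly as follows. This is not the actual code behind them: `ping_ok` and `snmp_ok` are placeholders for your own ICMP test (3 sent / 3 received) and your own SNMP GET of sysDescr (OID 1.3.6.1.2.1.1.1.0), and the `max_wait_s` cut-off is an addition so that a dead radio does not loop forever. The function is meant to be called right after the reboot command has been sent.

```python
import time


def wait_until_alive(ping_ok, snmp_ok, sleep=time.sleep, max_wait_s=600,
                     boot_wait_s=45, settle_wait_s=20, retry_s=5):
    """Return True once the radio answers both ICMP and SNMP, else False.

    ping_ok(): placeholder, True when 3 of 3 echo requests are answered.
    snmp_ok(): placeholder, True when a GET of sysDescr returns a value.
    max_wait_s is an assumed safety limit and counts sleep time only.
    """
    waited = 0

    def pause(seconds):
        nonlocal waited
        sleep(seconds)
        waited += seconds

    pause(boot_wait_s)            # no network activity while it boots
    while not ping_ok():          # ICMP phase
        if waited >= max_wait_s:
            return False
        pause(retry_s)
    pause(settle_wait_s)          # get past the Ethernet reset
    while not snmp_ok():          # SNMP phase
        if waited >= max_wait_s:
            return False
        pause(retry_s)
    return True
```

Passing the probes in as functions keeps the timing logic independent of whichever ping and SNMP tools are used, and makes it easy to test without a radio on the bench.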

So during the ICMP phase you are talking up to 6 packets every 5s, and then during the SNMP phase up to 2 packets every 5s. Not a significant impact on the network.
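As a quick sanity check on those numbers, the worst-case packet rate scales linearly with the number of radios being waited on at the same time. The device counts below are made-up examples, not measurements from any network.

```python
def packets_per_second(devices, packets_per_probe, interval_s=5):
    """Worst-case management packets per second while waiting on radios."""
    return devices * packets_per_probe / interval_s


# Assumed example: 10 radios rebooting at once.
icmp_rate = packets_per_second(devices=10, packets_per_probe=6)    # 12.0
snmp_rate = packets_per_second(devices=10, packets_per_probe=2)    # 4.0

# Assumed extreme: 1000 radios all in the ICMP phase at once.
worst_case = packets_per_second(devices=1000, packets_per_probe=6)  # 1200.0
```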

Now of course if you needed to do this for all 1000s of devices at once that is a different story, but at that point I suspect you have more pressing problems to be working on - like why is my entire network down :)