Anyone else having to reboot APs since the 2.5.1 upgrade?

A mix of 2.5 and 2.5.1 with a few stragglers on 2.4.3.

The issue only seems to affect the APs.

I used 2.4.3 for a long time and there was no such problem. Then I upgraded the APs to 2.5, and that also worked fine for a certain period of time. But suddenly, without any changes being made, this problem appeared...

In 2.5.2 this problem still exists

We downgraded all our APs to 2.3.4 FW and it is working stably, but we lost throughput.

So many disappointments...

Anyone know if this is fixed in 2.6? It appears Cambium has acknowledged in the forums that this is a known issue, but I don't see it listed under the "known problems or limitations" for any of the 2.5 - 2.6 versions.



All, 

If you are having random reboots on your radios, please drop me a note at sriram@cambiumnetworks.com. I will work with you. 

In general, there are several fixes in Release 2.6 for stability issues seen in Release 2.5. If you are still seeing random reboots in Release 2.6, please dump the crashlog at the radio's cli by typing "debug crashlog" and email it to me. 

Thanks,

Sriram

"If you are having random reboots on your radios, please drop me a note at sriram@cambiumnetworks.com. I will work with you. "

What? This has nothing to do with random reboots.

This is about SMs that stop passing traffic (you cannot ping/ssh/http into the radios, and the customer reports they have no internet) while still showing as connected to the AP. You have to boot the SM off the AP, and it starts working when it reconnects. At first we were rebooting the APs instead of just kicking the affected customer radios.

This started with 2.5 or 2.5.1. There are several threads about it here, and everyone affected has had to roll back the APs to 2.4.3 while leaving the SMs at 2.5 or 2.5.1.


@brubble1 wrote:

"If you are having random reboots on your radios, please drop me a note at sriram@cambiumnetworks.com. I will work with you. "

What? This has nothing to do with random reboots.

This is about SMs that stop passing traffic (you cannot ping/ssh/http into the radios, and the customer reports they have no internet) while still showing as connected to the AP. You have to boot the SM off the AP, and it starts working when it reconnects. At first we were rebooting the APs instead of just kicking the affected customer radios.

This started with 2.5 or 2.5.1. There are several threads about it here, and everyone affected has had to roll back the APs to 2.4.3 while leaving the SMs at 2.5 or 2.5.1.


I apologize for the confusion. There are indeed many threads on this and I have been working with a couple of other customers reporting the same issue you are seeing (stop passing traffic, cannot ping/ssh/http) but for them the SM eventually crashes and reboots. 

It appears that your issue is slightly different. It would be great to collect some logs from your SMs that are exhibiting this problem. Would remote access to the SM be possible? Are you seeing this on 2.6?

Thanks,
Sriram

"Are you seeing this on 2.6?"

Sigh... I don't know, that is what I was asking... I don't want to update to 2.6 unless this problem has been fixed.

"for them the SM eventually crashes and reboots. 

 It appears that your issue is slightly different."

While none of the others who posted in this thread with this exact same problem said anything about the SMs rebooting, there was one guy who said that if the SM was left long enough it would eventually disconnect on its own and start passing traffic when it reconnected. Possibly the SM was rebooting when he saw it disconnect on its own; I don't know. Mine do not reboot randomly.

I have an SM that had this problem (the SM is/was v2.5.1), and as of right now that SM shows an uptime of 88 days 22 hours, so it has not rebooted, randomly or otherwise, since October 11th. We also track reboots via SNMP and Cacti, which also show the last reboot of the radio was October 11th. This SM had this problem 3 times that I know of: once on Oct 23rd (I rebooted the AP, which was also running 2.5.1; I did not reboot the SM), again on Nov 8th (again I rebooted the AP and not the SM), and again on Nov 21st (I kicked the SM off the AP via the Monitor/Wireless/disconnect button). Later that night I reverted the AP to 2.3.4 (not the SMs, they remain on 2.5.1) and the problem never occurred again with this SM or any other SMs on that AP.

The sys logs on this SM go back to Nov 21st, the day it last had this problem:
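For anyone who wants to cross-check an uptime figure against a reboot date, the arithmetic is easy to script. Here is a minimal sketch in Python, assuming the uptime is read as the standard SNMP sysUpTime value (TimeTicks, i.e. hundredths of a second). The function name and the poll timestamp are made up for illustration; the thread does not give the exact date or year of the reading.

```python
from datetime import datetime, timedelta


def last_boot(polled_at: datetime, sysuptime_ticks: int) -> datetime:
    """Estimate the boot time from SNMP sysUpTime (TimeTicks = 1/100 s)."""
    return polled_at - timedelta(seconds=sysuptime_ticks / 100)


# An uptime of 88 days 22 hours, expressed in TimeTicks.
ticks = (88 * 24 + 22) * 3600 * 100

# Hypothetical poll time (assumption, not taken from the thread).
polled_at = datetime(2016, 1, 8, 12, 0)

print(last_boot(polled_at, ticks))
```

Keep in mind that sysUpTime is a 32-bit counter and wraps after roughly 497 days, so a very long uptime can look like a recent reboot if you only compare single readings.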

Nov 21 18:39:57 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.

Nov 21 18:40:12 tickfarm pppd[5481]: Unable to complete PPPoE Discovery

Nov 21 18:40:42 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.

Nov 21 18:40:59 tickfarm pppd[5481]: Unable to complete PPPoE Discovery 

Nov 21 18:41:29 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.

Nov 21 18:41:44 tickfarm pppd[5481]: Unable to complete PPPoE Discovery

Nov 21 18:42:14 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.

Nov 21 18:42:29 tickfarm pppd[5481]: Unable to complete PPPoE Discovery

Nov 21 18:42:59 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.

Nov 21 18:43:14 tickfarm pppd[5481]: Unable to complete PPPoE Discovery

Nov 21 18:43:44 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.

Nov 21 18:43:59 tickfarm pppd[5481]: Unable to complete PPPoE Discovery

Nov 21 18:44:29 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.

Nov 21 18:44:44 tickfarm pppd[5481]: Unable to complete PPPoE Discovery

Nov 21 18:45:15 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.

Nov 21 18:45:32 tickfarm pppd[5481]: Unable to complete PPPoE Discovery

Nov 21 18:46:02 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.

Nov 21 18:46:17 tickfarm pppd[5481]: Unable to complete PPPoE Discovery

Nov 21 18:46:47 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.

Nov 21 18:47:02 tickfarm pppd[5481]: Unable to complete PPPoE Discovery

Nov 21 18:47:32 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.

Nov 21 18:47:47 tickfarm pppd[5481]: Unable to complete PPPoE Discovery

Nov 21 18:48:17 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.

Nov 21 18:48:32 tickfarm pppd[5481]: Unable to complete PPPoE Discovery

Nov 21 18:49:02 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.  

Nov 21 18:49:18 tickfarm pppd[5481]: Unable to complete PPPoE Discovery

-------- (It was around this time the customer called to report his internet was not working)

Nov 21 18:49:48 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.

Nov 21 18:50:03 tickfarm pppd[5481]: Unable to complete PPPoE Discovery

Nov 21 18:50:33 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.

Nov 21 18:50:48 tickfarm pppd[5481]: Unable to complete PPPoE Discovery

Nov 21 18:51:18 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.

Nov 21 18:51:33 tickfarm pppd[5481]: Unable to complete PPPoE Discovery

Nov 21 18:52:03 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.

Nov 21 18:52:18 tickfarm pppd[5481]: Unable to complete PPPoE Discovery

------  (this is approximately the time I kicked his radio off the AP and it reconnected)

Nov 21 18:52:48 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.

Nov 23 05:17:06 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.

Nov 24 16:07:37 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.

NOTE: While we use PPPoE to authenticate, we use a static 10.10 IP for the management interface. Normally we can ping/access the radio regardless of whether PPPoE is working. In this case the radio stopped passing traffic, so we could not reach or ping the management interface and the radio could not authenticate via PPPoE.
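If you need to pull the failure window out of a longer log of this kind, something like the following may help. It is only a sketch in Python, assuming the syslog line layout shown in the log above (month, day, time, hostname, `pppd[pid]:`, message). The function name is hypothetical, and because these syslog lines carry no year, the `year` argument is a placeholder you would set yourself.

```python
import re
from datetime import datetime

# Matches lines like:
# "Nov 21 18:40:12 tickfarm pppd[5481]: Unable to complete PPPoE Discovery"
LINE = re.compile(r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+pppd\[\d+\]:\s+(.*)$")
FAILURE = "Unable to complete PPPoE Discovery"


def failure_window(lines, year=2000):
    """Return (first, last, count) of PPPoE discovery failures, or None.

    `year` is a placeholder: the log lines themselves do not include one.
    """
    stamps = []
    for line in lines:
        match = LINE.match(line.strip())
        if match and FAILURE in match.group(2):
            stamp = datetime.strptime(
                f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S"
            )
            stamps.append(stamp)
    if not stamps:
        return None
    return min(stamps), max(stamps), len(stamps)


sample = [
    "Nov 21 18:51:18 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.",
    "Nov 21 18:51:33 tickfarm pppd[5481]: Unable to complete PPPoE Discovery",
    "Nov 21 18:52:03 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.",
    "Nov 21 18:52:18 tickfarm pppd[5481]: Unable to complete PPPoE Discovery",
    "Nov 21 18:52:48 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.",
]
print(failure_window(sample))
```

Comparing the first and last failure timestamps against the time the customer called and the time the SM was kicked gives a quick view of how long the radio was stuck.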

I have the same problem with 2.6 on an ePMP Force 110 PTP link. The AP is running in ePTP Master mode. With a 40 MHz channel it is absolutely unusable; with a 20 MHz channel it works better but still crashes about once a week. In TDD PTP mode it works properly, but with higher latency and lower throughput. It would be great to implement a watchdog on the AP or SA side as a workaround, because manual reboot/reconnect is totally unacceptable if we are to meet the SLA.

Thanks
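Until something like that exists in firmware, the idea can be approximated from a monitoring host. This is a rough sketch in Python, not a Cambium feature: it counts consecutive failed probes and fires a callback once a threshold is reached. The class and function names are hypothetical, the `ping` flags assume a Linux-style `ping`, and the recovery action is deliberately left as a placeholder callback, since how you kick or reboot a radio depends on your own tooling.

```python
import subprocess
from typing import Callable


def ping_once(host: str, timeout_s: int = 2) -> bool:
    """Send one ICMP echo using the system ping (Linux-style flags assumed)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


class Watchdog:
    """Fire `on_stuck` each time `threshold` consecutive probes fail."""

    def __init__(self, probe: Callable[[], bool],
                 on_stuck: Callable[[], None], threshold: int = 3):
        self.probe = probe
        self.on_stuck = on_stuck
        self.threshold = threshold
        self.failures = 0

    def tick(self) -> bool:
        """Run one probe; return True if the recovery action was fired."""
        if self.probe():
            self.failures = 0  # any success resets the streak
            return False
        self.failures += 1
        if self.failures >= self.threshold:
            self.on_stuck()    # placeholder: alert, kick the SM, etc.
            self.failures = 0  # start counting again after firing
            return True
        return False
```

In practice you would build it with something like `Watchdog(lambda: ping_once("10.10.0.5"), notify_operator)` (both the address and `notify_operator` are made-up examples) and call `tick()` on a timer. Requiring several consecutive failures avoids reacting to a single lost ping.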


@slava wrote:

I have the same problem with 2.6 on an ePMP Force 110 PTP link. The AP is running in ePTP Master mode. With a 40 MHz channel it is absolutely unusable; with a 20 MHz channel it works better but still crashes about once a week. In TDD PTP mode it works properly, but with higher latency and lower throughput. It would be great to implement a watchdog on the AP or SA side as a workaround, because manual reboot/reconnect is totally unacceptable if we are to meet the SLA.

Thanks


Hi, 

Sorry for the troubles. If your radio is crashing in 2.6, please ssh into CLI and type "debug crashlogs" and send it to me via email at sriram@cambiumnetworks.com. 

Regarding 40 MHz channel being absolutely unusable compared to 20 MHz, are you saying the radio crashes more often with 40 MHz or were you generalizing?

Thanks,

Sriram

We are in a very cold weather climate and we are experiencing rebooting issues, only on access points running 2.5 and above. It is still happening on 2.6 fw for us as well.

Hi DSMW16,

Please contact me at daniel.sullivan@cambiumnetworks.com.


Daniel Sullivan

ePMP Software Manager

I can confirm about 4-5 cases with the same symptoms: operation mode ePMP Master/Slave, STA registered to the AP, but no access to the STA and no data passing. Rebooting the unit restores operation. What's next?