Anyone else having to reboot APs since the 2.5.1 upgrade?

I had problems after upgrading to 2.5.1: some SMs will no longer connect to the AP. The AP is set to 5 ms frames.

Downgrading to 2.5 fixes the issue. Some of the SMs that won't connect are on 2.3.1 or..

Is there any incompatibility or regression?

Regards

Antonio

>I wonder if maybe you're using 2.5 ms frames

We haven't messed with the frames or any other RF settings on the radio beyond the frequency, channel width (20 MHz), SSID and WPA2 key.

We have had this problem as well. If I downgrade to 2.3.4 the issue goes away. The problem is that I want to run the newer firmware because I get better speed results with it, but I can't have customers randomly going down. They do come back after a while, and rebooting the radio or deregistering the client from the AP will bring them back as well, but that's still unacceptable. The simple fact that it doesn't happen on 2.3.4 but does randomly appear on the firmware versions since tells me it has to be an issue in the firmware. I haven't been able to find any resolution to this problem, so if anyone has any ideas please post an update. Thanks

I'm seeing an ePMP 1000 backhaul in TDD PTP mode stop passing traffic once a day since going from 2.5 to 2.5.1, even though the Slave/SM shows up as connected. Rebooting the Master/AP brings everything back. I just changed to 5 ms frames in hopes it will help. I have 3 other links on that tower that are doing fine.

> If I downgrade to 2.3.4 the issue goes away.

Are you downgrading just the AP or the customer radios also? We were running 2.4.3 on many of our APs (and all but two renegade customer radios) and I didn't have this problem, but it is becoming a daily thing now, so I'm going to have to either roll back or try the newest RC.

Just the APs; it doesn't seem to matter whether the client radios are on a higher firmware or not. With every new firmware version I upgrade my APs in hopes that the problem goes away, but it reappears, and when I roll back to 2.4.2 it stops happening for us.

I've had some problems on the latest firmware with PPPoE. I lose connection to the RADIUS server and it doesn't come back without a reboot. Rolled back to an earlier firmware and it seems better.

We also noticed this problem on version 2.5. But our APs were working fine for 2 months!

We have downgraded all APs to 2.4.3 but the problem is still here.

We have had to downgrade most of our APs to 2.4.2.

Problem persisted on some sites on 2.4.3.

We do have one sector still on 2.5.1 that doesn't seem to have this problem - every other one we've had to downgrade.

What FW are your CPEs running?

A mix of 2.5 and 2.5.1 with a few stragglers on 2.4.3.

Issue only seems to affect the APs.

I used 2.4.3 for a long time and there was no such problem. Then I upgraded the APs to 2.5 and it also worked fine for a certain period of time. But suddenly, without any changes being made, this problem appeared....

In 2.5.2 this problem still exists

We downgraded all our APs to 2.3.4 FW and it is working stably, but we lost throughput.

So many disappointments...

Anyone know if this is fixed in 2.6? It appears Cambium has acknowledged in the forums that this is a known issue, but I don't see it listed under the "known problems or limitations" for any of the 2.5 - 2.6 versions.



All, 

If you are having random reboots on your radios, please drop me a note at sriram@cambiumnetworks.com. I will work with you. 

In general, there are several fixes in Release 2.6 for stability issues seen in Release 2.5. If you are still seeing random reboots in Release 2.6, please dump the crashlog at the radio's cli by typing "debug crashlog" and email it to me. 

Thanks,

Sriram

"If you are having random reboots on your radios, please drop me a note at sriram@cambiumnetworks.com. I will work with you. "

What? This has nothing to do with random reboots.

This is about SMs that stop passing traffic (cannot ping / ssh / http into the radios, and the customer reports they have no internet) while showing they are connected to the AP. You have to boot the SM off the AP, and it starts working when it reconnects. At first we were rebooting the APs instead of just kicking the affected customer radios.

This started with 2.5 or 2.5.1. There are several threads about it here; everyone affected has had to roll back the APs to 2.4.3 while leaving the SMs at 2.5 or 2.5.1.



I apologize for the confusion. There are indeed many threads on this and I have been working with a couple of other customers reporting the same issue you are seeing (stop passing traffic, cannot ping/ssh/http) but for them the SM eventually crashes and reboots. 

It appears that your issue is slightly different. It would be great to collect some logs from your SMs that are exhibiting this problem. Would remote access to the SM be possible? Are you seeing this on 2.6?

Thanks,
Sriram

"Are you seeing this on 2.6?"

Sigh... I don't know, that is what I was asking... I don't want to update to 2.6 unless this problem has been fixed.

"for them the SM eventually crashes and reboots. 

 It appears that your issue is slightly different."

While none of the others who posted in this thread with this exact same problem said anything about the SMs rebooting, there was one guy who said that if the SM was left long enough it would eventually disconnect on its own and start passing traffic when it reconnected. Possibly the SM was rebooting when he saw it disconnect on its own; I don't know. Mine do not reboot randomly.

I have an SM that had this problem (the SM is/was v2.5.1), and as of right now that SM shows an uptime of 88 days 22 hours, so it has not rebooted, randomly or otherwise, since October 11th. We also track reboots via SNMP and Cacti, and that also shows the last reboot of the radio was October 11th. This SM had this problem 3 times that I know of: once on Oct 23rd (I rebooted the AP, which was also running 2.5.1; I did not reboot the SM), again on Nov 8th (again I rebooted the AP and not the SM), and again on Nov 21st (I kicked the SM off the AP via the Monitor/Wireless/disconnect button). Later that night I reverted the AP to 2.3.4 (not the SMs, they remain on 2.5.1) and the problem never occurred again with this SM or any other SMs on that AP.
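For anyone who wants to sanity-check the "no reboot in between" reasoning on their own radios, here is a rough Python sketch of the arithmetic. The year and the exact "now" timestamp are assumptions (the post gives neither); they are picked only so that an uptime of 88 days 22 hours lands on October 11th. The function names are made up for illustration.

```python
from datetime import datetime, timedelta

# ASSUMPTION: the post gives no year and no exact "now"; these values are
# illustrative and chosen so the reported uptime lands on October 11th.
NOW = datetime(2016, 1, 8, 12, 0)
UPTIME = timedelta(days=88, hours=22)  # uptime reported by the SM

# Dates from the post (year assumed) when the SM stopped passing traffic.
INCIDENTS = [
    datetime(2015, 10, 23),
    datetime(2015, 11, 8),
    datetime(2015, 11, 21),
]


def last_boot(now, uptime):
    """Derive the last boot time from the current time and the uptime."""
    return now - uptime


def incidents_since_boot(boot, incidents):
    """Return the incidents that happened after the last boot."""
    return [d for d in incidents if d > boot]


boot = last_boot(NOW, UPTIME)
covered = incidents_since_boot(boot, INCIDENTS)
print(f"last boot: {boot:%b %d %H:%M}")
print(f"{len(covered)} of {len(INCIDENTS)} incidents fall inside one uptime span")
```

If every incident falls after the derived boot time, the SM cannot have rebooted between them, which is the point being made about this issue differing from the random-reboot reports.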

The sys logs on this SM go back to Nov 21st, the day of the last time it had this problem:

Nov 21 18:39:57 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.

Nov 21 18:40:12 tickfarm pppd[5481]: Unable to complete PPPoE Discovery

Nov 21 18:40:42 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.

Nov 21 18:40:59 tickfarm pppd[5481]: Unable to complete PPPoE Discovery 

Nov 21 18:41:29 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.

Nov 21 18:41:44 tickfarm pppd[5481]: Unable to complete PPPoE Discovery

Nov 21 18:42:14 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.

Nov 21 18:42:29 tickfarm pppd[5481]: Unable to complete PPPoE Discovery

Nov 21 18:42:59 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.

Nov 21 18:43:14 tickfarm pppd[5481]: Unable to complete PPPoE Discovery

Nov 21 18:43:44 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.

Nov 21 18:43:59 tickfarm pppd[5481]: Unable to complete PPPoE Discovery

Nov 21 18:44:29 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.

Nov 21 18:44:44 tickfarm pppd[5481]: Unable to complete PPPoE Discovery

Nov 21 18:45:15 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.

Nov 21 18:45:32 tickfarm pppd[5481]: Unable to complete PPPoE Discovery

Nov 21 18:46:02 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.

Nov 21 18:46:17 tickfarm pppd[5481]: Unable to complete PPPoE Discovery

Nov 21 18:46:47 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.

Nov 21 18:47:02 tickfarm pppd[5481]: Unable to complete PPPoE Discovery

Nov 21 18:47:32 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.

Nov 21 18:47:47 tickfarm pppd[5481]: Unable to complete PPPoE Discovery

Nov 21 18:48:17 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.

Nov 21 18:48:32 tickfarm pppd[5481]: Unable to complete PPPoE Discovery

Nov 21 18:49:02 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.  

Nov 21 18:49:18 tickfarm pppd[5481]: Unable to complete PPPoE Discovery

-------- (It was around this time the customer called to report his internet was not working)

Nov 21 18:49:48 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.

Nov 21 18:50:03 tickfarm pppd[5481]: Unable to complete PPPoE Discovery

Nov 21 18:50:33 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.

Nov 21 18:50:48 tickfarm pppd[5481]: Unable to complete PPPoE Discovery

Nov 21 18:51:18 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.

Nov 21 18:51:33 tickfarm pppd[5481]: Unable to complete PPPoE Discovery

Nov 21 18:52:03 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.

Nov 21 18:52:18 tickfarm pppd[5481]: Unable to complete PPPoE Discovery

------  (this is approximately the time I kicked his radio off the AP and it reconnected)

Nov 21 18:52:48 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.

Nov 23 05:17:06 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.

Nov 24 16:07:37 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.

NOTE: While we use PPPoE to authenticate, we use a static 10.10 IP for the management interface. We can normally ping/access the radio regardless of whether PPPoE is working. In this case the radio stopped passing traffic, so we could not reach the management interface or ping it, and the radio could not auth via PPPoE.
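If you want to pull the outage window out of a log like the one above without reading it line by line, something along these lines might help. It is only a sketch: it assumes the syslog lines look exactly like the ones pasted here, it takes the year as a parameter because the log does not include one, and `failure_window` is a name I made up. The sample is a handful of lines copied from the log above, not the whole thing.

```python
import re
from datetime import datetime

# Matches lines such as:
#   Nov 21 18:40:12 tickfarm pppd[5481]: Unable to complete PPPoE Discovery
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ pppd\[\d+\]: (?P<msg>.*)$"
)
FAILURE = "Unable to complete PPPoE Discovery"


def failure_window(lines, year):
    """Return (first, last, count) of PPPoE Discovery failures, or None.

    `year` must be supplied because this syslog format carries no year.
    """
    failures = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and m.group("msg").strip() == FAILURE:
            ts = datetime.strptime(f"{year} {m.group('ts')}", "%Y %b %d %H:%M:%S")
            failures.append(ts)
    if not failures:
        return None
    return min(failures), max(failures), len(failures)


# A few lines copied from the log above (not the full log).
SAMPLE = """\
Nov 21 18:39:57 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.
Nov 21 18:40:12 tickfarm pppd[5481]: Unable to complete PPPoE Discovery
Nov 21 18:52:03 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.
Nov 21 18:52:18 tickfarm pppd[5481]: Unable to complete PPPoE Discovery
Nov 21 18:52:48 tickfarm pppd[5481]: Interface ath0 has MTU of 1492.
""".splitlines()

window = failure_window(SAMPLE, year=2015)  # year is an assumption
if window:
    first, last, count = window
    print(f"{count} failures between {first:%H:%M:%S} and {last:%H:%M:%S}")
```

Run against the full log, the first and last failure timestamps give the span during which the SM could not complete PPPoE Discovery, which can then be compared with the time the customer called and the time the SM was kicked.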

Have the same problem also with 2.6 on ePMP Force 110 PTP. The AP is running in ePTP Master mode. With a 40 MHz channel it is absolutely unusable; with a 20 MHz channel it works better but still crashes about once a week. In TDD PTP it works properly, but with higher latency and lower throughput. It would be great to implement a watchdog on the AP or SA side as a workaround, because a manual reboot/reconnect is totally unacceptable if we are to meet the SLA.

Thanks
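Until something like that exists in the firmware, a watchdog can be approximated from a monitoring host. The sketch below is not a Cambium feature and does not use any ePMP API: it just counts consecutive failed probes and calls a recovery function you supply. `LinkWatchdog`, `ping_once` and the `recover` callback are hypothetical names, and the `ping` flags assume Linux iputils. How you actually kick the SM or reboot the AP (GUI, CLI, SNMP) is left to the callback.

```python
import subprocess
from typing import Callable


def ping_once(host: str, timeout_s: int = 2) -> bool:
    """Return True if one ICMP echo succeeds (Linux iputils flags assumed)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


class LinkWatchdog:
    """Trigger `recover` after `max_failures` consecutive failed probes."""

    def __init__(self, probe: Callable[[], bool], recover: Callable[[], None],
                 max_failures: int = 3) -> None:
        self.probe = probe
        self.recover = recover          # hypothetical: e.g. kick the SM off the AP
        self.max_failures = max_failures
        self.failures = 0

    def tick(self) -> bool:
        """Run one probe; return True if recovery was triggered."""
        if self.probe():
            self.failures = 0           # link is fine, reset the counter
            return False
        self.failures += 1
        if self.failures >= self.max_failures:
            self.recover()
            self.failures = 0           # start counting again after recovery
            return True
        return False
```

A typical use would be to build it with `probe=lambda: ping_once("<SM management IP>")` and call `tick()` from a loop or cron job every 30 seconds or so. Requiring several consecutive failures avoids kicking a radio over a single lost ping.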