From my first ePMP3000 installation at the beginning of 2019 until today, every ePMP3000 and 3000L I have run has eventually crashed or rebooted itself. I have never seen a 3000 with an uptime of over 100 days. Have you guys?
I have seen a 3000 in the 90s but never triple digits. Don’t get me wrong, I love these two radios very much, but recently one 3000L has been rebooting itself weekly, and two of my 3000s both crashed or rebooted themselves a few weeks ago, prompting me to look into this more.
The switches are not at fault; the switch logs just show the eth ports going down and then coming back up again.
I do run most of my radios on 80 MHz channels as there’s no noise where my WISP is, and performance is amazing. I want this to stop happening though, as I’m outside warranty now. I had assumed firmware would eventually fix it: back in 2019 self reboots and crashes were very common, so as they became less frequent with firmware updates, I assumed they would go away altogether, but they have not.
Can we see some screenshots of status pages with an uptime of more than 100 days on an ePMP3000 running an 80 MHz channel? Even a 40 MHz channel would be good to see.
This is a tough one, as I have worked with many operators having great success with ePMP. On the flip side, there are those few that seem to be plagued with issues, bad luck, cursed if you will (we are getting close to Halloween). Sometimes it seems like it’s a perfect storm of faulty equipment, challenging environmental conditions, power or grounding issues, and network and configuration issues that all result in a seemingly never-ending poor experience. I’ve been using Cambium equipment for a very long time now and on the whole it’s been very reliable, but I’m pretty strict when it comes to how it’s deployed and configured. My core philosophy is to have the AP do as little as possible outside of pushing RF packets around.
All that being said, I can’t give you a great report as to AP uptime, because a large part of my success has been keeping extremely up to date with firmware. I checked and the highest uptime for an e3k AP is 68 days, but that is only because we applied 4.7.1-RC13 recently. We have some e3kL APs, but again, we just recently hung them and then applied 4.7.1-RC13, so those are at 50+ days. We do have some F300 APs or PtPs running 4.7.0.1 that have been up for 150+ days. While our uptimes are not high, I can say with confidence that we rarely if ever have any random reboot issues across 50 e1k and e3k APs and nearly 500 clients.
Back to my philosophy of having the AP do as little as possible beyond RF:
All the APs use private IPs and are fully firewalled from the internet. Each site is L3 routed. We do not use VLANs on the APs or SMs. It’s extremely important to minimize or filter any strange traffic going over your BHs and/or into your APs.
We do not use PPPoE
We do not use IPv6
We use our own in house NTP and caching DNS servers
We enable and use SNMP for monitoring
We use cnMaestro cloud
We do use QoS and MIR on the APs, but we use Cambium QoE for MIR control
All other services are turned off or disabled on the AP
We use Cambium sync via ethernet wherever possible
We use sync and at most sites we use a TDD fixed ratio of 75/25
At many sites we use super high end pre-terminated, certified cat 6 cables and ends
We use gig-e capable Transtector inline ethernet surge protectors
All SMs are typically using the most current stable firmware revision, in this case, 4.7.0.1
The goal is to create an environment with as little spurious traffic, electrical or sync issues as possible. Making things as clean and simple as possible is my key to success.
Wow, I figured it wasn’t just me. I have a 3000L on over 100 days running an 80 MHz channel, but all my 3000s crash eventually on 80 MHz.
Does anyone run 80 MHz on 3k other than me? I know I’m lucky not to have noise, but to be honest, the one AP that does face noise runs mint on 80 MHz; you just have to tune each SM’s max MCS in both directions to keep retransmits at zero.
Anyone with a 3k running for more than 100 days on 80 MHz?
16 subs and 4.6.2. It’s been updated to 4.7.1-RC13 now though. I have a few radios running 4.7.1-RC13 with 35~40 subs that have been up 70+ days at this point without any issue.
Generally been OK for us. Firmware is 4.6.1 RC27. I believe in updating only if there are clear benefits or security issues AND if others are happy with stability. Our customers will not put up with outages; every time we have one on an AP, we lose at least 1 customer. Not that we are unreliable, but even a couple of hours once a year is perceived badly. Never mind that if they have fibre (fiber), that often takes days to repair!
That’s good to know. I’m testing 4.7.1-RC13 on a few ePTP links and they have not dropped yet. I’ve not tried this firmware on any 3000s as the GUI is all messed up for me and won’t load at all on Chrome. Have you had any GUI issues on 4.7.1-RC13? Some menus don’t show at all, like the max MCS setting for uplink on the SM.
I feel like the ePMP firmware and the F400AX firmware are about to become properly stable at 80 MHz.
This got me checking mine. My longest 3K uptime is 26 days. The uptimes on my 2K and 1K APs all date from when we last downgraded firmware (to 4.6.1) over 200 days ago.
So, yes, apparently our 6 3K APs all reboot very frequently and my monitoring software is not catching all of those since they seem to be pretty quick.
We do not have the AP doing anything. It is a transparent bridge. Our routers handle all shaping and public IPs are handed off to customers via a data VLAN.
Every AP and SM in our system is at 4.6.1 firmware and will stay there until we replace with 6 GHz gear.
… and this is what I’m talking about. I’m also running as a transparent bridge, with no shaping done at the radio, and using the data VLAN feature. I wonder if it’s the data VLAN feature that leads to the crashes on the 3K. I have a 3KL with 122 days uptime, but it’s the 3K that’s the most crashy.
I’ve not tried 4.7.0.1 yet, but 4.7.1-RC13 is looking awesome for ePTP links. The GUI is messed up, though the command line always works. I’d choose stability over GUI any day.
I also don’t see any alerts, as the reboot is quick enough not to trigger cnMaestro emails.
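For anyone else whose monitoring misses these quick reboots: a reboot can be inferred from uptime alone, even if the radio never misses a ping. Here is a minimal Python sketch, assuming the uptime is polled as the standard SNMP sysUpTime value (TimeTicks, hundredths of a second, 32-bit, so it wraps after roughly 497 days). The names Sample, rebooted_between and uptime_days and the 30-second slack are made up for illustration, and actually fetching the value from the radio is left out.

```python
from dataclasses import dataclass

TICKS_PER_SECOND = 100   # SNMP TimeTicks are hundredths of a second
WRAP = 2 ** 32           # TimeTicks is a 32-bit counter (wraps after ~497 days)


@dataclass
class Sample:
    polled_at: float     # wall-clock time of the poll, seconds since epoch
    uptime_ticks: int    # sysUpTime value read from the radio (assumed source)


def uptime_days(ticks: int) -> float:
    """Convert TimeTicks to days."""
    return ticks / TICKS_PER_SECOND / 86400


def rebooted_between(prev: Sample, curr: Sample, slack_s: float = 30.0) -> bool:
    """Return True if the uptime counter fell behind wall-clock time.

    slack_s absorbs poll jitter; it is an illustrative default, not a
    recommended value.
    """
    slack = slack_s * TICKS_PER_SECOND
    elapsed = (curr.polled_at - prev.polled_at) * TICKS_PER_SECOND
    # How far the reported uptime is behind where it should be by now.
    shortfall = prev.uptime_ticks + elapsed - curr.uptime_ticks
    # A shortfall of almost exactly 2**32 is a normal counter wrap, not a reboot.
    # (A reboot at exactly that moment would be missed; accepted limitation.)
    wrapped = abs(shortfall - WRAP) <= slack
    return shortfall > slack and not wrapped


# Polls 5 minutes apart: uptime dropped from ~2.8 hours to 20 seconds.
before = Sample(polled_at=0.0, uptime_ticks=1_000_000)
after = Sample(polled_at=300.0, uptime_ticks=2_000)
print(rebooted_between(before, after))
```

Because this compares uptime against elapsed wall-clock time rather than waiting for the radio to go unreachable, it flags a reboot even when the radio is back up well before the next poll.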
The AP shouldn’t care about data vlans. It doesn’t even know about them, that is a CPE issue. Our APs have nothing turned on other than Option 82. Everything else is basically default.
We tried 4.7 and it was a giant disaster for us. We lost a dozen customers as a result of the issues in 4.7.0, had APs lock up, SMs factory reset - dozens of truck rolls all due to firmware. We decided to roll back and stay there. Fool me once… I don’t see enough improvements to even think about trying that again. Customer complaints stopped the moment we downgraded.
Ouch man, that is rough. I’m sorry to hear you went through that. I remember being stressed out of my mind in 2019 due to early Cambium ac firmware. I was terrified of getting a poor reputation due to link drops. Luckily I got through it by scheduling reboots every morning at 4am. The drops usually happened after a few days, so by doing this I beat them to it and managed to keep my reputation. I have lost one customer in 4 years of operation, and they went to Starlink as they got it for their campervan.
I still run twice weekly reboots of my [last standing] ptp550 and the new F400AX gear for this very reason. Choosing to reboot at 4am is far superior to having a crash at 6pm. I have to say though that 5.6 looks to have sorted the AX gear.
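The scheduling half of this is easy to sketch without touching the radios. The snippet below is only an illustration in Python: next_reboot is a hypothetical helper that works out the next 4am slot on the chosen weekdays, and however you actually trigger the reboot is deliberately left out.

```python
from datetime import datetime, timedelta


def next_reboot(now: datetime, weekdays: set[int], hour: int = 4) -> datetime:
    """Return the first slot strictly after `now` that falls on one of
    `weekdays` (Monday=0 ... Sunday=6) at `hour`:00 local time."""
    if not weekdays or not all(0 <= d <= 6 for d in weekdays):
        raise ValueError("weekdays must be a non-empty set of values 0..6")
    candidate = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    # Today plus the next seven days always contains every weekday at least once.
    for _ in range(8):
        if candidate > now and candidate.weekday() in weekdays:
            return candidate
        candidate += timedelta(days=1)
    raise AssertionError("unreachable: no matching weekday found")


# Twice weekly (Monday and Thursday) versus every morning.
now = datetime(2024, 1, 1, 5, 0)             # a Monday, just after the 4am slot
print(next_reboot(now, {0, 3}))              # next Monday/Thursday slot
print(next_reboot(now, set(range(7))))       # next daily slot
```

Keeping the slot calculation separate from the reboot action makes it easy to stagger sites, for example by giving each tower a different weekday set.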
I only had problems across all ePMP with 4.6.1 software. Have been running 4.6.2 for over a year now without issue. I’m currently trying 4.7.0.1 for new installs to get a feel for its stability before going across the board with the update.
Thanks for the screenshot, Sirgin. I have some 3000Ls with good uptimes too. It’s the 3000 that seems to crash/reboot a lot. Do you have any of those with uptimes over 100 days?
In my opinion, I would skip this and go directly to 4.7.1 - RC18. I know I normally wouldn’t put an RC in production, but 4.7.0.1 has a number of bugs that 4.7.1 resolved. So I would suggest either stay with 4.6.2, or run 4.7.1 IMHO. (of course YMMV)