ePMP1000 dropping all subs

Hi there,

We have an issue with a particular tower where 2 of the 3 ePMP1k APs are periodically dropping all subscribers, and the APs need a reboot before the subscribers can reconnect.

This has been happening for about 2 weeks now and it seems to take anywhere from 7 to 48 hours for it to repeat.
This tower has a mix of Force 200 and ePMP1000 SMs that (apart from a couple radios) were all running 3.5.1 before this problem started.
We hadn't made any changes to firmware or equipment prior to this issue appearing.

Could this be environmental? 
This is 2.4GHz gear, so could there potentially be a bunch of interference causing these APs to drop all subs?  
eDetect doesn't really show anything but I'm not too familiar with troubleshooting ePMP gear.

Any ideas?


Update everything to 3.5.6... this is the best firmware for e1k/e2k. Then, assuming you're using GPS, set your hold-off timer to something like 600 seconds (10 min) or more and watch for GPS drops in the logs. We've had a lot of issues with GPS sync drops causing everyone to drop.

With 2.4GHz it's very likely that you'll need to use a 10MHz or 5MHz channel width. I don't think we've found any sites where 2.4 is clean enough to run a 20MHz channel. Do you collocate other ePMP 2.4 APs at the site? Make sure they're all using GPS sync and that their settings are correct, and that you're leaving at least a 5MHz guard band between adjacent APs.
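For anyone who wants to sanity-check the guard band on paper, here is a minimal sketch in Python. It assumes you know each AP's center frequency and channel width in MHz; the AP names and frequencies below are made up for illustration, and the function names are mine, not anything from ePMP or cnMaestro.

```python
from itertools import combinations

def edge_gap_mhz(center_a, width_a, center_b, width_b):
    """Gap in MHz between the nearest edges of two channels (negative means overlap)."""
    (lo_c, lo_w), (hi_c, hi_w) = sorted([(center_a, width_a), (center_b, width_b)])
    return (hi_c - hi_w / 2) - (lo_c + lo_w / 2)

def find_conflicts(aps, guard_mhz=5.0):
    """Return (ap_a, ap_b, gap) for every AP pair with less than guard_mhz between edges.

    aps maps an AP name to a (center_mhz, width_mhz) tuple.
    """
    conflicts = []
    for (name_a, (c_a, w_a)), (name_b, (c_b, w_b)) in combinations(aps.items(), 2):
        gap = edge_gap_mhz(c_a, w_a, c_b, w_b)
        if gap < guard_mhz:
            conflicts.append((name_a, name_b, gap))
    return conflicts

# Hypothetical tower: three 20MHz APs, with AP2 placed too close to AP1.
tower = {"AP1": (2412, 20), "AP2": (2432, 20), "AP3": (2462, 20)}
for a, b, gap in find_conflicts(tower):
    print(f"{a} / {b}: edge gap {gap:.1f} MHz is below the guard band")
```

This only checks the frequency plan on paper. It says nothing about sector orientation, front-to-back isolation, or anyone else's gear in the band.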


This is a known issue where we see sporadic rejects in 3.5.6. It was fixed in 4.3.2.1, but some issues still remain.

This will be fixed in the upcoming software release (4.4).

best regards

We have been seeing some unusual messages in syslog.
=================
<134>1 2019-08-02T07:08:26-08:00 109-70N DEVICE-AGENT 13211 - [meta sequenceId="54126"] get_code: buf INTEGER: 2

=================

<134>1 2019-08-02T07:08:22-08:00 109-70N DEVICE-AGENT 2627 - [meta sequenceId="54113"] event_rx_cb:MSG TYPE = 1

=================

<133>1 2019-08-02T07:08:18-08:00 109-70N DEVICE-AGENT 2627 - [meta sequenceId="54104"] Trap data received: name STA_REJECT timestamp 1564758498 mac 00:04:56:FD:4A:BF status 0 msg [COMMUNICATION LOST]

=================
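If you are collecting these messages on a syslog server, a short script can pull out the STA_REJECT traps and show whether many SMs were rejected in the same minute, which is what an "everyone dropped" event should look like. This is a rough sketch in Python: the regex is written against the single trap line shown above, so treat the field layout as an assumption, and the function names and the 60-second window are my own choices.

```python
import re
from collections import Counter

# Pattern modelled on the one "Trap data received" line quoted above (assumed layout).
TRAP_RE = re.compile(
    r"Trap data received: name (?P<name>\S+) timestamp (?P<ts>\d+) "
    r"mac (?P<mac>[0-9A-Fa-f:]{17}) status (?P<status>\d+) msg \[(?P<msg>[^\]]*)\]"
)

def parse_traps(lines):
    """Yield one dict per syslog line that contains a trap message."""
    for line in lines:
        m = TRAP_RE.search(line)
        if m:
            yield {
                "name": m.group("name"),
                "ts": int(m.group("ts")),  # Unix epoch seconds as logged
                "mac": m.group("mac").upper(),
                "status": int(m.group("status")),
                "msg": m.group("msg"),
            }

def rejects_per_window(traps, window_s=60):
    """Count STA_REJECT traps per time window, keyed by the window's start time."""
    counts = Counter()
    for trap in traps:
        if trap["name"] == "STA_REJECT":
            counts[trap["ts"] - trap["ts"] % window_s] += 1
    return counts

sample = [
    '<133>1 2019-08-02T07:08:18-08:00 109-70N DEVICE-AGENT 2627 - '
    '[meta sequenceId="54104"] Trap data received: name STA_REJECT '
    'timestamp 1564758498 mac 00:04:56:FD:4A:BF status 0 msg [COMMUNICATION LOST]',
]
for window_start, count in sorted(rejects_per_window(parse_traps(sample)).items()):
    print(window_start, count)
```

A window with roughly as many rejects as there are SMs on the AP would point at an AP-side event (GPS sync loss, for example) rather than individual SMs struggling with interference.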


We skipped past 3.5.6 straight to 4.3.2.1 before checking what was recommended, so we're now trying 3.5.6, but the issue I keep running into is cnMaestro saying "The software image is not available to download!" and skipping every SM in the job. This is after reading another post where I think you recommended downgrading the SMs first and THEN the AP. Any ideas what's causing that? I'll update each SM individually if I have to, but I'd like to avoid it.

Our channel separation is ok on paper but we are running 20MHz channels and this is still 2.4GHz.
I don't know that we'll be able to do without 20MHz as we've been operating at that channel width since day dot.


We had some variance in the GPS hold-off settings between the 3 APs but they were all in excess of 600s.
Changed them all to have the same 600s hold-off to see if that helps.


Ok, as of now all APs and SMs on this tower are running 3.5.6, have a 600s GPS hold-off, and are running 20MHz channels with appropriate separation.

One other thing I noticed after this issue began occurring is that we'll briefly lose connection with the AP while in the interface.  I'm talking about 5-10s and then it comes good again.  Also 7/10 times after I log into an AP I need to refresh the page before I can even see the menu.
Is this indicative of anything?

We'll have to wait and see if this firmware stabilises things for us but given that this issue popped up without us making any changes, I'm not hopeful.

If this is due to intermittent interference, do you have any suggestions on how to catch it?  Can't reasonably sit here running the eDetect scan on 2 APs for several hours.


@jakehoff wrote:


If this is due to intermittent interference, do you have any suggestions on how to catch it?  Can't reasonably sit here running the eDetect scan on 2 APs for several hours.


eDetect isn't super useful. It only shows interferers on the same channel and channel width... e.g. if you're using a 20MHz channel, it's not going to show interferers using 10MHz or 5MHz channels. Additionally, if someone is using a 20MHz channel slightly overlapping or just outside of your channel, it won't show this either. It's only useful for seeing 20MHz interferers directly on top of your channel.
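To make that blind spot concrete, here is a small Python sketch. It models eDetect exactly as described in this post (an interferer only shows up when its center frequency and width match yours), which is this thread's description and not documented behaviour. The frequencies and function names are made up for illustration.

```python
def overlaps(own, other):
    """True if two (center_mhz, width_mhz) channels share any spectrum."""
    return abs(own[0] - other[0]) < (own[1] + other[1]) / 2

def edetect_would_show(own, other):
    """Assumed model from the post above: same center and same width only."""
    return own == other

def blind_spots(own, interferers):
    """Interferers that overlap our channel but would not be shown under this model."""
    return [i for i in interferers if overlaps(own, i) and not edetect_would_show(own, i)]

own_channel = (2437, 20)
nearby = [
    (2437, 20),  # directly on top of us: shown
    (2437, 10),  # narrower channel inside ours: overlaps, not shown
    (2447, 20),  # partially overlapping 20MHz channel: overlaps, not shown
    (2462, 20),  # edges 5MHz apart: no overlap
]
print(blind_spots(own_channel, nearby))
```

Under that model, two of the three overlapping interferers in the example would never appear in an eDetect scan.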

When trying to avoid interference or find a good channel for an AP, we've found it most useful to temporarily enable ACS at the longest scan interval. Once the scan is done, take a screenshot of the results, turn ACS off, and manually set the frequency to whatever looks clean on the ACS scan.

Lastly, on the AP you can set your "Management Traffic Rate" to MCS 0, which should deal with interference a bit better at the expense of a little throughput.


Oh wow, I didn't know eDetect worked so specifically...

Looks like we're already running Management Traffic MCS at 0 so we'll try that ACS method if the issue persists which I suspect it will.

Thanks for your help so far
