2.4.3 Randomly locking up?

So here's the situation. We've got a central tower with two ePMP 1000 5.8 masters back to back on the same channel. With GPS sync they work like a charm. 150ish meg out to both locations.

We received these radios this summer with 2.3.4 installed, upgraded them to 2.4.3, and put them in the air. They worked fine from July until mid-November. In mid-November someone hit a telephone pole somewhat near the tower and snapped it. All of the houses in the area suffered a massive power surge (lights got very bright and then popped) and then lost power, our tower included. All of our radios are behind battery backups and a power conditioner.

However, the battery backup that powers the backhauls kicked completely off instead of switching to battery power when this happened. I ran out and got the generator online. The battery backup never came back so we replaced it and all seemed well.

While I was waiting for the power to come back I checked all of our backhauls. All were well except the two ePMP links, whose quality and capacity were garbage. After more digging I found that their active firmware bank showed 2.3.4, their inactive bank showed 2.4.3, and they had lost the ability to GPS sync. I tried rebooting with no luck; they came back up still showing 2.3.4 in the active bank. I then re-uploaded the 2.4.3 firmware and they were finally back to normal. All was well for a few weeks.

Now, one of the slave sides (at another tower location miles away) is randomly losing wireless and Ethernet access for no apparent reason. The only thing that snaps it out of it is physically unplugging it from the PSU and plugging it back in, at which point it seems fine until it randomly goes MIA again (this has happened three times in three weeks). The cable has been tested and appears to be fine. We have also tried alternative power supplies to no avail.

I feel as though I'm grasping at straws and connecting far off dots but I'm at a bit of a loss.

Is 2.4.3 buggy? Has anyone else had issues like this?

Hi Zack,

I haven't heard of issues similar to yours with 2.4.3. I'm wondering if the powering equipment on the slave side is cutting out intermittently. When it happens, is the ePMP unit configuration ever reset?

Please email me at alex.marcham@cambiumnetworks.com and we can investigate this further.

Thanks,

Alex

Hi Zack,

When the firmware downgrades (e.g. 2.4.3 -> 2.3.4) without user intervention, it is because the active bank failed to complete the boot-up procedure eight consecutive times. After the eighth failure, the software in the backup bank is automatically switched to the active bank. The thought behind this is that something is corrupted in the active bank, and to avoid being stuck forever because boot-up always fails, we switch over to the backup bank.
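To make the fallback rule concrete, here is a rough Python sketch of the behavior described above. It is only an illustration, not the actual ePMP bootloader logic: the class and method names are hypothetical, and it assumes that a successful boot resets the failure counter (which "consecutive" implies but the post does not spell out).

```python
from dataclasses import dataclass

# Threshold taken from the description above: eight consecutive failed boots.
MAX_CONSECUTIVE_BOOT_FAILURES = 8


@dataclass
class DualBankDevice:
    """Hypothetical model of a radio with an active and a backup firmware bank."""
    active: str
    inactive: str
    consecutive_failures: int = 0

    def record_boot(self, succeeded: bool) -> None:
        """Record one boot attempt and swap banks after too many failures in a row."""
        if succeeded:
            # Assumption: any successful boot clears the failure streak.
            self.consecutive_failures = 0
            return
        self.consecutive_failures += 1
        if self.consecutive_failures >= MAX_CONSECUTIVE_BOOT_FAILURES:
            # The backup bank becomes active; the suspect image becomes inactive.
            self.active, self.inactive = self.inactive, self.active
            self.consecutive_failures = 0


# Example: flaky power interrupts eight boots in a row on a radio running 2.4.3.
radio = DualBankDevice(active="2.4.3", inactive="2.3.4")
for _ in range(8):
    radio.record_boot(succeeded=False)
print(radio.active, radio.inactive)  # prints "2.3.4 2.4.3"
```

Under these assumptions the end state matches what was reported after the outage: 2.3.4 in the active bank and 2.4.3 in the inactive bank.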

My hypothesis is either the power was flaky for a period of time causing continuous boot up failures or something became corrupted in the active bank (i.e. 2.4.3).

I do not know why you are still having problems with the one device, but I would swap it out myself with a new device to see if the problem remains (which would point to software) or goes away (which would point to hardware, with something in the power surge having hurt it). I have not heard of this type of software defect previously.


Daniel Sullivan

ePMP Software Manager

Thanks for the info Daniel. These radios are the only ones we didn't have a backup in stock for. One has been ordered and if it locks up again, it will definitely be swapped out.

However, the radio that is still acting up wasn't on the tower that surged, it's the other end of that link. I'm sure the power surge is unrelated, but I wasn't sure if one side defaulting to older firmware could be a clue to the problem. 

Luckily we have a guy very close to the tower at all times so when it locks up it's no big deal for him to go out and pull power from it. 


How far off does a stable version of 2.5.X look to be?