PPPoE stopped working after moving from 3.2.2 to 3.4.1

Almost all of our customer radios are configured to do PPPoE. Nothing complicated: Network Mode NAT, wireless IP assignment DHCP, and PPPoE enabled. We only have a single PPPoE server, so Service Name and Access Concentrator are left blank, with ALL for Auth, a username and password, default MTU and keepalive, and MSS Clamping disabled.

Somewhere between 3.2.2 and 3.4.1 something happened to PPPoE. We only recently started upgrading a couple of PoPs to 3.4.1. After an outage at one PoP a while back (the backhaul failed, so the radios could not reach the PPPoE server), we had one customer (out of 10 on that PoP) that didn't have Internet after we got the site back online. The radio was not getting a PPPoE connection. Monitor>Network>Network Status showed PPPoE Mode as "Connecting", but when we looked at the PPPoE concentrator we could not find any sign that the radio was actually doing anything. We rebooted the radio; that didn't fix it. We had the customer power cycle the radio; that didn't fix it either.

At this point I was thinking it must be a problem with the DB, RADIUS, or the MikroTik, because what else could it be? But we couldn't see anything on any of those to indicate a problem, or that the radio was even trying to authenticate. So I changed the username and password on the radio to a temp account used for installs and PRESTO, the radio connected. I figured, well, obviously it has to be a problem on our end with that account. Just to test, I changed the username and password back to what they were supposed to be and PRESTO, the radio connected again. So now I'm back to thinking it has to be the radio, because we did nothing on the auth/DB end; we just changed the username/password on the radio and then changed it back.

Yesterday morning I had to bring down one leg of our wireless network to replace some equipment at our main tower. The outage affected about 500 customers for about an hour. Since then we have had 11 ePMP customers whose PPPoE appeared to stop working. The only way to get their radios' PPPoE to connect was to change the username and password, save it, then change it back and save it again.
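
For anyone else chasing this, a rough sketch of how the stuck radios could be spotted after an outage: compare the accounts that should be online against the sessions the concentrator actually reports. The function name, the account names, and both input lists are made up for illustration; how the active-session list gets exported depends on the concentrator and is not shown here.

```python
def find_missing_sessions(expected_usernames, active_usernames):
    """Return expected PPPoE accounts that have no active session, sorted."""
    # Strip stray whitespace so export formatting does not cause false alarms.
    active = {name.strip() for name in active_usernames}
    missing = {
        name.strip()
        for name in expected_usernames
        if name.strip() not in active
    }
    return sorted(missing)


if __name__ == "__main__":
    # Hypothetical data: accounts provisioned on one PoP vs. sessions seen after the outage.
    expected = ["cust-001", "cust-002", "cust-003"]
    active = ["cust-001", "cust-003 "]
    print(find_missing_sessions(expected, active))
```

Anything this returns is only a candidate; a customer who simply has the radio unplugged will show up in the list as well.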

We have a similar problem with the PMP100 radios and the 900 MHz 450i radios we recently started replacing the PMP100 with. The PMP100 and 450i will just stop trying to do a PPPoE connection after they can't reach the PPPoE server for a while. However, unlike the ePMP radios, just rebooting them fixes the problem.

I can't figure out what could be happening on the ePMP radios that a cold reboot doesn't fix but changing and saving the username and password does.

Hi,

We are going to double-check the scenarios you described.

I will get back to you when the tests are done.

Thank you.

Just an update on this. We are still seeing this a couple of times a week, and it doesn't appear to require that the radios lose connection to the PPPoE concentrator, as we first believed. Losing connection to the PPPoE concentrator may trigger it, but it's not the only cause: the radios we are seeing do this now are random radios that didn't lose connection to anything. They just suddenly seem to have stopped attempting to authenticate.

We are still running 3.4.1 as 3.5 breaks the GPS location fields in SMs and appears to not be completely compatible with 2000 series APs.  

This problem seems to be completely random, and we are seeing it more now that we have upgraded the entire network to 3.4.1. While it has been completely random, we did have one customer affected by it twice now. Again I had the customer power cycle his radio, and when it came back up it still did not appear to be trying to do PPPoE auth (though Monitor>Network showed that it was). I was in a hurry at the time (had other things to do) and the customer was also frustrated, so I didn't do any testing. What I did was:

(1) Radio is not authenticating.

(2) Have customer power cycle radio - doesn't fix it.

(3) Go to the PPPoE settings, add a 1 to the end of the username and hit SAVE, then remove the 1 from the end of the username and hit SAVE again. The radio authenticated and the customer was online.
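
Step (3) boils down to saving a dummy username and then saving the real one again. A minimal sketch of that sequence, assuming a hypothetical `save_username` callable that stands in for whatever actually writes and saves the PPPoE username on the radio (web UI, script, etc.); nothing here is a real ePMP API.

```python
def toggle_pppoe_username(username, save_username, suffix="1"):
    """Save a temporarily altered username, then restore and save the original.

    save_username is a hypothetical callable supplied by the caller; it is
    expected to apply and save the given PPPoE username on the radio.
    Returns the list of usernames that were saved, in order.
    """
    saved = []
    for value in (username + suffix, username):
        save_username(value)
        saved.append(value)
    return saved


if __name__ == "__main__":
    # Stand-in for the real save action: just print what would be saved.
    toggle_pppoe_username("customer42", print)
```

The real username is always saved last, so the radio ends up with the same credentials it started with.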

Next time we get one of these I'm going to change some other setting in the radio and save it, just to see whether saving any settings change fixes it.

As I said above, we are only seeing it a couple of times a week, and so far only one customer has had the problem more than once, so it's not a major issue, just a minor annoyance right now.

Hi,

Thank you for the update. We've tested PPPoE one more time in the lab over a long duration, but didn't observe any issues.
It looks like the issue occurs rarely and is not easy to reproduce.
We would be grateful if you could contact us when the issue occurs again, so that we have the opportunity to debug it on the fly.

Thank you.