PPPoE sometimes stuck in connecting state after connectivity outages

Hi all,

we have hundreds of SMs deployed in our network, mainly with NAT and PPPoE authentication enabled, and just a few in bridged config (with routers as CPE in charge of PPPoE authentication). Since the beginning (years ago) we have faced this issue: when a backhaul outage (or similar) drops the PPPoE sessions for longer than a few minutes (let's say 5-10 minutes), several of the SMs (not all, and regardless of the firmware release) are unable to re-establish their PPPoE session even after connectivity to the BRAS server is back. We find their PPPoE status stuck in "connecting" instead of "in session". We can recover them only by accessing them from the management interface (private static IPs) and pushing the "disconnect" button and then the "connect" one in the "PPPoE Manual Connect/Disconnect" section.

This is enough to resolve the problem and bring the unit back "in session". Of course a complete reboot or power cycle of the unit will do the same.

The disconnect/reconnect trick suggests it's just a problem with the PPPoE client daemon, which gets stuck after several failed attempts.
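To make the workaround concrete, here is a minimal sketch of the watchdog logic one could run from a management host: if an SM reports "connecting" for longer than some threshold, trigger the same disconnect/connect cycle we do by hand. Everything here is an assumption for illustration: the `PppoeWatchdog` class, the 600-second threshold, and the status strings being passed in are hypothetical, and the sketch deliberately does not show how the status is read from the SM or how the buttons are pushed, since that depends on the management interface.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PppoeWatchdog:
    """Hypothetical helper: decides when a stuck PPPoE client needs a kick."""

    # Assumption: 10 minutes in "connecting" counts as stuck.
    stuck_after_s: float = 600.0
    # Time at which the SM was first seen in "connecting" (None if not connecting).
    connecting_since: Optional[float] = field(default=None, init=False)

    def observe(self, status: str, now: float) -> bool:
        """Record one status sample; return True if a disconnect/connect is due."""
        if status != "connecting":
            # "in session" or any other state: clear the timer.
            self.connecting_since = None
            return False
        if self.connecting_since is None:
            # First sample in "connecting": start the timer.
            self.connecting_since = now
            return False
        if now - self.connecting_since >= self.stuck_after_s:
            # Stuck long enough: request a kick and restart the timer
            # so the next kick waits a full threshold again.
            self.connecting_since = None
            return True
        return False
```

In use, the caller would poll each SM periodically, feed the reported status and a timestamp into `observe`, and perform the manual disconnect/connect only when it returns True.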

This still happens with APs updated to release 15.1.1 and SMs on releases up to 15.0.3 (we have not yet rolled out the latest release to most of our clients).

We have tried to check whether others have had the same issue, but this doesn't seem to be the case; we think this is because the "bridged" configuration is often used by many ISPs and integrators.

Anyone else with similar experience?

Kind regards

Rocco

Rocco - I am checking with engineering on this one... not sure if there is something we can do about it, or whether there is a limitation in the PPPoE protocol itself causing this. Will report back.

Hi,

thank you for your reply.

I can add that we also have other APs deployed with ePMP 1000 (also Force 180 and Force 200) which authenticate on the same BRAS, as well as MikroTik 5 GHz CPE and hundreds of wired CPE, and none of them show this behaviour...

Rocco

Hi Rocco,

We will have to try to recreate this internally. In the meantime, can you please capture the CNUT debug info for the SMs that have shown this problem, both while they are in the stuck state and after they have been recovered by manually pushing the reconnect button? We have a thorough PPPoE log in place that we can analyze to see where it is getting stuck.

You can send that to me on here via private IM.

Thanks,

Aaron