PPPoE frequently doesn't connect

Hello,

I have seen that ePMP antennas sometimes don't connect to my PPPoE server. I've checked with tcpdump and the antenna doesn't even try to connect. My guess is that if PADI (PPPoE Active Discovery Initiation) messages are lost, the antenna doesn't try again.

Could it be?

I think it should keep trying. I have checked that when I deregister (or reboot) the antenna, it sends PADI again once it reconnects and the connection is established.

Thanks in advance.

I've tried with version 3.0.1 and the problem exists there too.

Thanks

Hello,

The PPPoE client in the subscriber module should perform retries in the event of a PPPoE session disconnect. How quickly a session disconnect is detected and a retry is performed will depend on your Keep Alive timer ( http://community.cambiumnetworks.com/t5/ePMP-Networking/ePMP-PPPoE-Configuration-and-Troubleshooting-Considerations/m-p/49604#U49604 ).

Regards

Hi,

Thanks for your answer, but I think I haven't explained it correctly. The problem I have is that when the SM connects to an AP, it frequently doesn't start a PPPoE session (I haven't seen any PADI messages from it). I don't think it's about the keep alive, because the first session never happens.

Thanks

So if the wireless connection between the AP and the SM is dropped, when it comes back up, the PPPoE session is not re-established? If so, this is not expected behavior. The PPPoE client in the SM should detect when the wireless connection has come back up and attempt to re-establish the PPPoE session.

Do you see that in every SM using PPPoE or only in some of them? Did it start happening with 3.0? Would you share the configuration (configuration backup file content) of an SM doing this? You can send it via PM.

Regards

Hello,

It is happening on some of our SMs. I think I haven't explained it well. The SM does (sometimes) start a PPPoE session after the wireless connection has come back up (or when it reboots).

The problem is that sometimes, when the wireless link comes up or when the SM boots, the PPPoE session doesn't start. As I said, I have used tcpdump on the AP and I don't see any PADI message. It doesn't happen every time, which is why we don't know where the problem is.

I'm sending you the backup via PM.

Regards.

Hello Luis, we're still having this problem. Any news?

Regards, Mario.

I haven't been able to duplicate your issue in the lab as of yet.

Hello Luis,

Could you confirm that the SM must try to connect every X minutes even if no PPPoE server answers the PADI message? Because what I'm seeing is that the SM is not sending PADI requests.

Thanks

Hello Mario,

Attached is a screenshot of my capture. I used the tcpdump utility available in the ePMP AP CLI and was capturing on the eth0 interface. Basically, I had a PPPoE session already established and then I stopped the PPPoE server servicing the SM for some time. You can see in the capture that while the server is down, the SM PPPoE client continues to send PADI messages. When the PPPoE server is brought back up, the PPPoE session gets re-established, as expected.
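
For anyone checking their own captures, here is a minimal Python sketch of how a raw Ethernet frame can be recognized as a PPPoE Discovery packet such as PADI. The EtherType and code values come from RFC 2516; the function name and the hand-built sample frame are only for illustration, and the sketch assumes untagged frames (no VLAN header).

```python
import struct

# PPPoE Discovery uses EtherType 0x8863 (RFC 2516); session traffic uses 0x8864.
ETHERTYPE_PPPOE_DISCOVERY = 0x8863

# Discovery packet codes defined in RFC 2516.
PPPOE_CODES = {
    0x09: "PADI",
    0x07: "PADO",
    0x19: "PADR",
    0x65: "PADS",
    0xA7: "PADT",
}


def classify_pppoe_discovery(frame: bytes):
    """Return the discovery packet name, or None if this is not PPPoE Discovery."""
    # Need the 14-byte Ethernet header plus the 6-byte PPPoE header.
    if len(frame) < 20:
        return None
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != ETHERTYPE_PPPOE_DISCOVERY:
        return None
    ver_type, code, _session_id, _length = struct.unpack("!BBHH", frame[14:20])
    if ver_type != 0x11:  # version 1, type 1
        return None
    return PPPOE_CODES.get(code, "UNKNOWN")


# Hypothetical broadcast PADI built by hand (made-up source MAC, no tags).
padi = bytes.fromhex("ffffffffffff" "020000000001" "8863" "11" "09" "0000" "0000")
print(classify_pppoe_discovery(padi))
```

Running it over the frames of a capture would show whether any PADI is leaving the SM at all.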

Hope this helps.

Luis

I don't know. We can reproduce the error (on some of the connected SMs) if we unplug the PPPoE server from power. Maybe the problem occurs if the SM doesn't receive a PADT (termination) message.

Could it be?

Regards,

Mario.

Hello Mario,

The protocol should allow for this via the KeepAlives (Echo Request/Echo Reply in my capture). Even if a termination is not received, from the PPPoE client perspective, if it sends X number of KeepAlives without receiving a reply back, it will assume the PPPoE session has been terminated, and it will re-start the process of re-establishment.
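
As a rough illustration of that logic (not the actual ePMP or pppd code), the sketch below counts consecutive Echo Requests that got no reply and declares the session dead once a threshold is reached. The class name, method names, and the threshold of 3 are assumptions made for the example.

```python
class KeepAliveMonitor:
    """Tracks unanswered keepalives for one PPPoE session (illustrative only)."""

    def __init__(self, max_missed: int):
        self.max_missed = max_missed  # the "X" keepalives mentioned above
        self.missed = 0

    def on_echo_timeout(self) -> bool:
        """Record an Echo Request with no reply; return True if the session is considered dead."""
        self.missed += 1
        return self.missed >= self.max_missed

    def on_echo_reply(self) -> None:
        """Any Echo Reply resets the counter."""
        self.missed = 0


# Hypothetical threshold of 3 missed keepalives.
monitor = KeepAliveMonitor(max_missed=3)
monitor.on_echo_timeout()
monitor.on_echo_reply()          # a reply arrives, counter goes back to 0
results = [monitor.on_echo_timeout() for _ in range(3)]
print(results)                   # [False, False, True]
```

Once the last call returns True, the client would drop the session and go back to sending PADI, even though no PADT was ever received.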

What are you using as a PPPoE server?

Regards

Hello Luis,

I'm using a CCR from Mikrotik.

Regards, Mario.

OK. I'm also using Mikrotik, just a much older product (RB1200).

Hello Mario,

I still have the test running but have not been able to reproduce your issue. If you can reproduce it on an SM and leave it in that state, without affecting a customer, we may be able to remotely access the unit and gather some information.

Regards

It seems that if discovery fails for whatever reason, the subscriber won't attempt discovery again.

In this case the log shows 2 discovery attempts.

During the first one the subscriber was not even registered, and during the second we temporarily disabled the PPPoE server for the test.

Sep  1 00:00:38 customername snmpd[3145]: Reset driver.
Sep  1 00:00:54 customername pppd[4375]: Interface ath0 has MTU of 1492.
Sep  1 00:01:07 customername snmpd[4842]: DFS status: N/A
Sep  1 00:01:09 customername pppd[4375]: Unable to complete PPPoE Discovery
Sep  1 00:02:52 customername pppd[4375]: Unable to complete PPPoE Discovery
Sep  1 00:03:22 customername pppd[4375]: Interface ath0 has MTU of 1492.

Uptime is now 30 minutes, the PPPoE server has been reenabled, but the subscriber hasn't tried anything yet.

What is the rediscovery period supposed to be?
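
To make the timing in the log easier to read, here is a small Python sketch that pulls out the pppd discovery failures from the lines above and prints the gap between them. The helper names are made up for the example, and it assumes all lines fall on the same day.

```python
LOG = """\
Sep  1 00:00:38 customername snmpd[3145]: Reset driver.
Sep  1 00:00:54 customername pppd[4375]: Interface ath0 has MTU of 1492.
Sep  1 00:01:07 customername snmpd[4842]: DFS status: N/A
Sep  1 00:01:09 customername pppd[4375]: Unable to complete PPPoE Discovery
Sep  1 00:02:52 customername pppd[4375]: Unable to complete PPPoE Discovery
Sep  1 00:03:22 customername pppd[4375]: Interface ath0 has MTU of 1492.
"""


def seconds_of_day(line: str) -> int:
    """Convert the HH:MM:SS field (third whitespace-separated token) to seconds."""
    hh, mm, ss = (int(part) for part in line.split()[2].split(":"))
    return hh * 3600 + mm * 60 + ss


def discovery_failures(log_text: str) -> list:
    """Timestamps (seconds since midnight) of pppd discovery failures."""
    return [
        seconds_of_day(line)
        for line in log_text.splitlines()
        if "pppd[" in line and "Unable to complete PPPoE Discovery" in line
    ]


failures = discovery_failures(LOG)
gaps = [later - earlier for earlier, later in zip(failures, failures[1:])]
print(len(failures), gaps)  # prints: 2 [103]
```

So the log holds two failures 103 seconds apart, and nothing from pppd after 00:03:22.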

Hello,

Once the SM detects the PPPoE session is broken, it will initiate transmissions of PADI messages every 45 seconds. Each attempt consists of 3 PADI messages, 5 seconds apart (see capture below).
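
The timing described above can be sketched in Python as follows. It assumes the 45 seconds are measured from the start of one attempt to the start of the next, which the description does not state explicitly; the function and parameter names are only illustrative.

```python
def padi_schedule(attempts, attempt_interval=45, padis_per_attempt=3, padi_spacing=5):
    """Return PADI send times in seconds, counted from the first PADI.

    Assumption: attempt_interval is measured start-to-start between attempts.
    """
    times = []
    for attempt in range(attempts):
        start = attempt * attempt_interval
        for i in range(padis_per_attempt):
            times.append(start + i * padi_spacing)
    return times


print(padi_schedule(2))  # [0, 5, 10, 45, 50, 55]
```

With this timing, a capture left running for a few minutes on an SM in the retry state should show several bursts of three PADI messages.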


Yes, but I'm not referring to broken sessions. I'm talking about an SM that has just rebooted and has not established a session yet. The behaviour seems to be completely different in this case.

Hello innova,

Which describes your scenario?

1) SM reboots, no AP available so SM cannot establish a connection with AP for some time

2) SM reboots, AP is available so SM connects to it, but no PPPoE server available

3) SM reboots, AP is available so SM connects to it, PPPoE server is not initially available but then comes online after some time OR

4) SM reboots, AP is available so SM connects to it, and PPPoE server is available

If the SM does not have a connection with an AP, the SM should detect that the wireless interface is DOWN and may stop attempts to initiate a PPPoE session. Once the wireless interface is UP, the SM should resume attempting a PPPoE session.

Regards

I think it is scenario 1 when it happens in the wild.

But my test was more like scenario 3.