We have a weird issue that shows up on both a new Force 425 link on a tower and a Force 425 link on a test bench.
Tower Layout
Customer Router (PPPoE Client) <-> SM <-> AP <-> Switch <-> Force 425 <-> Force 425 <-> Switch <-> PPPoE Server
Test Bench Layout
Router (PPPoE Client) <-> Force 425 <-> Force 425 <-> PPPoE Server
Customer routers can connect over PPPoE, get online, and move traffic, but some sites and other connections time out or are extremely slow. Some of our work-from-home customers report that their Citrix or RDP connections either fail to connect or show 10000ms latency. Most sites work; the ones that don’t seem to be secure business sites. Customers can get to login pages, but after that it is hit or miss: some pages consistently work, some consistently do not, and some work only sometimes. We can reliably reproduce the issue by going to PayPal’s Activity page, while the login page and other pages work fine.
When we bypass the Force 425 Link on either the tower or on the test bench setup, it works great. If we run an EoIP tunnel over the Force 425 link and have our PPPoE traffic run through that, it also works great.
When it is not working correctly, Wireshark shows a lot of Duplicate Acknowledgements.
Any ideas, insights, or assistance would be appreciated.
Not sure if this is related at all, but we have had a very similar problem a couple of times in the past. In our case, none of our customers could get the Netflix website to load. The fast.com page would load, but the speed test would not run. YouTube would load, but videos would not load or buffered constantly. Amazon would load, but customers couldn’t log in, and Amazon streaming wouldn’t work (Netflix either). Facebook would not load, and while customers could reach bank websites and the like, they could not log in. It was pretty consistent in our case: what could or could not be accessed was the same for all customers.
It was, with one exception, a problem of fragmented packets related to using PPPoE and MTU size.
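For anyone following along, the arithmetic behind this kind of problem can be sketched in a few lines of Python. This is only a rough illustration, assuming IPv4 and TCP headers without options (20 bytes each) and a standard 1500-byte Ethernet MTU; the function names are made up for the example. PPPoE adds 8 bytes (a 6-byte PPPoE header plus a 2-byte PPP protocol field), so full-size packets no longer fit unless the MTU and the TCP MSS are reduced to match.

```python
PPPOE_OVERHEAD = 8   # 6-byte PPPoE header + 2-byte PPP protocol ID
IPV4_HEADER = 20     # assumption: no IP options
TCP_HEADER = 20      # assumption: no TCP options


def pppoe_mtu(ethernet_mtu: int = 1500) -> int:
    """Largest IP packet that fits in one Ethernet frame once PPPoE is added."""
    return ethernet_mtu - PPPOE_OVERHEAD


def tcp_mss(mtu: int) -> int:
    """Largest TCP payload per segment for a given IP MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


if __name__ == "__main__":
    mtu = pppoe_mtu(1500)
    print(mtu, tcp_mss(mtu))  # prints: 1492 1452
```

Small packets fit either way, which would be consistent with pages loading and then stalling once larger transfers start.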
It was always the same websites that would or would not work each time it came up. Once it was due to the PPPoE client on the ePMP radios, and twice it was due to MikroTik updates.
The one exception was a block of IP addresses we received that turned out to be on some black hole list with Cogent. The results were almost identical, and it took forever to track down because at first we just assumed it was the old PPPoE / MTU issue.
I have 400C radios, not 425s, but I have one tower with about 160 users, all set up like this:
customer equipment > ePMP SM (PPPoE Client) > ePMP AP (2000s and 3000s) > switch > Force 400C Slave > Force 400C Master > switch > PPPoE server
It has been working great for several months now.
We tried troubleshooting MTUs on the link and the PPPoE server, as we too thought that was the most likely cause. When we lowered the MTUs, we saw the same fast.com and YouTube symptoms that you described.
Right now, we have the MTU on the PPPoE Server set to 1480, with effective MTUs on clients at 1480. The MTU on the Force 425 Link is currently set at 1700, though we have tried other values.
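As a rough sanity check on those numbers, here is a small Python sketch that works out the largest Ethernet frame a 1480-byte PPP MTU produces and compares it with the 1700-byte link setting. It assumes a 14-byte Ethernet header, 4 bytes per VLAN tag, and 8 bytes of PPPoE overhead. It also conservatively counts the Ethernet header against the link MTU, since how the Force 425 counts its MTU is not confirmed here. The function names are hypothetical.

```python
ETHERNET_HEADER = 14  # destination MAC + source MAC + EtherType
VLAN_TAG = 4          # per 802.1Q tag, if any are in use (assumption)
PPPOE_OVERHEAD = 8    # 6-byte PPPoE header + 2-byte PPP protocol ID


def max_frame_size(ppp_mtu: int, vlan_tags: int = 0) -> int:
    """Largest Ethernet frame (without FCS) carrying a full-size PPPoE packet."""
    return ppp_mtu + PPPOE_OVERHEAD + ETHERNET_HEADER + VLAN_TAG * vlan_tags


def headroom(ppp_mtu: int, link_mtu: int, vlan_tags: int = 0) -> int:
    """Bytes to spare on the link; a negative value means frames are too big."""
    return link_mtu - max_frame_size(ppp_mtu, vlan_tags)


if __name__ == "__main__":
    print(max_frame_size(1480))               # prints: 1502
    print(headroom(1480, 1700))               # prints: 198
    print(headroom(1480, 1700, vlan_tags=2))  # prints: 190
```

Under these assumptions the frames fit with room to spare, so packet size alone would not obviously explain the symptoms.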
I don’t believe that this is an upstream issue. None of our clients have issues except for those run through the Force 425 link. If we bypass the 425s, it works normally for those clients as well.
We are running 5.1.3, though we have tried downgrading to 5.1 as well. Your setup is almost exactly the same as ours. It’s good to know that it normally works; I am just not sure what else to try.
Hello @Fallsnet-NNC,
I will PM you a firmware build containing a potential fix. We recently did a lot of work on protocol priorities and queuing, and improved NSS acceleration, which should cover your case. If it does not, we will need a remote session to debug your issue.
Hello,
Could I please get the firmware too?
I have the same problem. I put the link in production yesterday.
Today, users started reporting problems accessing various sites, especially our country’s official portals.
I would be very grateful if I could solve the problem as soon as possible.
Thanks in advance.
Best regards