Set of two ePMP PTP radios:
Firmware: was 2.6.1, now 3.2.1
TDD PtP mode
RSSI d/u (from Master) -55/-60
This link is on a brand-new PoP with 2 customers, and it moves very little data for most of the day because both customers only use the service in the evenings.
We have a program we call NAG that pings each device on the network 5 times every 15 minutes. If a device fails to answer all 5 pings, I get a text on my phone. A few weeks ago I got a text that the slave end of this backhaul, the two APs, and the Pi behind the backhaul were unreachable. So I remoted in, and sure enough I couldn't reach the slave or any of the devices behind it. I logged into the master radio, and it showed it had a link and all the stats looked normal.
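For anyone curious, the NAG check works roughly like the sketch below. It is not the actual program: the function names, the device dictionary, and the use of the system `ping` command with Linux-style flags are all assumptions for illustration.

```python
import subprocess
from typing import Callable, Dict, List

PING_COUNT = 5            # pings per device per cycle (as described above)
CYCLE_SECONDS = 15 * 60   # one cycle every 15 minutes


def ping_once(host: str, timeout_s: int = 1) -> bool:
    """Send one ICMP echo with the system ping (Linux-style flags, an assumption)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def unreachable_devices(
    devices: Dict[str, str],
    ping: Callable[[str], bool] = ping_once,
    count: int = PING_COUNT,
) -> List[str]:
    """Return the names of devices that answered none of `count` pings."""
    down = []
    for name, host in devices.items():
        replies = sum(1 for _ in range(count) if ping(host))
        if replies == 0:
            down.append(name)
    return down
```

A scheduler (cron, or a loop that sleeps `CYCLE_SECONDS`) would call `unreachable_devices` once per cycle and send a text for whatever names come back. The `ping` parameter is there so the logic can be tested without touching the network.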
Since I couldn't do anything else (and by now one of the two customers had called to report his Internet wasn't working), I rebooted the master radio. That fixed the problem: I could reach everything and all tests came back good.
A few days later I got a NAG that it could not reach one of the devices at that site. I remoted in, and while I was able to reach the slave radio, it was really slow and took a long time to load the web interface. I was running 2.6.1 at this point, so when I say slow I mean even slower than usual. I started pinging the device and was getting about 30% packet loss and sporadic high latency.
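To put numbers on "packet loss and sporadic high latency", a run of ping results can be summarized as in this sketch. The function name and the input format (one round-trip time in ms per ping, `None` for a lost ping) are assumptions, and no real measurements from this link are included.

```python
from typing import List, Optional, Tuple


def summarize_pings(
    rtts_ms: List[Optional[float]],
) -> Tuple[float, Optional[float], Optional[float]]:
    """Summarize ping samples: each entry is an RTT in ms, or None if lost.

    Returns (loss_percent, average_rtt_ms, max_rtt_ms). The RTT fields are
    None when every ping was lost.
    """
    if not rtts_ms:
        raise ValueError("no ping samples")
    answered = [r for r in rtts_ms if r is not None]
    loss_percent = 100.0 * (len(rtts_ms) - len(answered)) / len(rtts_ms)
    if not answered:
        return loss_percent, None, None
    return loss_percent, sum(answered) / len(answered), max(answered)
```

Reporting the maximum alongside the average matters here because sporadic spikes barely move the average but show up clearly in the maximum.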
OK, I didn't have a lot of time to mess with it, so I upgraded the firmware from 2.6.1 to 3.2.1 and rebooted, and all seemed good.
2 days later, another NAG. Again the slave radio loaded really slowly, with lots of packet loss and latency. So I deregistered the slave from the master, and when it reconnected everything was fine.
A day later, another NAG. Again the slave radio loaded really slowly, with lots of packet loss and latency. It was early morning, so I decided to leave it and see if it got worse. Every 15 minutes I got a NAG telling me various devices at that PoP couldn't be reached, and after about an hour nothing could be reached. I tried to bring up the slave radio and couldn't reach it. I logged into the master radio, which showed it had a link and everything looked fine. So I decided to run a link test just to see how well the two devices were able to talk to each other (I mean, it showed a link with the slave, so they were passing at least some data).
The link test came back 15/9. I ran it again and got 30/12, then 60/40, then 105/43, and every test after that came in around 100/40, with the link now running like it should. The next day I got a NAG and just logged in and ran the link test: 20/9, then 40/20, then 100/40, and it was fixed again.
So, any ideas what might be going on here?
Completely unrelated to this, I noticed that when I was running 2.6.1 the link tests were in the vicinity of 150/45, vs 100/43 on 3.2.1. However, under both versions of the firmware I can consistently push 177 Mbps / 48 Mbps UDP over this link (testing between the MikroTik routers at each end). That's when the link is running correctly, of course, not when it's acting up.