@Dmitry Moiseev wrote:
So it appears that all of the problems we were having were due to the SA being on.
Thanks for the heads up and sorry that we caused you trouble with that.
OK, well, after 4.4.3 had been back on the AP for about two weeks, we have had some problems. This morning I got a call from a customer on that AP saying they had no Internet. They reported: "I've been randomly losing internet for about two weeks, but it's usually only for a minute or two. This morning it's been down for 20 minutes now."
So I try to log into their radio and can't reach it. I log into the AP and it shows it has only been up for just over a day (so it rebooted a day-ish ago). I log into the BH radio and it has been up for weeks, so it wasn't a power issue that rebooted the 3000L. I looked at the Cacti logs and they show just the one mystery reboot about 1.5 days previous, but a day before that the logs also show a rash of dropped client connections...
Anyway, back to the AP: all 9 clients are connected but I can't access any of them. The Monitor > Throughput Chart shows that not a single bit is passing, 0 in and 0 out. That seems impossible, since I'm remoted into the radio and bits would have to pass in and out for the radio to show me that no bits are passing.
So I reboot the AP, only it doesn't really reboot; it just logs me out. At this point I download a support file from it (which seems to take a really long time) and try to reboot several more times, but it just logs me out each time. So I try to SSH in, but get no response.
I drive to the site and power cycle everything. The site consists of:
F200 Backhaul <-> Power supply <-> Power Supply <-> 3000L
and nothing else, not even a surge protector.
It appears everything comes back up, everyone has internet, everyone is happy.
15 minutes later I get an alert that the AP is down. I can't reach it, but I can reach the F200 that it's connected to, and the ethernet port is flapping (1000 green, 1000 red, no connection). I can access the AP as long as the BH shows 1000 green. I try to grab a support file but can't keep the connection long enough (it seems to take longer than normal to generate the support file). I did manage to make it to the firmware page and start the 4.4.1 firmware upload, and after a bit I manage to log in, see the firmware is waiting for a reboot, and reboot the radio (and it works this time).
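For anyone who wants to catch this kind of port flapping from monitoring data rather than by watching the status page, here is a minimal sketch. It assumes you already have a time-sorted list of (timestamp in seconds, link up?) samples from whatever polls the port; the function name `is_flapping` and both thresholds are made up for illustration and are not anything the radios or Cacti provide.

```python
from typing import List, Tuple

def is_flapping(events: List[Tuple[float, bool]],
                window_s: float,
                max_changes: int) -> bool:
    """Return True if the link changed state more than max_changes
    times within any window of window_s seconds.

    events: (timestamp_seconds, link_is_up) samples, sorted by time.
    """
    # Keep only the timestamps where the state differs from the previous sample.
    changes = [ts for (_, prev), (ts, cur) in zip(events, events[1:])
               if cur != prev]

    # Slide a window over the change times and count how many fit inside it.
    start = 0
    for end in range(len(changes)):
        while changes[end] - changes[start] > window_s:
            start += 1
        if end - start + 1 > max_changes:
            return True
    return False

# Hypothetical samples: the port toggles every 10 seconds.
flappy = [(0, True), (10, False), (20, True), (30, False), (40, True)]
# Hypothetical samples: two state changes, ten minutes apart.
stable = [(0, True), (600, False), (1200, True)]

print(is_flapping(flappy, window_s=60, max_changes=3))
print(is_flapping(stable, window_s=60, max_changes=3))
```

The window and threshold are judgment calls; a single down/up pair during a reboot should not count as flapping, which is why the check only fires above `max_changes`.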
The AP came back up on 4.4.1. That was about 3 hours ago, and so far it looks like everything is good.
The only customer I talked to during the outage (several called the office to report it was down) was the one with the F300-16 (the one and only AC radio on this 3000L; the other 8 are N). So I don't know if they are the only one having the "internet randomly dropping for a minute or two". I do know that a week ago we put up our first 3000 AP with the 3000 Sector and Beam Forming panel (I know it doesn't work right now) and connected my home to it with an F300-16, and I immediately noticed that my internet was dropping for just a minute or so a couple of times a night.
I'm not even going to mess with trying to track the problem down. I'm just going to roll everything back to 4.4.1 and see if the weirdness goes away.
If the internet drops again before I roll it back I'll grab a support file off my F300 and 3000AP in case support wants them.
Edit: Update (Monday 1/13/20) - So the Micropop has been on 4.4.1 since Saturday morning. I talked to the customer with the F300-16 a few minutes ago and she says she hasn't noticed it dropping since I fixed it Saturday. While the radio shows it has been up since Sat (1 day, 23 hours), the graphs show it rebooted this morning at 9am... But it took almost two weeks for it to go completely wonky, so I have no idea if 4.4.1 is any better than 4.4.3 or not.
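One way to cross-check a mismatch like that between the radio's reported uptime and the graphs is to look at the raw polled uptime values: if the counter really reset, some poll will show a smaller uptime than the poll before it. Below is a rough sketch of that check. It assumes you can export (poll timestamp, uptime in seconds) pairs from your monitoring system; the function name `find_reboots` and the sample numbers are invented for illustration.

```python
from typing import List, Tuple

def find_reboots(samples: List[Tuple[int, int]]) -> List[int]:
    """Return the poll timestamps at which the uptime went backwards.

    samples: (poll_timestamp_seconds, uptime_seconds) pairs, sorted by time.
    A drop in uptime between two polls means the device restarted
    (or its counter reset) somewhere in that interval.
    """
    reboots = []
    for (_, prev_up), (ts, up) in zip(samples, samples[1:]):
        if up < prev_up:
            reboots.append(ts)
    return reboots

# Hypothetical 5-minute polls: the uptime resets between t=300 and t=600.
polls = [(0, 1000), (300, 1300), (600, 20), (900, 320)]
print(find_reboots(polls))
```

If this finds no reset where the graph shows one, the graph is more likely showing a missed poll (a gap) than a real reboot, which would fit the radio still reporting 1 day, 23 hours of uptime.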