I have a 4500AP with 5 SMs in a pretty good MU-MIMO favorable distribution.
I ran a 2 SM, 20 s link test to two SMs that are very far apart in azimuth. Both are near MCS10/11.
On a 75/25 split and a 40 MHz channel, the AP showed 90+% MU-MIMO traffic, but each SM only got ~160 Mbps. I then did a 3 SM test and each again got ~160 Mbps. That was lower than I remember.
I then connected to the router behind one client (a MikroTik) and ran a bandwidth test back to our core. With just that SM passing traffic it got a little under 300 Mbps, which seemed low as it easily holds MCS11/10. While that was running, I ran a 2 SM link test to two other SMs and watched the traffic on the MikroTik router drop in real time to under 30 Mbps, while the link tests to the other SMs again came in at ~160 Mbps.
I then rebooted the AP, and after all SMs reconnected I ran the same tests. This time a 2 SM link test got 600-700 Mbps total on the DL, with each SM getting 300+ Mbps.
I then repeated the test with one SM’s MikroTik router running a DL bandwidth test. Alone it got ~345 Mbps, and during a 2 SM link test (to two other SMs) it only dropped to ~240 Mbps instead of 20 Mbps. During that combined test the AP showed 800-900 Mbps, where before the reboot it was only pushing 500-600 Mbps.
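To put the before/after difference in one number, here is a small sketch that computes how much of its standalone throughput the third SM kept while the other two SMs ran a link test. The inputs are the approximate figures quoted above (the "before" values are upper bounds, since they were "a little under 300" and "under 30"); the function name is just illustrative.

```python
def retained_fraction(alone_mbps: float, under_load_mbps: float) -> float:
    """Fraction of standalone throughput kept while other SMs run a link test."""
    return under_load_mbps / alone_mbps


# Approximate figures from the post, in Mbps.
before = retained_fraction(alone_mbps=300, under_load_mbps=30)
after = retained_fraction(alone_mbps=345, under_load_mbps=240)

print(f"before reboot: kept {before:.0%} of standalone throughput")
print(f"after reboot:  kept {after:.0%} of standalone throughput")
```

With these inputs the third SM kept roughly 10% of its standalone throughput before the reboot and roughly 70% after it.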
UL/DL rates on the SMs did not change after the reboot, and the AP was under very low load during this time (<10 Mbps average).
I did take a support file from the AP before reboot but forgot to take screenshots of the link tests before the reboot.
Here are the link test charts after the AP reboot. Before the reboot the same tests were around 1/3 to 1/2 lower. The ~100-300 Mbps trace is me running the MikroTik bandwidth test on one SM.
Here is the SM distribution.
One thing I did just notice is that the SM UL rates were higher and more stable after the AP reboot. Not sure if that was affecting the DL bandwidth of the SMs.
The chart shows TX/RX rates from the AP's perspective, so TX is the SM DL rate and RX is the SM UL rate. The yellow vertical line marks the point after the AP reboot and the SMs reconnecting.
We’ve noticed some strange issues with MU-MIMO turned on on the 4500 8x8 integrated. First off, when MU-MIMO is in use, beamforming gains are nullified, meaning the 6 dB of gain (a 2-3 step increase in downlink modulation level) is not available, so all the SMs participating in MU-MIMO may be using a lower modulation level. Another consideration is that, due to chipset limitations, only 3 groupings are available despite it being an 8x8 radio. We ended up disabling MU-MIMO and going back to beamforming only, and it seems to perform better over time. I think the ePMP dev team still has room to optimize MU-MIMO, and we’ll try again when there’s another major update.
@Eric_Ozrelic I do not believe this has anything to do with the 6 dB SM gain difference between beamforming and MU-MIMO. Even if the SMs dropped from MCS11 to MCS8/9, I would still expect around 200-250 Mbps per SM with 3 SMs pushing max DL bandwidth, but that is not what I saw. The 6 dB gain loss would also still be present after the AP reboot, so seeing double to triple the aggregate AP bandwidth after the reboot doesn’t line up with the beamforming vs MU-MIMO gain loss.
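As a rough sanity check on that 200-250 Mbps estimate, here is a sketch that scales a per-SM rate by the change in spectral efficiency between MCS levels. It assumes the radio's MCS indices follow the standard 802.11ax mapping (MCS8/9 are 256-QAM at 3/4 and 5/6, MCS10/11 are 1024-QAM at 3/4 and 5/6) and that throughput scales linearly with bits per subcarrier times coding rate, which ignores retransmissions and scheduler overhead. The 300 Mbps reference is the per-SM figure from the post-reboot 2 SM link test; the function names are illustrative.

```python
from fractions import Fraction

# (bits per subcarrier, coding rate) per MCS index.
# Assumption: standard 802.11ax mapping applies to this radio.
MCS = {
    8: (8, Fraction(3, 4)),    # 256-QAM, rate 3/4
    9: (8, Fraction(5, 6)),    # 256-QAM, rate 5/6
    10: (10, Fraction(3, 4)),  # 1024-QAM, rate 3/4
    11: (10, Fraction(5, 6)),  # 1024-QAM, rate 5/6
}


def efficiency(mcs: int) -> Fraction:
    """Information bits carried per subcarrier per symbol at this MCS."""
    bits, rate = MCS[mcs]
    return bits * rate


def scaled_rate(rate_at_ref: float, ref_mcs: int, new_mcs: int) -> float:
    """Scale a measured rate from ref_mcs to new_mcs, assuming linear scaling."""
    return float(rate_at_ref * efficiency(new_mcs) / efficiency(ref_mcs))


for mcs in (8, 9):
    print(f"MCS{mcs}: ~{scaled_rate(300, 11, mcs):.0f} Mbps per SM")
```

Under those assumptions a drop from MCS11 to MCS8 or MCS9 costs about 20-28% of throughput, which lands in the 200-250 Mbps range and nowhere near the ~160 Mbps seen before the reboot.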
My guess is that the scheduler was having issues (especially on the UL), and that the unstable UL rates, which coincided with each SM having higher than normal UL retrans rates, were messing up the DL bandwidth and MU-MIMO.
Look at the charts for 3 SMs below. Each one started seeing a high UL retrans % at the same time, and it went away at the same time, when I rebooted the AP (red line). After the reboot the UL rates and retrans % returned to normal, and the DL rates and MU-MIMO were working normally again.
I do not know exactly how UL retrans % correlates with and impacts AP DL bandwidth and MU-MIMO, or what exactly caused it. UL retransmissions are usually caused by interference at the AP side. Since rebooting the AP fixed it, I do not believe it was sudden interference from another AP or Wi-Fi device that happened to go away at the exact time of the AP reboot. Instead I suspect self-interference from something getting out of whack in the TDD scheduler, causing the SMs to interfere with each other at the AP side.