System Slowness

Over the past few months our system has been slowing down, particularly on one AP. Customers are saying that download speed (to our local speedtest.net Mini server) is always less than 1 Mbps and upload can drop to 50-100 kbps! I can also see the slowness when I log onto their SM pages. The speed used to be about 4.5 Mbps down and 1.3 Mbps up for 1X customers. I am looking for advice on what might be causing this.

What is frustrating is that the traffic graph for this AP never shows more than 2 Mbps down and 1 Mbps up. (None of our APs shows more than 5 Mbps down and 1.5 Mbps up.) This AP also has no SM more than 3 miles out, and we use dishes on all of them.

Our system is all Canopy Advantage APs and BHs, with SM sustained rates of 1 or 2 Mbps down and 0.5 or 1 Mbps up, depending on the package. (Our SM burst allocations are 5,000 or 10,000 up and 100 or 200k down.) We have Prioritize TCP ACK disabled because of our VoIP. SM Isolation is on for all APs. We sync to GPS. Control slots are 3, range is 17 miles, and the downlink is 75%. Everything else is at defaults. Our busiest APs have only 40 subs, and we are using 2.4 GHz with some 5.4 GHz. Where possible (about half the SMs) we enable 2X.

This particular AP cluster is fed by two back-to-back Motorola 20 Mbps backhauls that show a maximum of 5 Mbps down and 1 Mbps up (from the subscriber perspective). All of these traffic graphs are Cacti 5-minute averages. The only unique thing about this AP is that the CMM won't let it negotiate any link speed other than 10 Mbps half duplex. We use an Allot NetEnforcer at the main feed of the system to limit P2P traffic to a fixed pipe size and to prioritize our VoIP gateway traffic. We have plenty of Internet capacity at our server.

Any ideas or suggestions would be most welcome. Or any other information needed.

Thanks.

Start with a couple of test customers and kill the NetEnforcer - just pull it out of the mix and see if it's the problem.

What do the AP Ethernet stats look like? It's possible you have a bad cable that is causing excessive errors and flooding the AP, the CMM, or the entire network.
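
If you'd rather pull those stats from the NOC than click through each radio's web page, and your firmware exposes the standard MIB-II interface counters over SNMP (the names on the Ethernet Stats page match the IF-MIB ones), a rough sketch like this can poll them - the community string, addresses, and interface index are placeholders for your own setup:

```python
# A rough sketch, not Canopy-specific: poll the standard IF-MIB error and
# packet counters over SNMP v2c from the NOC. Community string, IP addresses,
# and interface index are placeholders - adjust for your own radios.
# Requires the pysnmp package (classic hlapi interface).
from pysnmp.hlapi import (
    getCmd, SnmpEngine, CommunityData, UdpTransportTarget,
    ContextData, ObjectType, ObjectIdentity,
)

APS = ["10.0.0.41", "10.0.0.42"]   # placeholder AP management IPs
COMMUNITY = "public"               # placeholder read-only community
IF_INDEX = 1                       # Ethernet interface index (verify per radio)
COUNTERS = ("ifInUcastPkts", "ifInErrors", "ifOutUcastPkts", "ifOutErrors")

def poll_counters(host):
    """Fetch the interface counters above from one radio."""
    oids = [ObjectType(ObjectIdentity("IF-MIB", name, IF_INDEX)) for name in COUNTERS]
    err_indication, err_status, _, var_binds = next(
        getCmd(SnmpEngine(), CommunityData(COMMUNITY),
               UdpTransportTarget((host, 161)), ContextData(), *oids)
    )
    if err_indication or err_status:
        raise RuntimeError(f"{host}: {err_indication or err_status.prettyPrint()}")
    return dict(zip(COUNTERS, (int(value) for _, value in var_binds)))

for ap in APS:
    print(ap, poll_counters(ap))
```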

What do link tests look like?
- AP to SMs
- BHM to BHS

What do ping times look like from the NOC to
- 1st BH
- 2nd BH
- CMM
- AP
- SM
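
If you want to sweep those in one pass from the NOC, a rough script along these lines works (the addresses are placeholders for your 1st BH, 2nd BH, CMM, AP, and a test SM; assumes a Linux/Unix box with the standard ping):

```python
# A quick sweep of the path from the NOC out to a test SM. The hostnames/IPs
# are placeholders for your own 1st BH, 2nd BH, CMM, AP, and SM; assumes a
# Linux/Unix box where "ping -c" is available.
import subprocess

HOPS = {
    "1st BH": "10.0.0.2",    # placeholder addresses - substitute your own
    "2nd BH": "10.0.0.3",
    "CMM":    "10.0.0.4",
    "AP":     "10.0.0.5",
    "SM":     "10.0.0.50",
}

for name, addr in HOPS.items():
    # 10 pings, 1-second timeout each; print the rtt/loss summary line
    result = subprocess.run(["ping", "-c", "10", "-W", "1", addr],
                            capture_output=True, text=True)
    summary = result.stdout.strip().splitlines()[-1] if result.stdout.strip() else "no reply"
    print(f"{name:7s} {addr:12s} {summary}")
```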

You could turn the AP off from the CMM and then re-test to see if things improve up to the CMM.

Pretty generous with the burst - I'd set that to 16,000 down (2 MB) and 8,000 up (1 MB).
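
For the unit math (assuming those burst allocation fields are in kilobits, which is how I read them), 16,000 and 8,000 kbit work out to the 2 MB and 1 MB figures:

```python
# Assuming the SM burst allocation fields are in kilobits (how I read them),
# convert the suggested values to megabytes: kbit -> bits -> bytes -> MB.
def kbit_to_mb(kbit):
    return kbit * 1000 / 8 / 1_000_000

print(kbit_to_mb(16000))  # 2.0 MB burst down
print(kbit_to_mb(8000))   # 1.0 MB burst up
```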

Thanks for the ideas. I turned the NetEnforcer off last night and had some customers run tests. Their speed was about the same, if anything a bit worse, so I think we can rule that out.

Jerry, the AP Ethernet stats were showing a high count of outerrors, so I forced both the CMM port and the radio to 10 Mbps half duplex. The errors have dropped to zero over the past hour. I'm still waiting to hear from customers whether that improved their speed.

Ping tests are all OK: 6-16 ms from the NOC to the BHs, the CMM, and the AP, and about 19 ms to the SMs.

Link speed tests are all full speed.

You got me to look at the Ethernet link on all of the APs, and I am seeing 4 APs (on 3 sites) showing almost 100% errors, plus another AP showing no (i.e. zero) out traffic. (I have had to reboot that last AP occasionally because, although everything looked right, subscribers had no traffic.)

What would cause these two conditions?

Are these APs on different towers, connected to different CMMs?

What are you using for cabling? Who terminated them?

Jerry, a big thanks - it looks like the first problem is solved: a link negotiation problem, as evidenced by the Late Collision errors.

Yes, here is the data for one tower site (sorry, the Excel paste doesn't come across well):

CMM Type: Last Mile Gear CMM with HP ProCurve switch

                        AP41           AP42           AP46           AP47
CMM Link Speed Setting  100baseT Full  100baseT Full  100baseT Full  100baseT Full
AP Link Speed Setting   Forced 100F    Forced 100F    Forced 100F    Forced 100F
Ethernet State          100baseT Full  100baseT Full  100baseT Full  100baseT Half

inucastpkts             47,967,124     51,737,297     151,872,518    1,478,573,670
innucastpkts            3,063,779      2,509,841      6,826,982      50,425,486
inerrors                451,234        2,141          47             337
% errors/total in       0.88%          0.00%          0.00%          0.00%

outucastpkts            41,000,797     43,323,474     139,909,405    1,291,383,631
outnucastpkts           39,696         24,724         44,308         256,407
outerrors               0              43,184,455     139,306,090    0
% errors/total out      0.00%          99.62%         99.54%         0.00%

CRC errors              451,017        1,959          0              331

The HP switch ports for AP41 and AP42 show just a few thousand errors against about 500 million packets each way.

Another site has just one AP connected to an HP ProCurve switch; interestingly, that switch port shows 0 errors.

(The other AP, the one that was showing 0 for out traffic, is now showing data. Strange.)

The cabling was done by my VAR, all good quality outdoor CAT5 cable, fully grounded.

Time to watch the game.

The data above is kind of a mess; the following should be more readable.


                        BG41           BG42           BG46           BG47

inucastpkts             47,967,124     51,737,297     151,872,518    1,478,573,670
innucastpkts            3,063,779      2,509,841      6,826,982      50,425,486
inerrors                451,234        2,141          47             337
% errors/total in       0.88%          0.00%          0.00%          0.00%

outucastpkts            41,000,797     43,323,474     139,909,405    1,291,383,631
outnucastpkts           39,696         24,724         44,308         256,407
outerrors               0              43,184,455     139,306,090    0
% errors/total out      0.00%          99.62%         99.54%         0.00%

CRC errors              451,017        1,959          0              331
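
In case it helps anyone following along, the % rows are just errors divided by total packets for that direction; a quick check with the BG42 out-direction counters copied from the table above:

```python
# Error rate = errors / (unicast + non-unicast packets) in that direction.
# Out-direction counters for BG42, copied from the table above.
out_ucast  = 43_323_474
out_nucast = 24_724
out_errors = 43_184_455

error_pct = out_errors / (out_ucast + out_nucast) * 100
print(f"{error_pct:.2f}%")   # ~99.62%, matching the table
```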

Are these still having a problem?

If so, try bypassing the lightning protection on the Ethernet runs.

Yes, I believe we do have cabling issues on the AP called BG41.

The strange thing is that I am on the BG42 AP, the one with 99% errors, and my speed is OK. So I wonder if the APs are reporting the errors backwards?

Doubtful. It’s just a counter in the radio.

I’d try bypassing the suppressors so the switch is directly connected to the AP to rule that out as a problem.

You should be able to go 100FDX from the switch to the AP with almost 0% errors.

Thanks everyone.

I set the APs to auto-negotiate and they showed they were only achieving 100 Half (where before they had been forced to 100 Full). So I reset them to Auto 100H, and the errors dropped to less than 1% - a forced-full versus auto-half duplex mismatch is the classic cause of those late collisions. We still need to go up the tower and fix the cabling, but at least we don't have to make a special trip for it.

Appreciate your help.