Network Congestion or Limit?

Dave_Bradich_1 · September 28, 2007, 6:27am

About a month ago we started to see a slowdown in customers’ speeds on our network. This has progressively worsened to where we now get download speeds of 30-300 kbps (they used to be 4.5 Mbps). The upload speed is affected too, but only by about half. This occurs during peak internet times.

What is driving us nuts is that MRTG shows all our links and APs are well within spec, link tests and efficiencies are good, and our internet feed is not impacting us. Our system is not large and no AP has more than 40 customers. No one customer looks to be a problem on MRTG. Interference does not seem to be an issue.

Running Pingplotter now from one of the farthest SMs through an AP, CMM, Moto 60 Mb BH and into the server and out to the internet shows ping times as high as 500 ms to 5 sec at times. There seems to be no consistency as to which element has a high ping time; others could be low at the same moment. Our Network Monitoring tool also flags up high latency and I can see customers going on and off the ‘down’ list (not really down, just high latency). Even my SM ping times can be high (computer to hub to SM).

In general it seems to me this occurs when an AP has more than about 1.4 Mbps of upload traffic (we are running Advantage with 75/25, so 1.4 should be about 1/3 to 1/2 the capacity). AP total download is occasionally over 3 Mbps and never over 5 Mbps. Control slots are at 1. Everything is synced (one remote SM is back to back to an AP with a sync cable, and the AP says it is receiving sync). We are on 8.2 firmware (except the BH, on 8.1.51).
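As a rough check on the capacity estimate above: if 1.4 Mbps is between 1/3 and 1/2 of the uplink capacity, and the uplink gets 25% of the frame, the implied ceilings can be worked out directly. This is a minimal Python sketch that uses only the numbers in the post; the function name is made up for illustration, and real capacity depends on modulation, range and packet mix.

```python
def implied_capacity(observed_mbps, fraction_of_capacity, uplink_share):
    """Return (uplink_capacity, aggregate_capacity) in Mbps implied by the estimate."""
    # If observed traffic is a given fraction of the uplink, the uplink ceiling follows.
    uplink = observed_mbps / fraction_of_capacity
    # The uplink is only a share (here 25%) of the aggregate frame.
    aggregate = uplink / uplink_share
    return uplink, aggregate


for frac in (1 / 3, 1 / 2):
    up, agg = implied_capacity(1.4, frac, 0.25)
    print(f"fraction {frac:.2f}: uplink ~{up:.1f} Mbps, aggregate ~{agg:.1f} Mbps")
```

On these numbers the estimate implies an uplink ceiling of roughly 2.8 to 4.2 Mbps per AP.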

The problem will be there for an hour and then disappear for 1/2 hour, re-appear, then be gone during most of the day.

We have enabled Protocol Filtering of SMB, Bootp Server and IPv4 Multicast on all customers (and some customers with PPPoE).

One thing I have noticed is that the Statistics Scheduler tab has a new display for Wrong frequency information that is in the hundreds, whatever this is.

I am thinking that this is some sort of Network Congestion or possibly reaching the capability of Canopy.

Any ideas?

sivanisky · September 28, 2007, 2:37pm

You may want to take a look at the thread I started a few days ago. It is bandwidth related; I am having the same problem as you, but I am making progress with the tips I’ve got from other guys. The thread is named "2X rate - questions after reading the manual"

As I type, I am monitoring this one AP that has 100 SMs or so and I am watching how it is improving

clueless · September 28, 2007, 4:47pm

I don’t know if this is your problem, but we had a switch go bad. Our download speeds went from 4.5 meg to 300 kbps, and the upload to around 700 kbps. After replacing the switch we are back at 100%.

rjk · September 28, 2007, 5:59pm

Link Efficiency tests are good and dandy, but I’d recommend running tests over your hop points. I typically run tests weekly on our major hop points to make sure there are no congestion points. I also graph packets per second on all of the routers in the network to watch for problems that could pop up.

I use iperf udp/tcp network testing bi-directional. What’s funny is that a 34mbit 20mi “home brewed” backhaul performs better than a pair of BH20s I have with throughput/latency under load. 1up soekris. =) .6ms over 20 miles vs. Canopy 8-9ms over 9 miles with 100% efficiency… Go figure.

I remember when I played with Rayjunk, er, RayLink 2.4FHSS – those things sucked when P2P was introduced into the network as we started to grow. They couldn’t handle 128kbps steady upload with ~2-300pps from a single user. It would kill the AP and cause latency upwards of 800-900ms for the other subscribers on that particular AP.

vince · September 28, 2007, 6:43pm

Monitor the traffic at peak times with a program like Wireshark and see if there is anything else using up the PPS. It could be as simple as a few people running BitTorrent.

erkan · September 28, 2007, 10:41pm

Do a test. Set the APs at 50% downlink.

In our network, 1.4 Mbps of upload causes similar problems; with 75:25 you probably can’t go higher.

Frothingdog.ca · September 28, 2007, 11:52pm

We’ve run into a similar problem as well. Our main BH into our office was getting a fair bit of loss. We were running it at 50% DL but have since moved it to 25% (since the master is at the remote end) and it cleared up a lot of the packet loss. We’ve concluded that the BH is oversaturated even though it doesn’t show it in our graphs.

We can only seem to get about 4.7 Mb out of a 20 Mb BH running in 2x mode before it starts dropping packets. PPS isn’t an issue either; it only seems to be about 800 to 900 each direction.

We are looking into upgrading to a REAL Backhaul

Jerry_Richardson · September 29, 2007, 2:18am

Frothingdog.ca wrote:
We've concluded that the BH is oversaturated even though it doesn't show it in our graphs.

We experienced the same thing. What we saw was rising ping times due to increased contention.

The thing about Ethernet is that it goes from working to saturated very quickly. As the link starts to drop packets, the retries go up, adding more traffic.

More traffic = more dropped packets = more retries = more traffic = more dropped packets, etc, etc.
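The feedback loop described above can be illustrated with a deliberately crude model: each round, whatever exceeds the link's capacity is dropped and comes back as retries on top of the fresh traffic in the next round. This is only a sketch in Python under that assumption (every drop is retried exactly once, no backoff); the function name and the numbers are hypothetical, not measurements from this network.

```python
def simulate_retries(fresh_pps, capacity_pps, rounds):
    """Return the offered load per round when every dropped packet is retried."""
    loads = []
    retries = 0.0
    for _ in range(rounds):
        load = fresh_pps + retries                 # new traffic plus last round's retries
        dropped = max(0.0, load - capacity_pps)    # anything over capacity is dropped
        loads.append(load)
        retries = dropped                          # drops return as retries next round
    return loads


print(simulate_retries(90, 100, 5))    # under capacity: load stays flat
print(simulate_retries(110, 100, 5))   # slightly over capacity: load keeps climbing
```

In this model a link at 90% of capacity never drops anything, while one only 10% over capacity sees its offered load grow every round, which matches the "working to saturated very quickly" behaviour.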

Frothingdog.ca · September 29, 2007, 4:30pm

But wouldn’t you think those packets would show up when monitoring the PPS on a link? Just weird that they don’t. Perhaps we should monitor both interfaces, the wired and the wireless.

Jerry_Richardson · September 29, 2007, 5:05pm

I don’t think dropped packets get counted, only packets that actually get passed.

Frothingdog.ca · September 30, 2007, 12:26am

Hmm…that’s a bit of a pickle then. There isn’t really much point in graphing the PPS if it’s not a true indication.
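One way to see drops alongside passed packets is to graph a discard counter next to the packet counter, if the device exposes one. The sketch below, in Python, only shows the rate calculation from two counter samples, including 32-bit counter wrap. It assumes you already have the raw values from your poller (for example the standard IF-MIB `ifInUcastPkts` and `ifInDiscards` objects); whether the radios in this thread report wireless drops in such counters is not something the thread confirms. The function name is made up for illustration.

```python
def counter_rate(prev, curr, interval_s, bits=32):
    """Per-second rate between two samples of a wrapping counter."""
    # Modulo arithmetic handles a single wrap of the counter between samples.
    delta = (curr - prev) % (2 ** bits)
    return delta / interval_s


# Hypothetical samples taken 300 seconds apart.
passed_pps = counter_rate(1_000_000, 1_270_000, 300)
dropped_pps = counter_rate(4_294_967_290, 10, 300)   # counter wrapped between polls
print(f"passed {passed_pps:.1f} pps, dropped {dropped_pps:.3f} pps")
```

Graphing both rates on the same chart would show whether loss rises while the passed-packet rate stays flat.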

moinavery · October 2, 2007, 5:44am

I have had a saturated BH10 with 1700 pps and 1900 kbps aggregate (VoIP).
Are BHs supposed to handle 3000 pps?
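The two figures in this post give the average packet size directly, which helps when judging whether a link is packet-limited or bit-limited. A minimal Python sketch of that arithmetic follows; the function name is made up for illustration, and it treats kbps as 1000 bits per second.

```python
def average_packet_bytes(kbps, pps):
    """Average packet size in bytes for a given bit rate and packet rate."""
    return (kbps * 1000 / 8) / pps


size = average_packet_bytes(1900, 1700)
print(f"average packet size: {size:.0f} bytes")
```

That works out to roughly 140 bytes per packet, which is in line with small VoIP packets.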

Frothingdog.ca · October 2, 2007, 8:50pm

3000 aggregate (1500 each way)…in theory.

However, I believe that’s 1500 packets each way only as long as they are all 1500-byte packets.

Because packet sizes vary greatly, this automatically means that you’re not going to get 1500 pps each way.

To test it, change the link test packet size from 1522 to, say, 64 and look at the difference in the amount of data you can pass. According to MOTO, most congestion problems aren’t caused by a PPS issue; it’s because the packets are small, which makes the wireless link inefficient.
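To put numbers on the packet-size effect, here is the plain arithmetic for the data carried at a fixed packet rate, as a short Python sketch. It uses the 1500 pps per direction figure quoted above and the two link test sizes (64 and 1522 bytes). It ignores the link's own bit-rate ceiling and any per-packet radio overhead, so it shows only the packet-rate side of the picture; the function name is made up for illustration.

```python
def throughput_mbps(pps, packet_bytes):
    """Data rate in Mbps carried by `pps` packets of `packet_bytes` each."""
    return pps * packet_bytes * 8 / 1_000_000


for size in (64, 1522):
    print(f"{size:>4}-byte packets at 1500 pps: {throughput_mbps(1500, size):.3f} Mbps")
```

At 64 bytes the same packet rate carries well under 1 Mbps, while at 1522 bytes it would carry about 18 Mbps; if the link's bit rate is lower than that, the bit rate becomes the limit before the packet rate does.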

erkan · October 2, 2007, 10:23pm

Frothingdog.ca wrote:

However, I believe that’s 1500 packets each way only as long as they are all 1500-byte packets.

I think that 3000 pps should pass no matter the packet size. If you have packets of 1522 bytes, then you can have 3000 pps and 14 Mbps.

I don't know the limit in pps in BH10, but knowing Motorola it will not be a surprise that they limited this number.

Frothingdog.ca · October 3, 2007, 11:34pm

Ya that’s what I thought as well. But MOTO told me differently.

LP1 · October 9, 2007, 10:15pm

Yeah, 3000 packets is 3000 packets. The thing is, for a given amount of information, using smaller packets is going to generate many more pps to send the same thing that would only take, say… 20 packets at 1500.

I hope that came out right