Help with ARP broadcast

I used Ethereal to capture broadcast packets on my network and found lots of ARP broadcasts. Below is a sample of the capture file, which was taken for 60’. Most packets have a different source IP (my radio IPs) but the same info: 'Who has 10.0.0.254?'

I am no expert on this issue and I need all the help I can get to fix this problem, which is causing my internet traffic to be so slow.

Please help
Thanks
Greg

No. Time Source Destination Protocol Info
1 0.000000 10.0.0.101 Broadcast ARP Who has 10.0.0.254? Tell 10.0.0.101

Frame 1 (60 bytes on wire, 60 bytes captured)
Ethernet II, Src: 3a:00:3e:f1:c4:7a, Dst: ff:ff:ff:ff:ff:ff
Address Resolution Protocol (request)

No. Time Source Destination Protocol Info
2 0.002701 10.0.0.84 Broadcast ARP Who has 10.0.0.254? Tell 10.0.0.84

Frame 2 (60 bytes on wire, 60 bytes captured)
Ethernet II, Src: 3a:00:3e:f2:88:ad, Dst: ff:ff:ff:ff:ff:ff
Address Resolution Protocol (request)

No. Time Source Destination Protocol Info
3 0.003362 10.0.0.85 Broadcast ARP Who has 10.0.0.254? Tell 10.0.0.85

Frame 3 (60 bytes on wire, 60 bytes captured)
Ethernet II, Src: 3a:00:3e:f0:dd:da, Dst: ff:ff:ff:ff:ff:ff
Address Resolution Protocol (request)

No. Time Source Destination Protocol Info
4 0.006565 10.0.0.158 Broadcast ARP Who has 10.0.0.254? Tell 10.0.0.158

Frame 4 (60 bytes on wire, 60 bytes captured)
Ethernet II, Src: 3a:00:3e:f1:c3:a3, Dst: ff:ff:ff:ff:ff:ff
Address Resolution Protocol (request)

No. Time Source Destination Protocol Info
5 0.007660 10.0.0.146 Broadcast ARP Who has 10.0.0.254? Tell 10.0.0.146

Frame 5 (60 bytes on wire, 60 bytes captured)
Ethernet II, Src: 3a:00:3e:f1:c3:a4, Dst: ff:ff:ff:ff:ff:ff
Address Resolution Protocol (request)

No. Time Source Destination Protocol Info
6 0.008585 10.0.0.111 Broadcast ARP Who has 10.0.0.254? Tell 10.0.0.111

Frame 6 (60 bytes on wire, 60 bytes captured)
Ethernet II, Src: 3a:00:3e:f1:c3:f7, Dst: ff:ff:ff:ff:ff:ff
Address Resolution Protocol (request)

No. Time Source Destination Protocol Info
7 0.012541 10.0.0.239 Broadcast ARP Who has 10.0.0.254? Tell 10.0.0.239

Frame 7 (60 bytes on wire, 60 bytes captured)
Ethernet II, Src: 3a:00:3e:f2:88:c3, Dst: ff:ff:ff:ff:ff:ff
Address Resolution Protocol (request)

No. Time Source Destination Protocol Info
8 0.014465 10.0.0.71 Broadcast ARP Who has 10.0.0.254? Tell 10.0.0.71

Frame 8 (60 bytes on wire, 60 bytes captured)
Ethernet II, Src: 3a:00:3e:f1:8e:09, Dst: ff:ff:ff:ff:ff:ff
Address Resolution Protocol (request)

No. Time Source Destination Protocol Info
9 0.015515 10.0.0.247 Broadcast ARP Who has 10.0.0.254? Tell 10.0.0.247

Frame 9 (60 bytes on wire, 60 bytes captured)
Ethernet II, Src: 3a:00:3e:f6:fd:fa, Dst: ff:ff:ff:ff:ff:ff
Address Resolution Protocol (request)

ARP traffic is pretty standard. It looks like .254 is your router? If it is, this is just standard stuff.

I would look for other traffic that could be causing your slowdown, like a lot of ICMP packets or IGMP requests.

As a percentage, how much of the traffic is ARP? Ethereal should give you those stats.
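If you would rather work the numbers out from a text export than from Ethereal's statistics window, here is a rough Python sketch. It assumes the summary lines look exactly like the ones in Greg's post (number, time, source, destination, protocol, info, with single-token source and destination columns); `protocol_stats` is just a name made up for this example. It reports each protocol's share of packets and an approximate packets-per-second rate over the sample.

```python
import re
from collections import Counter

# Matches a summary line from the text export, e.g.
# "1 0.000000 10.0.0.101 Broadcast ARP Who has 10.0.0.254? Tell 10.0.0.101"
SUMMARY = re.compile(r"^(\d+)\s+(\d+\.\d+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(.*)$")


def protocol_stats(text):
    """Return (share_by_protocol_in_percent, packets_per_second)."""
    counts = Counter()
    times = []
    for line in text.splitlines():
        m = SUMMARY.match(line.strip())
        if not m:
            continue  # skip column headers, blank lines and frame details
        times.append(float(m.group(2)))
        counts[m.group(5)] += 1
    total = sum(counts.values())
    share = {proto: 100.0 * n / total for proto, n in counts.items()}
    duration = max(times) - min(times) if len(times) > 1 else 0.0
    # Rough rate: packets seen divided by the time span of the sample.
    rate = total / duration if duration > 0 else 0.0
    return share, rate


sample = """\
No. Time Source Destination Protocol Info
1 0.000000 10.0.0.101 Broadcast ARP Who has 10.0.0.254? Tell 10.0.0.101
2 0.002701 10.0.0.84 Broadcast ARP Who has 10.0.0.254? Tell 10.0.0.84
3 0.003362 10.0.0.85 Broadcast ARP Who has 10.0.0.254? Tell 10.0.0.85
"""
share, rate = protocol_stats(sample)
print(share, round(rate))
```

Keep in mind that a percentage alone says little about load: a high ARP share of a nearly idle broadcast domain is harmless, so the packets-per-second figure over a longer capture is probably the more useful number.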

If .254 is your router’s IP address then yes, this is normal traffic. I’m not too familiar with using the Canopy radios as NAT appliances, but if an SM is put into NAT mode and you still have the option to set the Bridge Timeout parameter, try setting it to something high like 1440 minutes.

If the SM receives a packet and it needs to be forwarded out to the Internet, it needs to pass that packet to its default gateway, in this case your router, which in turn has a routing table which will know to route the packet out to the Net. The SM needs to know the router’s MAC address (it already knows the IP address, obviously, from its configuration, hence why it asks for the MAC address via an ARP broadcast) in order to fully assemble the packet before it sends it to the router. The SM can cache MAC addresses. The SM will look to its cached MAC addresses for an IP-to-MAC correlation FIRST. If it does not have one, it will broadcast.

So, the longer you can keep the MAC addresses in the SM cache, the less ARP broadcast traffic on your network.
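To illustrate the point, here is a toy Python model of a single host's cache entry for its gateway. It is only a sketch, not how the Canopy firmware actually behaves: `count_arp_broadcasts` and `cache_timeout` are hypothetical names, the entry is assumed to expire a fixed time after it was learned, and the traffic pattern is invented.

```python
def count_arp_broadcasts(send_times, cache_timeout):
    """Count the ARP requests one host sends for its gateway.

    send_times: times (in minutes) at which the host has a packet for the gateway.
    cache_timeout: minutes a learned MAC stays cached (hypothetical parameter
    standing in for the radio's timeout setting).
    """
    broadcasts = 0
    learned_at = None  # when the gateway MAC was last learned
    for t in sorted(send_times):
        if learned_at is None or t - learned_at >= cache_timeout:
            broadcasts += 1  # cache miss: broadcast "Who has <gateway>?"
            learned_at = t   # the reply refills the cache
        # otherwise: cache hit, send straight to the cached MAC
    return broadcasts


# Assumed traffic: one packet to the gateway every 10 minutes for a day.
traffic = range(0, 1440, 10)
short_timeout = count_arp_broadcasts(traffic, cache_timeout=25)
long_timeout = count_arp_broadcasts(traffic, cache_timeout=1440)
print(short_timeout, long_timeout)
```

Multiply the difference by a few hundred SMs and the effect of the timeout on broadcast volume becomes easier to see.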

I would still go with vj’s suggestion and look at what percentage of overall traffic is ARP broadcasts. Viruses that send data to and download data from the Internet would also cause excessive ARP traffic. Any packet that needs to be sent to your Internet router will result in an ARP broadcast if the MAC of the router is not cached. If you have a heavily populated network with a lot of browsing and a short MAC cache time, then this would appear to be normal. If the network is not used heavily, then I would check the percentage rates and do some further investigation.

Hope this helps.

When I start Ethereal I get ARP at approx. 65% and UDP at approx. 30%. 30 seconds later ARP and UDP are more or less the same, approx. 50/50. Of course we have other types, but they are negligible.
Greg

ARP is not slowing your network. It may be a bad RF link, interference, bad cable, mismatched ethernet interfaces, etc.

If you can describe your network, we can give you some ideas on how to isolate the bottleneck.

You might start by doing some link tests between APs and SMs, BHs, and remote APs to make sure that you are getting the expected throughput on each link.

The problem is not from the radio network because I ran my tests from the NOC. I connected my laptop to the same switch that the radio network and the internet gateway connect to. The internet remains slow and ‘heavy’. I suspect that one or more clients on the radio network are doing something that affects everyone’s internet connection (ARP broadcast, DoS, something). Browsing is OK, not the best but manageable, but if you try to download an attachment from Yahoo mail, for example, the throughput is 0.9 KB/sec, and that applies any time, day or night. Very frustrating.

I am installing SolarWinds Engineering Toolset today to monitor throughput on the radio network. I am also installing ntop and acidlab to help me understand my network better.

How big is your network? How many subscribers?

NAT enabled or disabled? Software version?

We have around 400 clients (400 SMs). The vast majority of SMs are doing NAT. Some SMs are connected to a router at the client end. A few are open (no NAT, no router) because the client PCs are setup as DHCP clients requesting IP from a DHCP server (prepaid platform gateway controller) located centrally at the NOC. We are running firmware 7.3.6 on all radio equipment.
Greg

If it’s consistently slow day and night, I would start to look more at hardware problems…

If it was a customer screwing stuff up, you would usually see it in bursts.
If it was a DoS attack, you would most definitely see it with Ethereal.

If it was an RF problem, I would think you would see it in bursts as well, and it would possibly get better at night.

If it was a bandwidth issue then, again, outside peak hours you wouldn’t notice the problem.

I would try rebooting everything, maybe one night leave everything off for 5 minutes and see if that improves anything.

I have seen switches/routers slow way down then fail

What is your connection to the Internet?

Having client PCs connected directly to your network is asking for trouble, as there is no insulation. You may be right, your network may be getting hammered with spew. You need to run either NAT on the SM or, if the customer needs advanced routing functions, a broadband router. We have had issues with Netgear, but DLink and Linksys seem to be pretty good.

FWIW - we tried SolarWinds - CactiEZ BLOWS IT AWAY (and it’s free with unlimited nodes). Monitoring, Thresholds, MAC Tracking, Weathermap, Canopy-specific templates, Cisco Router templates, NTOP built in (HUGE!), Web Server built in, runs on Linux for stability. We put it on an old Compaq DL360 and it’s chugging along just fine.

To try it out, grab a machine that is not in use and install CactiEZ. You can download the .iso here: http://cactiusers.org/ Scroll down to CactiEZ. You don’t need to know a damn thing about Linux (which was a requirement for me as I don’t have time to learn a new OS). It’s fully self-installing. All you need to do is change the password, set the IP address, and add devices.

We monitor uptime on all routers, servers, switches, BHs, APs, and business customers paying for monitoring. We graph the above except for switches (seemed redundant to me).

We are considering graphing all SMs so that we can quickly locate a spewing connection, but that’s a lot of monitoring bandwidth.
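For a back-of-the-envelope feel for that monitoring load, a small Python sketch follows. Every figure in it is an assumption for illustration (400 SMs, roughly 1,000 bytes of request plus response per poll, one poll every 5 minutes); real poll sizes depend on how many counters each graph pulls, and `polling_bandwidth_bps` is a made-up helper name.

```python
def polling_bandwidth_bps(devices, bytes_per_poll, interval_s):
    """Average bits per second spent polling `devices` once per interval."""
    return devices * bytes_per_poll * 8 / interval_s


# Assumed figures: 400 SMs, ~1,000 bytes per poll (request plus response),
# polled every 300 seconds.
estimate = polling_bandwidth_bps(devices=400, bytes_per_poll=1000, interval_s=300)
print(f"{estimate / 1000:.1f} kbit/s on average")
```

This is only an average. If the poller hits every SM in one burst at the start of each interval, the instantaneous load on an AP will be higher than this figure suggests.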

If you disconnect the canopy network from the router, reboot the router, and then run a speed test, does it get better?

Jerry’s suggestion is a good one. If you disconnect your Canopy equipment from your network and your bandwidth improves significantly, then you have some digging to do. If your download speeds are still slow, then you have an issue with your core switch or router.

If the speeds are still slow you could connect directly to your router to rule out your switch.

What kind of pipe to the Internet do you have? You could also telnet to your router and ping an IP address on the other side of your connection and see what your times look like, or check your interfaces for errors.

We just had an ARP issue that killed our network, found the switch at the main NOC was the issue, changed the switch and all was good.

Jerry, I graph all SMs and have not seen an overhead issue, but then again I am still quite small.

You may just be lucky :)

We had one a few months back that killed all the SMs on the AP. We have not really seen any broadcast storms since we started enabling the IPV4 Multicast filter.