I didn’t mean to post in this section, but oh well… Maybe the staff can move it over to general discussion?
I was wondering if anybody has seen something like this happening on their network. The symptom is that at intermittent times, for about one to two minutes, an entire cluster’s throughput goes to 0. By that I mean no data can be transferred between SM and AP. I found this to be caused by a TON of broadcast packets from my SMs.
We have a Canopy cluster of 4 APs (5250), with about 400-500 SMs. The busiest AP has 159 subscribers. We run the HW scheduler, and the SMs are a mix of P7, P8, and P9. All run the latest 7.2.9 firmware. Almost every SM is attached to a consumer router from Linksys, D-Link, Belkin, etc. We use private addressing and a central Linux-based NAT engine.
I have recently logged packets from our bandwidth-hogging broadcast storm. The cause is a SINGLE packet from one of these “consumer” routers. I have a binary pcap file available if anybody is interested in trying to figure out WHY this happens, but here’s a quick breakdown…
One router sends out an “igmp v3 report” message to an L2 multicast address. Canopy treats this as a broadcast, and every SM gets it. Next, many other routers respond with an “icmp protocol 2 unreachable” message. What’s worse, those icmp replies are addressed to an ethernet broadcast! So every SM hears that too! The effect is that all bandwidth on the whole cluster (since it’s switched in a CMM) goes to 0 for the length of the storm.
Here are the first few packets of the storm. There are over 400 icmp replies to ethernet broadcast per second. I counted two igmp report packets at the start of each storm, spaced one second apart.
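For anyone wondering where the L2 multicast address comes from: an IPv4 multicast group maps onto an ethernet address by placing the low 23 bits of the group address under the fixed 01:00:5e prefix. A minimal Python sketch of that mapping follows; the function name `multicast_mac` is just an illustrative choice, not part of any Canopy or router tooling.

```python
import ipaddress


def multicast_mac(group: str) -> str:
    """Return the ethernet multicast MAC for an IPv4 multicast group."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    # Only the low 23 bits of the group address are carried into the MAC.
    low23 = int(addr) & 0x7FFFFF
    octets = [
        0x01, 0x00, 0x5E,          # fixed prefix for IPv4 multicast
        (low23 >> 16) & 0x7F,      # top bit of this octet is always 0
        (low23 >> 8) & 0xFF,
        low23 & 0xFF,
    ]
    return ":".join(f"{o:02x}" for o in octets)


if __name__ == "__main__":
    # 224.0.0.22 is the group that IGMPv3 reports are sent to.
    print(multicast_mac("224.0.0.22"))  # 01:00:5e:00:00:16
```

That result matches the destination MAC of the igmp v3 report in the capture, which is the frame the Canopy gear floods to every SM.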
16:33:03.072748 00:40:ca:38:1a:75 > 01:00:5e:00:00:16, ethertype IPv4 (0x0800), length 60: IP 172.26.40.113 > 224.0.0.22: igmp v3 report, 1 group record(s)
16:33:03.084267 00:06:25:9a:98:64 > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 74: IP 192.168.1.1 > 172.26.40.113: icmp 40: 224.0.0.22 protocol 2 unreachable
16:33:03.096916 00:0f:b5:25:b3:81 > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 74: IP 192.168.1.1 > 172.26.40.113: icmp 40: 224.0.0.22 protocol 2 unreachable
16:33:03.111053 00:0f:b5:ec:f6:83 > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 74: IP 192.168.1.1 > 172.26.40.113: icmp 40: 224.0.0.22 protocol 2 unreachable
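If you want to check your own capture for the same pattern, here is a rough Python sketch that counts the broadcast-addressed “protocol 2 unreachable” replies per second. It assumes text output in the same format as the lines above (tcpdump with link-level headers printed), and it tolerates records that have been wrapped across lines. The names `join_wrapped` and `storm_counts` are made up for this example.

```python
import re
from collections import Counter

# Start of a record: timestamp, source MAC, ">", destination MAC, comma.
RECORD_START = re.compile(r"^(\d{2}:\d{2}:\d{2})\.\d+ (\S+) > (\S+),")
BROADCAST = "ff:ff:ff:ff:ff:ff"


def join_wrapped(lines):
    """Glue continuation lines back onto the record they belong to."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if RECORD_START.match(line) or not records:
            records.append(line)
        else:
            records[-1] += " " + line
    return records


def storm_counts(text):
    """Count broadcast ICMP 'protocol 2 unreachable' replies per second."""
    counts = Counter()
    for record in join_wrapped(text.splitlines()):
        match = RECORD_START.match(record)
        if not match:
            continue
        second, _src_mac, dst_mac = match.groups()
        if dst_mac == BROADCAST and "protocol 2 unreachable" in record:
            counts[second] += 1
    return counts
```

Feeding it the text of a capture and looking for seconds with counts in the hundreds should point at the start of each storm; the threshold you alarm on is up to you.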
Now, the “IPv4 Multicast” filter on the advanced page WILL prevent this, because it blocks the very first “igmp v3 report” packet. However, a few of our installers regularly forget to set the filters, so the storms persist.