Lost AP Connectivity - Strange UPDATE!!

I just started up two new sites - nowhere near each other (over 90 km apart). Both sites have the same basic configuration - 45 Mbps BH in and out of the site, CMM Micro, 6-2.4 Advantage AP’s, 1-900 Advantage AP with Omni antenna, managed switch in the building below, utility power with UPS. Both sites have different color codes, VID’s and subnets. One of the sites works 100% - been testing for about 2 weeks now and no problems found. The other is a pain in the ass!

That’s not entirely true. Everything works fine except for accessing the AP’s. At first I thought it was the infamous “web server problem”. But I am using a 10.61 subnet. Also, it’s every AP. If that’s not enough I can’t even telnet into them once this happens. When they are first powered up everything is fine, connections are normal, access to all config pages is there. After a few minutes I can’t talk to them at all. Yet they still return pings and respond to SNMP inquiries. They pass SM packets no problem. All I can do is cycle power via the CMM.

I have no packet filtering turned on at these sites. Using 7.0.7 on every AP and SM. VLAN enabled. Software scheduling. I can’t think of anything that would cause this.

Anyone have any ideas? The CMM maybe? Weird.

Thanks everyone.

Aaron

What about accessing the configuration pages of the SM’s (through the Lan1 Network Interface Configuration/RF private) when the AP is not accessible? If those can be accessed, I am at a loss. If they cannot be accessed, what about looking at your managed switch? Could it be having ARP issues of some sort?

I’m using NAT on the SM’s and don’t have the RF Private interface turned on. I suppose I could turn it on on a couple of SM’s and see what happens.

Aaron

Okay, I have done a bunch of testing on this and more to come. I have replaced the CMMmicro as well as the managed switch this week - same issue.

When the AP’s first boot everything acts normal. After a few minutes (anything from 5 to 20) I lose http and telnet access to them. I still have access to the CMM and all traffic passes normally to the SM’s. The AP’s still return pings the entire time. If I keep accessing the web pages (ie. autoupdate) they keep responding. Well, the 1/2 hour test I did showed that anyways.

I have the original switch and CMMmicro in my office now. I’m going to program a couple AP’s exactly the same as at site and see if I can recreate the problem.

I am very frustrated with this. :x

Aaron

Aaron,

Where on your network are you trying to access the AP’s from when you experience this problem? How are your VLAN’s setup on both the AP’s and the managed switch? The micros are called “managed” switches but the one feature that I believe they are lacking is the ability to view the ARP cache table. That would probably be a helpful tool with your problem.

Have you tried connecting one of the AP’s directly to a laptop and seeing if the problem persists? What about the ftp server in the AP’s, did you try seeing if it still runs when you can’t access 80 or 23?
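
For anyone chasing the same symptom (pings answered, but no http, telnet or ftp), here is a minimal sketch of a check that tries a TCP connect to each of those services. The function names are made up for illustration and it assumes the AP uses the standard ports 21, 23 and 80. It only tells you whether a TCP connection can be opened, not why it fails.

```python
import socket

# Services mentioned in this thread, on their standard ports (an assumption).
MANAGEMENT_PORTS = {"ftp": 21, "telnet": 23, "http": 80}


def probe_port(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def probe_ap(host, ports=MANAGEMENT_PORTS, timeout=2.0):
    """Return a dict mapping each service name to True/False reachability."""
    return {name: probe_port(host, port, timeout) for name, port in ports.items()}
```

Calling `probe_ap("10.61.0.10")` (a hypothetical address in the 10.61 range) right after a reboot and again every minute or so would show when the services stop answering, and whether all three go at the same time.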

Matt

It doesn’t seem to matter where I access them from - right at the bottom of the tower while I’ve been working on it. For management I use VID 1 but all packets are untagged (just using port based for management, and all customer traffic is segmented up with tagged packets).

Haven’t tried connecting directly to an AP yet - going back to site today. Just tried to connect to the ftp servers and nothing. I have the original CMM and switch in my office, tried recreating the problem yesterday and no go - but I don’t have the backhauls connected here or have any subscribers (although with this new site there’s only two on there anyway).

I am going back to site right away so I will try a few more things - have to sort this out soon.

Thanks for your suggestions.

Aaron

Okay, lots of time wasted here for what turned out to be a stupid problem.

After a lot of f&cking around, I narrowed the problem down to my management VLAN - keep that from accessing the AP’s and all is good. Hmm. :?:

Okay, further digging. I use Solarwinds to manage all of my gear via SNMP. I remember when I added these new AP’s to the list I had problems seeing the interfaces and supported MIBs. At the time I blamed it on the network to that site and ignored it. Got to thinking about it again Friday afternoon when I got back to my office. I turned off the SNMP polling for those radios, rebooted all of them and they ran damn fine all weekend. WTF!!! :?

More pissing around this morning trying to figure this out. I have a couple AP’s in my office (a mess right now, well, always) and configure them like they are at that site - freqs, color codes, VID’s, next couple in the subnet. Same thing - no talkie talkie after adding them to SNMP server. I try rebooting lots of stuff - everything. Same thing. :evil: Son of a …!

Then I get to looking, step by step, through the configs of the problem AP’s and one other site - which I know I’ve done already.

So here’s the explanation…

On my managed switches, to NOT configure the default gateway I CAN’T enter zeros - I have to enter the switch’s management address in as the gateway and it zeros that and accepts it. In the Canopy radios, you CAN enter all zeros. I entered the AP’s own address in the Default Gateway field (only on this site, all others are zero) for some reason (thinking switches). Everything worked fine until the first SNMP poll by the server - then all hell breaks loose, or locks up! :twisted: Change that to zeros (or the real gateway, or anything else) and it all works. Weird anomaly. :wink:
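
If you keep your radio settings in a spreadsheet or export, a quick sanity check along these lines would catch the same mistake before it bites. This is only a sketch: the function names, the module names and the 10.61.0.x addresses are made up for illustration, and it does not read anything from the radios themselves.

```python
import ipaddress


def gateway_points_to_self(module_ip, default_gateway):
    """Return True if the default gateway is set to the module's own address."""
    return ipaddress.ip_address(module_ip) == ipaddress.ip_address(default_gateway)


def find_self_gateways(modules):
    """Return the names of modules whose gateway equals their own IP.

    modules is an iterable of (name, ip, gateway) tuples.
    A gateway of 0.0.0.0 means "none configured" and is never flagged.
    """
    return [name for name, ip, gw in modules if gateway_points_to_self(ip, gw)]


# Hypothetical inventory for illustration only.
inventory = [
    ("AP1", "10.61.0.11", "0.0.0.0"),     # no gateway configured
    ("AP2", "10.61.0.12", "10.61.0.12"),  # gateway set to the AP itself
    ("AP3", "10.61.0.13", "10.61.0.1"),   # a real gateway
]
print(find_self_gateways(inventory))
```

With the made-up inventory above, only AP2 is flagged.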

So there you have it, Sports Fans. Problem solved. Testing is still in progress, but I think all will be well.


Happily,

Aaron

So if you don’t want to set a default gateway for a Canopy module, what do you enter in the Default Gateway field? All zeros? I think I have tried this before with no success?

That’s correct - if you don’t want to enter a Default Gateway, enter all zeros (0.0.0.0) in the field. The AP’s will accept this as will the SM’s and BH’s.

Aaron

It is SNMP. There is a major bug with the Canopy units and SNMP. We were using Intermapper and our SMs would work fine for a while then freak out and go crazy with session counts and session reregs. We switched to Nagios, same thing. We turned off all SNMP and the network has worked fine!!!

Intermapper admits they are seeing this in different markets, but we have proven that it is SNMP and Canopy regardless of what network management tool you are using.

One other thing for you to check, do you have Allow Only Tagged Frames enabled? If so, you will lose 100% connectivity to the AP. We proved this ourselves. Not a good thing to turn on.

I do not have SNMP turned on on the SM’s so I never had a problem there. Other than these AP’s I have never had a problem with SNMP stuff on Canopy. Things have been fine now for a couple days (yesterday and today) at that site so I’m happy.

I never had the Allow Only Tagged Frames turned on as I am running some gear that does not support 802.1Q VLAN tagging. For some reason I think I tried turning this on (using a VLAN capable switch to talk to them) and the AP’s would not respond to the tagged packets. Weird. Maybe I’ll try on a spare AP today if I get a chance.
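
For reference when testing this on a spare AP, here is a rough sketch of how to tell a tagged frame from an untagged one in a raw capture. It relies only on the standard 802.1Q layout (TPID 0x8100 right after the two MAC addresses, VLAN ID in the low 12 bits of the next two bytes). The function name is made up, and this says nothing about how the Canopy firmware handles the frames; it is just a way to confirm what the switch is actually sending.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag


def vlan_id(frame):
    """Return the 802.1Q VLAN ID of a raw Ethernet frame, or None if untagged."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    # Bytes 12-13 follow the destination and source MAC addresses.
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != TPID_8021Q:
        return None
    if len(frame) < 18:
        raise ValueError("truncated 802.1Q tag")
    # Tag control field: 3 bits priority, 1 bit DEI, 12 bits VLAN ID.
    (tci,) = struct.unpack("!H", frame[14:16])
    return tci & 0x0FFF
```

An untagged management frame returns `None`, while a frame tagged for VID 1 returns `1`.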

Aaron