Anyone else using EAP-TTLS with ePMP?

Which RADIUS and EAP-TTLS implementation are you using? We demo’d a Windows-based system but had nothing but issues. Went back to freeradius3 on Debian and can’t say there are any EAP-TTLS issues.

We do have the odd radio stop passing data here and there, but it is not specific to a hardware generation and happens a lot less with 4.6.0.1 on everything. We have a mix of e1k and e3kL APs and just about one of every type of SM out there except the f190s.

We are also moving away from nat at the radio as we provide managed wifi to a lot of our clients now, makes things easier for us too.

I’ll have to check specific releases, but IIRC, we’re running CentOS and Free RADIUS from about 5 years ago. Built the VM when we started moving to ePMP and that is its sole purpose.

The CentOS version of freeradius had issues (yours is CentOS-based, IIRC).

My suggestion: If you are using an sql backend, spin up a debian 10 vm and test it.

For the archives…

Late to the party here, but we found that the N-based devices (Force 100, ePMP1000, etc.) would complete radius requests if you ignored CHAP just fine. But if you don’t actually complete the CHAP challenge/handshake on AC-based devices, they go into an infinite loop and never connect. Our config was completely ignoring the username/password CHAP challenge and just authorizing on EAP and MAC, so we ran into the issue with the first Force300s deployed.
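For anyone unsure what "completing the CHAP challenge" involves on the server side, here is a minimal sketch of the standard CHAP check from RFC 1994: the response is an MD5 digest over the identifier octet, the shared password, and the challenge. The function names and the sample password are made up for illustration; this is not FreeRADIUS code, and it does not show how the challenge is carried inside the TTLS tunnel.

```python
import hashlib
import hmac
import os


def chap_response(ident: int, password: bytes, challenge: bytes) -> bytes:
    """Compute a CHAP response per RFC 1994: MD5(identifier || secret || challenge)."""
    return hashlib.md5(bytes([ident]) + password + challenge).digest()


def chap_verify(ident: int, password: bytes, challenge: bytes, response: bytes) -> bool:
    """Recompute the expected response and compare it in constant time."""
    expected = chap_response(ident, password, challenge)
    return hmac.compare_digest(expected, response)


# Hypothetical values for illustration only.
challenge = os.urandom(16)
response = chap_response(1, b"sm-password", challenge)

print(chap_verify(1, b"sm-password", challenge, response))     # correct password
print(chap_verify(1, b"wrong-password", challenge, response))  # wrong password
```

A config that authorizes on EAP and MAC alone never performs this comparison, which matches the behavior described above where the AC-based radios keep retrying.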

Hope that helps.

Never seen that on our N-based devices… Definitely true for the AC-based devices! Just tried it on the test network and it failed spectacularly! We use the MAC as the username, so this is probably why we didn’t see this before.

@khoff Are you saying that the F300 doesn’t connect at all or that it may intermittently get stuck in a loop?

Our issue is that there is no failure on the RADIUS server. The F300 stays connected to the AP, it just stops passing traffic. We’ve tested on multiple firmware versions, multiple AP revisions, multiple Linux OSes and multiple Free RADIUS builds. We’ve even simplified our responses to enable/disable (rather than extended responses with MIRs, etc.)

We can recreate this scenario easily.

F300 doesn’t disconnect from AP, so you can look and say “geez, this thing has been connected to the AP for 40 days, this customer must be crazy.” When we put more granular monitoring in place, we find both management and customer traffic will stop forwarding at random. Rebooting the unit restores connectivity or if you just wait until the next key exchange, it typically will start forwarding traffic again. The more traffic moving across the link, the more often it will stop forwarding traffic.
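The kind of check that catches this can be sketched roughly as follows: flag any interval where the SM stays registered but its byte counters stop moving. This is only an illustration of the idea, not our actual monitoring; the `Sample` fields, the polling source (for example, counters read from the AP side), and the 120-second threshold are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Sample:
    timestamp: float   # seconds since some epoch
    registered: bool   # SM shows as connected on the AP
    rx_bytes: int      # hypothetical per-SM counters polled from the AP
    tx_bytes: int


def find_stalls(samples: List[Sample], min_stall_seconds: float = 120.0) -> List[Tuple[float, float]]:
    """Return (start, end) intervals where the SM stayed registered but no bytes moved."""
    stalls: List[Tuple[float, float]] = []
    stall_start: Optional[float] = None
    prev: Optional[Sample] = None
    for s in samples:
        if prev is not None:
            moved = s.rx_bytes != prev.rx_bytes or s.tx_bytes != prev.tx_bytes
            if s.registered and prev.registered and not moved:
                # Counters are frozen while the link still looks up.
                if stall_start is None:
                    stall_start = prev.timestamp
            else:
                # Traffic resumed or the SM dropped: close any open stall.
                if stall_start is not None and prev.timestamp - stall_start >= min_stall_seconds:
                    stalls.append((stall_start, prev.timestamp))
                stall_start = None
        prev = s
    # Close a stall that is still open at the end of the data.
    if prev is not None and stall_start is not None and prev.timestamp - stall_start >= min_stall_seconds:
        stalls.append((stall_start, prev.timestamp))
    return stalls
```

Comparing counters for change rather than growth means a counter reset still counts as traffic, which keeps reboots from being reported as stalls.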

In the case I was describing, the F300 would connect enough to begin the TTLS handshake, but on the RADIUS server, you would see an endless loop of the same RADIUS requests.

What you’re describing sounds like a bug we reported when using the F300 with RADIUS and VLANs on earlier versions of the firmware (pre-4.5.0). When the F300 would re-authenticate after a few hours, it would stop passing traffic on the data VLAN. A power cycle would resolve the issue. Turned out that the script that runs on the radio to set up the VLANs was broken on re-authentication (it was passing the wrong arguments to brctl or something like that). I looked it up and that ticket was from May 2020. Ahh, here it is in the release notes from 4.5.4…

ACG-9628 Fixed an issue for when the data traffic is not passed via the DATA VLAN on a reconnecting SM whose wireless security setting is set to RADIUS(EAP-TLS).

Hope that helps.

Here are the scenarios we can break the radio in:

straight bridged.
bridged with separate mgmt and data vlans
nat with a shared mgmt & data vlan

We obviously haven’t tested every scenario. The commonality above is EAP-TTLS. If I switch to WPA2-PSK, problem goes away.

We are not passing VLAN data back in our RADIUS response. Those are manually programmed into the radio upon deployment.

Our ticket has been open since March 2020. I came here to see if anyone was successfully running RADIUS. Considering 2 users responded, my assumption is most users do PSK due to simplicity and either manually enable/disable customers, or run through a walled garden or Powercode BMU, etc.

Thank you for taking the time to respond.

In our case, it was bridged SMs, data/mgmt VLAN configured on the CPE (not by RADIUS VSA), and EAP-TTLS for auth. Switching to PSK resolved the issue in testing. Are you running 4.5.4 or newer on the SM? If not, start there.

If it helps, you can reference our ticket 214959.

We can reproduce the issue all the way up to 4.6.2.

Some of the top-level Cambium techs get auto-generated email blasts from us when our test units drop. They’re well aware there is an issue. I was hoping someone else had found a workaround, because “we’re working on it” as their monthly update is getting a little old after having this product in release for 3 years and us identifying the issue for them 2 years ago.

I think they might be going down UBNT’s old faithful road of “see we fixed it” by releasing a completely new product line, such as the 4000/400 series.

Out of curiosity, have you entered the management and data VLANs on the radio prior to using RADIUS EAP-TTLS? If not, you should, since RADIUS VSAs update the current config and it must not be null. We had that issue when we first set up EAP-TTLS on our network. Because our NASes are Cisco, we were digging in the Cisco documentation and found that config values must not be null: set to 0 is OK, but if they were null they wouldn’t get updated.

As for traffic just up and stopping without the wireless side dropping, we found the majority of our repeat offenders had power issues that would lockup part of the radios. Not 100% sure if that applies to your issue, but it is something to consider.

Yes, the VLANs are being populated prior to connecting.

I would think that if RADIUS was populating the VLANs there would be indications such as the vlan reporting differently, non-vlan populated bridged connections still working, etc.

The radios will self-correct if left alone. In my experience, that doesn’t happen with radios that lock up due to brown outs.

Also, we’ve ruled out power by running tests on battery backup.

Do you have smart speed enabled?
This little “feature” caused us a lot of grief until we disabled it. Similar issue to what you report: zero data passing but the radio has great uptime, a quick reboot solves it for now, and then it seems to randomly stop again.

After this, I would be guessing hardware issues, but that would need to be correlated with serial numbers and MAC addresses to see if you got a batch that’s acting up.

We can break it with or without smart speed being enabled. Furthermore, I would expect smart speed to break ethernet connectivity, not wireless interface connectivity.

We’ve supplied Cambium with all the MSNs. I think we had 11 out of 250 that were tagged for bad hardware.

It’s without any doubt 100% related to EAP-TTLS.

Your response months ago encouraged us to try harder, but I’m not sure what else we could do on our end.

Our last lab experiment was full-release RHL with fresh free RADIUS and a statically built database with only one AP aimed at it. Can’t get much simpler than that and I can easily break an F300-25 or F300-16 while F200 & F180 chug right along. This is with both a 2k as the AP and a 3K-L.

I think I have an idea now: there is a known bug in the RHL versions of freeradius. Spin up a Debian VM with freeradius3 and drop your certificates and the key into the correct folder. Somewhere on here I made a full post on how to make it work; find it and follow it.

This one, perhaps?

Thanks Simon,
Yes, that is the one I was thinking of. And yes, I know it’s not fully detailed, but it has the information needed.

We noticed our customers having different issues with Radius. And it depends on network configuration. We fix it one by one. Some are already fixed in 4.7.
It is better to open support tickets to let us know how exactly it does not work.
Thank you!

Andrii, 213502 has been opened since March 19, 2020. We’re exhausting all avenues of help.

I know about this case. The fix is scheduled after 4.7.1.
And please remove my address from the mailing list, because I get daily emails about your radios’ up/down state.