ePMP Radius, PPPoE over VLANS, Our setup!

Hi All,

First, this is a way to share some information that is lacking and to record how we got this working like our Canopy network as we convert to IPoE. It is not meant to be all-encompassing, nor the only way to do this; it is just one of many. We have found that the Cambium instructions are very high level, do not provide enough information, and in some cases give wrong information for this type of setup. To be honest, I do not think this method was ever intended or expected. The general community and the internet at large also do not cover this information in enough depth or detail for others to follow and get the same results. Whatever the reason, we are trying to make it make a bit more sense.

Things to note about this setup:

We do not provide a CPE router below the ePMP-SM. Our clients can purchase a WiFi router to use as a WiFi AP, which we program for them for free, or they can provide their own, which we also program as an AP for free.

The ePMP-SMs do not support RADIUS-based accounting of traffic, hence the need for another method of accounting for client traffic (we already had PPPoE controlled by RADIUS, so we made the ePMP devices do the same). Yes, we could have implemented 802.1ad (Q-in-Q) per-user VLAN nesting, but that would be a massive undertaking plus a truck roll to each client to reprogram the client's Wi-Fi router. The added overhead on the network routers was also not acceptable: there is no sense forklifting routers at $20,000 each just to gain the same performance level that you already have. (Now, if the ePMP SM could be used as an MPLS PE router, then we would have a valid business excuse for forklifting the network routers!)

Our network uses RFC 1918 addresses to overlay our management network on our routed network, and RFC 1918 addressing for the PPPoE bridge because of the limited supply of IPv4 addresses. (IPv6 is not supported on Canopy equipment, and at the time ePMP did not support IPv6 either.) Clients that want a static public IP are set up as a blind bridge to the SM, which is not covered in this post. If they just want to host their own server, we provide a discounted VPS hosting solution.

We are using Freeradius 3.0 on Debian Linux. This is our choice: Debian does lag behind other flavors of Linux for updated software, but we like Debian's ease of use and stability, so the trade-off is acceptable.

The meat and potatoes:

Set up Freeradius 3.0 just like 2.0. There are many tutorials on Google for this, and the minor configuration differences are apparent and easy to understand.

You must create your own rootCA certificate and sign a private certificate with it. Cambium's certificates do work, but we had problems with them. Creating your own is better for security, and because you can set the certificate lifetime to 3650 days (10 years) you won't be cycling EAP certificates all the time, which can cause truck rolls if not done right. Make sure you convert the crt file to pem. RADIUS can use any format, but Canopy and ePMP radios want pem files.
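A .crt extension does not tell you whether the file is PEM text or binary DER, so a small check before uploading to the radio can save some head scratching. The sketch below uses only Python's standard ssl module; the function name to_pem is made up for illustration, and it assumes the input is a single X.509 certificate.

```python
import ssl

PEM_HEADER = "-----BEGIN CERTIFICATE-----"

def to_pem(cert_bytes: bytes) -> str:
    """Return the certificate as PEM text, converting from DER if needed."""
    text = cert_bytes.decode("ascii", errors="ignore")
    if PEM_HEADER in text:
        # Already PEM: the .crt extension says nothing about the encoding.
        return text
    # Otherwise assume binary DER and wrap it in base64 PEM armor.
    return ssl.DER_cert_to_PEM_cert(cert_bytes)
```

To use it, read the certificate file in binary mode, pass the bytes to to_pem, and write the returned text to a file ending in .pem.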

In the radius config, we chose to use an SQL backend, so we had to comment out all the references to files and uncomment all sql references. This is important as files are given higher priority than SQL look-ups.

Network setup:

We use Cisco routers and L3 switches, so our method may not be correct for your network.

First, on the gateway router, configure the outside IP address and default routing (this may not be a requirement if you choose not to default route).

Configure an IP pool for PPPoE and then configure your BBA-group and virtual-template.

On your network facing port(s) configure sub-interfaces for your vlans. Ensure you encapsulate dot1q to your vlan number. We found it easier to just use the same sub-interface number as the vlan.

Enable PPPoE on your selected vlan.
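As a rough illustration of the "sub-interface number equals VLAN number" convention, the Python sketch below renders an IOS-style sub-interface stanza. The interface name, VLAN 100 and the BBA-group name global are placeholders, and the three commands are the classic IOS forms as best understood here; check them against your own platform and software version before pasting anything into a router.

```python
def render_pppoe_subinterface(parent: str, vlan: int, bba_group: str = "global") -> str:
    """Render an IOS-style sub-interface stanza that enables PPPoE on one VLAN."""
    if not 1 <= vlan <= 4094:
        raise ValueError(f"VLAN ID out of range: {vlan}")
    lines = [
        f"interface {parent}.{vlan}",        # sub-interface number mirrors the VLAN ID
        f" encapsulation dot1Q {vlan}",      # tag traffic with this VLAN
        f" pppoe enable group {bba_group}",  # bind the PPPoE server (BBA group) here
    ]
    return "\n".join(lines)

# Placeholder values, for illustration only.
print(render_pppoe_subinterface("GigabitEthernet0/1", 100))
```

Generating the stanza from the VLAN number keeps the two numbers from drifting apart when you have several PPPoE VLANs.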

Network switches:

Create a trunk on an interface and allow all VLANs. Set up a VLAN interface for your management VLAN and assign a management IP to it. For each AP attached to this switch, set up a trunk port and allow only your management and PPPoE VLANs. If you use multiple PPPoE VLANs, allow all of them. For each wireless backhaul, create a trunk port and allow all VLANs.

At each tower, set up the switch the same way. Use access ports for tower-side management.

Now this is an L2 network that is routed only for PPPoE traffic. It also means you need larger-capacity backhauls closer to the gateway to carry all the PPPoE traffic.

At each tower, add a router and configure one interface as the PPPoE server (sub-interfaces work too if you have multiple PPPoE requirements) and the other as a full trunk. Set up your management VLAN interface and IP, then create a route from the PPPoE range to the gateway.

Block the PPPoE vlan from the main gateway to the tower switch on the backhaul interface.

On Canopy and ePMP APs, all you have to do is set the management VLAN.

Canopy SMs, under the VLAN tab:

The Default Port VID is the PPPoE vlan, set this to your PPPoE vlan

Provider VID is the remote management vlan, set this to your management vlan

Management VID is the local management VLAN; this allows local access to the management interface. If it is set to the management VLAN, you will need a laptop that can be set to that VLAN. If it is set to the PPPoE VLAN, your clients can see the login page. The choice is yours; we do both depending on the client. If we start seeing login attempts when no tech is on site, we remotely set the management VLAN to our tech VLAN and make a note in the client profile for our techs to see before they go on site.

Then on the NAT tab, enable nat and set to PPPoE, configure your client side IP and the DHCP range. We send the dns information in the IPCP handshake and the SM automatically passes this to the client. This allows us to enforce using our DNS servers which are on our network rather than passing all traffic to our upstream. This also allows us to jail a client that has an over-due account.

In the PPPoE tab we set the username, password, timer type, lower the timer period and set TCP MSS Clamping to enabled.

ePMP-SMs are quite different, though. Some things that are set and overridden by RADIUS must be set before the RADIUS VSAs are uploaded.

Set up the radio tab like normal, setting the scan channel list, and set power control to manual. (This makes initial tuning and link establishment easier. It is not required, but it helps on long links and on nLOS and NLOS links.)

Under QOS, no settings are needed but setting Traffic Priority enabled and enabling VOIP Priority allows better VOIP and streaming service (yes I know it doesn’t see streaming the same as VoIP but we have noticed that it does help if someone is torrenting while trying to stream)

Under Network, General box: set to NAT, Wireless IP to DHCP.

Ethernet Interface box: This is your client-side network setup. The one thing you must change is the Preferred DHCP DNS Server, which must be the same as the IP Address in this box.

In the Separate Wireless Management Interface box:

Enable Management IP and set it to static (for some reason it won't take a DHCP IP; it keeps complaining of an empty DHCP reply even though we use this DHCP server all over the network). Set your management IP, enable the management VLAN and set it to your management VLAN, and set the priority to 7 so that you won't get ignored if the SM is busy.

This next part seems odd but it won’t work unless it is done:

In the Virtual Local Area Network (VLAN) box: enable the data VLAN, set the VLAN ID to your PPPoE VLAN and set the priority. Save the config, then simply disable the data VLAN. The config stays but is not used. There seems to be a problem with RADIUS VSAs not being able to overwrite blank configuration information.

Under the PPPoE box:

Enable PPPoE. You can leave the Service Name and Access Concentrator both blank as they are not actually needed or used; because of how PPPoE works, the first AC to make an offer is used. Setting these parameters causes the SM to request that AC and service, but if they are not available it will take the first offer anyway (this could be a bug in the pppd version that Cambium has used; we did not see this issue when we tested the Debian 7 or 8 version of pppd on our network). Authentication was a bit of a stumbling block, as by default Cisco does not seem to accept PAP or CHAP even when configured. Cisco routers look for MSCHAP or MSCHAPv2 first, and PAP does not seem to be taken. If you set authentication to ALL, the SM asks the Cisco router for valid auth methods and the router replies with all auth methods configured. The problem is that MSCHAP and CHAP are different enough that they cannot authenticate against each other. Setting the authentication to CHAP tells the Cisco router to use basic CHAP, and authentication works properly. (We had similar behavior from the Debian 8 and 9 PPPoE server software. We do not use MikroTik or UBNT routers, so they have not been tested.)
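To see why plain CHAP and MSCHAP cannot stand in for each other, it helps to look at how a basic CHAP response is built. Per RFC 1994, the response is an MD5 hash over the identifier byte, the shared secret and the challenge; MSCHAP derives its response in a different way, so the two sides never produce matching values. The Python sketch below shows only the plain CHAP calculation, and the function names are made up for illustration.

```python
import hashlib
import hmac

def chap_response(identifier: int, secret: bytes, challenge: bytes) -> bytes:
    """Plain CHAP (RFC 1994): MD5 over identifier byte + secret + challenge."""
    return hashlib.md5(bytes([identifier]) + secret + challenge).digest()

def chap_verify(identifier: int, secret: bytes, challenge: bytes, response: bytes) -> bool:
    """Server side: recompute the hash from the stored secret and compare."""
    expected = chap_response(identifier, secret, challenge)
    return hmac.compare_digest(expected, response)
```

Note that the server needs the password in a form it can hash this way, which is why the RADIUS side stores a cleartext password for CHAP users.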

Set the username and password like normal. MTU size should be 1492 unless you need room for extra overhead. Keep-alive time is hard to quantify; we set it to 10 as this is a good refresh period for our PPPoE servers. The longer you set this, the longer it will take to recover a dropped session. It should be the same as or slightly less than the PPPoE server session time-out (ours is 15 sec with auto-cleanup). MSS clamping should be enabled so that TCP segments from the client LAN fit within the 1492-byte PPPoE MTU, but it is not strictly required.
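For anyone wondering where 1492 comes from, the arithmetic is short: PPPoE adds a 6-byte header and PPP adds a 2-byte protocol field inside a standard 1500-byte Ethernet payload. The Python sketch below works that out, along with the TCP MSS that clamping would aim for, assuming IPv4 and TCP headers without options (20 bytes each).

```python
ETHERNET_MTU = 1500     # standard Ethernet payload size
PPPOE_HEADER = 6        # PPPoE header bytes
PPP_PROTOCOL_ID = 2     # PPP protocol field bytes
IPV4_HEADER = 20        # IPv4 header without options
TCP_HEADER = 20         # TCP header without options

def pppoe_mtu(ethernet_mtu: int = ETHERNET_MTU) -> int:
    """Largest IP packet that fits in one PPPoE frame."""
    return ethernet_mtu - PPPOE_HEADER - PPP_PROTOCOL_ID

def clamped_mss(mtu: int) -> int:
    """Largest TCP payload that fits in one packet of the given MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER

print(pppoe_mtu())               # 1492
print(clamped_mss(pppoe_mtu()))  # 1452
```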

DMZ should not be enabled unless you need it.

Under advanced box, only NAT Helper For SIP should be enabled, disable all else.

Under the Security tab: enable RADIUS, make sure you have your WPA2 key entered (fallback mechanism)

Under the RADIUS box: set the eap-ttls username as you have it in your RADIUS config. This could be the SM MAC address or anything else, and it does not need to be unique. Unique usernames do have an advantage, though: if an SM goes missing, you can simply disable its EAP username in RADIUS without compromising the network.

Set your eap-ttls password.

The Authentication Identity String and Realm are only needed if you set up realms in RADIUS; otherwise they should be blank. This is important: if you're not using realms, make sure these boxes are blank.

Certificates tend to give new users the most problems. To make things work properly the first time, delete the Default and Canopy root certificates. They are not actually gone from the SM, they just can no longer be called. In the User Provisioned Root Cert 1 box, add the rootCA cert that you created for RADIUS. This must be a pem file. When you upload the certificate you may get a json file to save. This is normal; it is supposed to be a pop-up window that says whether the certificate upload succeeded or failed. Next, save your config. You should not get an error, but it does happen and is still happening in firmware 3.5. You can ignore it; your settings are saved, and the config counter is just not what is expected because of the certificate upload.

Now power cycle the SM. This is required to set several configurations and make the SM available to radius for control. Now when the SM authenticates to an AP, it will get its bandwidth profile from radius. Which leads to the last part.

We used Daloradius as our initial SQL primer. Let's face it, Liran Tal has done a lot of hard work to make it fairly easy to get the right information into the SQL backend and to maintain that data. We still use it to set new parameters, but we also use a friendlier PHP-based web page that we wrote to manipulate basic information. We imported the dictionary into Daloradius and created a profile for each package and for each radio type, so Canopy radios have Canopy packages and ePMP radios have ePMP packages. Canopy radios only need the QOS settings in the reply; ePMP radios in our setup required more:

Cambium-ePMP-VLIGVID client side network port, should be the PPPoE vlan

Cambium-ePMP-VLMGVID PPPoE client uses this vlan to find the PPPoE server, set to the PPPoE vlan.

Cambium-ePMP-ULMIR this is the upstream bandwidth from the client’s perspective

Cambium-ePMP-DLMIR this is the download bandwidth from the client’s perspective

Cambium-ePMP-VLManagPVID should be 7

Cambium-ePMP-VLDataPVID should be less than 7 but more than 3

Cambium-ePMP-VLMG2VID remote management vlan, set this to the management vlan

Cambium-ePMP-VLMG2PVID should be 7

Each attribute should use the += operator, and the target needs to be reply. By default all users have Fall-Through = Yes as one of the attributes sent. This is important to understand, because without it only one VSA would be applied, and which one is anyone's guess as it is different every time. The += operator tells RADIUS to append each VSA to the reply rather than replace what is already there, so the SM receives every VSA in the profile.
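To make the profile concrete, here is a rough Python sketch of what the reply rows for one ePMP profile could look like. It assumes Daloradius profiles end up as rows in the stock FreeRADIUS radgroupreply table (groupname, attribute, op, value), and it uses an in-memory SQLite database as a stand-in for the real SQL server. The profile name, VLAN IDs, rates and the data priority of 5 are placeholders; only the attribute names and the priority of 7 come from the list above.

```python
import sqlite3

# Placeholder values for illustration only.
PROFILE = "ePMP-Example"
PPPOE_VLAN = 100
MGMT_VLAN = 10

REPLY_ATTRS = [
    ("Cambium-ePMP-VLIGVID", PPPOE_VLAN),    # client side network port
    ("Cambium-ePMP-VLMGVID", PPPOE_VLAN),    # VLAN the PPPoE client searches on
    ("Cambium-ePMP-ULMIR", 2000),            # upstream rate (placeholder value)
    ("Cambium-ePMP-DLMIR", 10000),           # downstream rate (placeholder value)
    ("Cambium-ePMP-VLManagPVID", 7),
    ("Cambium-ePMP-VLDataPVID", 5),          # less than 7 but more than 3
    ("Cambium-ePMP-VLMG2VID", MGMT_VLAN),    # remote management VLAN
    ("Cambium-ePMP-VLMG2PVID", 7),
]

def load_profile(db, groupname, attrs):
    """Insert one reply row per attribute, each with the += operator."""
    rows = [(groupname, name, "+=", str(value)) for name, value in attrs]
    db.executemany(
        "INSERT INTO radgroupreply (groupname, attribute, op, value) "
        "VALUES (?, ?, ?, ?)",
        rows,
    )
    return len(rows)

# In-memory stand-in for the RADIUS SQL backend.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE radgroupreply (id INTEGER PRIMARY KEY, "
    "groupname TEXT, attribute TEXT, op TEXT, value TEXT)"
)
inserted = load_profile(db, PROFILE, REPLY_ATTRS)
print(inserted)
```

Every row carries the same group name and the += operator, which is what lets all eight attributes go out in a single reply.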

When creating users, the only attribute needed is the check attribute Cleartext-Password, and it must use the := operator. All other user configuration is done in profiles and plans. A profile is a bandwidth/network configuration, and a plan is the accounting for the user's data transfer.
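In SQL terms, a new user then comes down to two rows. The Python sketch below assumes the stock FreeRADIUS radcheck and radusergroup tables and again uses in-memory SQLite as a stand-in; the username, password and profile name are placeholders, and add_user is a made-up helper, not part of Daloradius.

```python
import sqlite3

def add_user(db, username, password, profile):
    """Create the check row and attach the user to a profile (group)."""
    db.execute(
        "INSERT INTO radcheck (username, attribute, op, value) "
        "VALUES (?, 'Cleartext-Password', ':=', ?)",
        (username, password),
    )
    db.execute(
        "INSERT INTO radusergroup (username, groupname, priority) "
        "VALUES (?, ?, 1)",
        (username, profile),
    )

# In-memory stand-in for the RADIUS SQL backend.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE radcheck (id INTEGER PRIMARY KEY, "
    "username TEXT, attribute TEXT, op TEXT, value TEXT)"
)
db.execute(
    "CREATE TABLE radusergroup (username TEXT, groupname TEXT, priority INTEGER)"
)

# Placeholder credentials and profile name.
add_user(db, "client001", "example-password", "ePMP-Example")
```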

One last thing to note is when creating a NAS (AP that is allowed to talk to the radius server) set the NAS type to other unless the NAS is actually listed. This changes the expected behavior and format that the NAS expects. The IP must be the IP of the AP and the shared secret must match the radius secret in the AP. The short name is not actually needed but by setting the device name in the AP configuration and copying it to the NAS shortname, it speeds up the NAS lookup and authentication as well as allows you to know which AP you are changing the secret for.

With this setup, you do not need to reload the freeradius server every time there is a profile change, NAS (AP) addition, or new user added or an old one removed. If you are using files instead, then every time you make a change you must reload freeradius.

We also do our own data cap monitoring, thanks to the PPPoE servers sending proper accounting information. Having a script check each client's accumulated data against their profile cap and change the MIR profile that RADIUS sends to the SM is straightforward. We are still working on a way to automatically drop a wireless connection when the data cap is exceeded, but for now we receive an email and manually deregister the client. We have yet to get an SNMP trigger to send a disconnect message, as we don't know what the management IP is until we log in to the AP and see it. We have looked at recording the IP in the SQL table for the user, but that has met with interesting issues that cause other things to break, and when the IP was not updated properly we have had the wrong client drop out.

We have another script that looks for a flag in the SQL tables, then clears the counter and restores the profiles automatically. The counter is added to a new table that keeps a history of total bandwidth per month for each client, which lets us trend each client's usage and contact clients whose package is not right for their needs. Each AP is rebooted by SNMP at the start of every billing cycle, causing all clients to re-authenticate, pick up the correct profiles and return to full speed.

We leave you to create your own scripts and methods for enforcing data caps. We will not be providing our scripts as they are custom to our setup and not portable (we've had trouble moving from one version of Freeradius and SQL server to the next).
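For readers starting from scratch, the cap check described above could be sketched roughly as follows. This is not the script referred to above, only an illustration of the idea. It assumes the stock FreeRADIUS radacct and radusergroup tables, uses in-memory SQLite as a stand-in, and invents the profile names, the cap size, the user names and the billing-cycle date. It also ignores sessions that started before the cycle began.

```python
import sqlite3

# Hypothetical profile names and caps, for illustration only.
CAPS_BYTES = {"ePMP-Example": 100 * 10**9}   # profile -> monthly cap in bytes
THROTTLED_PROFILE = "ePMP-Throttled"

def over_cap_users(db, cycle_start):
    """Return usernames whose traffic since cycle_start exceeds their cap."""
    usage = db.execute(
        "SELECT username, SUM(acctinputoctets + acctoutputoctets) "
        "FROM radacct WHERE acctstarttime >= ? GROUP BY username",
        (cycle_start,),
    ).fetchall()
    over = []
    for username, total in usage:
        row = db.execute(
            "SELECT groupname FROM radusergroup WHERE username = ?",
            (username,),
        ).fetchone()
        # Profiles without a cap entry (including the throttled one) are skipped.
        cap = CAPS_BYTES.get(row[0]) if row else None
        if cap is not None and total > cap:
            over.append(username)
    return over

def throttle(db, username):
    """Move the user to the throttled profile for the next authentication."""
    db.execute(
        "UPDATE radusergroup SET groupname = ? WHERE username = ?",
        (THROTTLED_PROFILE, username),
    )

# In-memory stand-in with made-up accounting data.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE radacct (username TEXT, acctstarttime TEXT, "
    "acctinputoctets INTEGER, acctoutputoctets INTEGER)"
)
db.execute(
    "CREATE TABLE radusergroup (username TEXT, groupname TEXT, priority INTEGER)"
)
db.executemany(
    "INSERT INTO radusergroup VALUES (?, ?, 1)",
    [("alice", "ePMP-Example"), ("bob", "ePMP-Example")],
)
db.executemany(
    "INSERT INTO radacct VALUES (?, ?, ?, ?)",
    [
        ("alice", "2024-01-05 10:00:00", 90 * 10**9, 20 * 10**9),
        ("bob", "2024-01-06 09:00:00", 1 * 10**9, 2 * 10**9),
    ],
)

flagged = over_cap_users(db, "2024-01-01 00:00:00")
for user in flagged:
    throttle(db, user)
print(flagged)
```

The new profile only takes effect when the SM next authenticates, which is why dropping the wireless session is still needed to enforce the cap right away.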

This may not be the only way, nor the right way. Just one that works to make ePMP behave like Canopy. Our plan is to remove PPPoE as soon as ePMP supports Radius Accounting with Start-interim-stop so we can check data cap and adjust profiles as required. A method for Packet-Of-Death for the SM wireless connection and an SM authentication retry timer variable would be nice too, yes that was a request dig!

A word of warning: this is our setup and is not meant to replace yours. It is meant to help you on your road to a functioning system that you make for yourself. Users that disapprove of this method, please don't comment. We know this is not a nice way of doing things, but it's what WE have and I am sharing a method and a way. That is all.

Users who have changes or modifications, please share. I would like to see this knowledge grow and persist rather than be thought of as some SysAdmin black magic or network voodoo.