ePTP Ping

Hello guys,
I have lots of ePTP links that work very well, but under load their ping times climb, instead of the 1-2 ms I see under low load.
Is there anything I can do to lower the ping times?

That's normal for any TDD device. As the frames fill with actual client traffic, pings sometimes have to wait for the next frame, which adds delay.

Is there anything I can do to get a better ping?

Depends on your capacity, future needs, etc. Odd spikes are usually caused by bursty traffic; flow control can cause them too. Check your switches for a lot of pause-frame ticks. If you see many, you can add a router to ease the bufferbloat. If you don't see them, or have flow control turned off, you can move to a bigger backhaul like the 650 (it has much shorter, faster frames than the ePMP and a lot more capacity), or use LACP and add another ePMP link (I personally don't do that, for a number of reasons).

If you're comfortable with routers and you've got routers capable of per-session queues, you can use them to slow things down a bit before traffic reaches your backhauls. TCP will try faster rates first (causing a sudden full frame) and get backed off by slow TCP ACKs. If a router slows the sessions down before they hit the hardware limits, you can keep your pings lower as your throughput climbs. It's not going to make a massive difference, but it can help. If you're at two-thirds of the backhaul's capacity, your pings are going to climb no matter what.

Keep in mind the frame structure in the ePMP line is designed to keep TCP traffic smooth rather than ICMP. You may be picking at a problem that isn't there. What kind of speed tests do you get to the Internet from behind and in front of that backhaul when it's busy?
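To illustrate why shaping sessions below the hardware limit keeps pings down, here is a minimal Python sketch of the classic M/M/1 queueing approximation (my own illustration with made-up capacity and packet-size numbers, not anything measured from ePMP gear). The point is the shape of the curve: queueing delay stays flat until utilization gets close to 100%, then climbs steeply.

```python
# Illustrative only: M/M/1 approximation of queueing delay as a link
# approaches saturation. Real TDD links behave differently in detail,
# but the shape of the curve is the point.

LINK_MBPS = 90.0          # assumed usable backhaul capacity
AVG_PKT_BITS = 1500 * 8   # assumed average packet size, in bits

service_rate = LINK_MBPS * 1e6 / AVG_PKT_BITS  # packets per second

for load_pct in (30, 50, 67, 80, 90, 95, 99):
    arrival_rate = service_rate * load_pct / 100.0
    delay_ms = 1000.0 / (service_rate - arrival_rate)  # mean time in system
    print(f"{load_pct:3d}% load -> ~{delay_ms:6.2f} ms average queueing delay")
```

Shaping sessions to, say, 80-90% of capacity keeps the link on the flat part of that curve instead of the steep part.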

Thank you for your detailed answer!

This is the traffic on this backhaul during a typical day:

[attachment: daily-1.gif (daily traffic graph)]

As you can see, its traffic goes up to a 70 Mbps five-minute average, so it can peak at about 90 Mbps.

It's a very short link (less than 200 meters) and its traffic is 99.9% MCS15 (from the Performance screen) on both download and upload.

I set the channel width to 40 MHz to have more bandwidth and to decrease ping times a little. It helped, but I can still see high jitter.

You can see the ping jumping from 1 ms to 12 ms, sometimes even 25 ms.
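Here is roughly how I'm measuring it: a quick Python wrapper around the system ping (assumes a Unix-style ping; 10.0.0.2 stands in for the far end of the link).

```python
#!/usr/bin/env python3
# Minimal jitter check: wraps the system ping and reports the spread.
# 10.0.0.2 is a placeholder for the far end of the link.
import re
import statistics
import subprocess

out = subprocess.run(
    ["ping", "-c", "100", "-i", "0.2", "10.0.0.2"],
    capture_output=True, text=True, check=True,
).stdout

rtts = [float(m) for m in re.findall(r"time=([\d.]+)", out)]
print(f"min/avg/max = {min(rtts):.1f}/{statistics.mean(rtts):.1f}/{max(rtts):.1f} ms")
print(f"stdev (jitter) = {statistics.stdev(rtts):.2f} ms")
```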

I would like a more stable ping without having to spend so much money on a PTP 450 or PTP 650, which I use on other links with far more traffic.

There are already routers at each end of the link, flow control is disabled, and traffic can't reach the max backhaul capacity.

Do you see any retransmission counts on either end? And how is your frame configured? 50/50, Flexible, etc.?

Flexible will give you the most bandwidth, but as usage climbs the scheduler will shift around a bit; sudden changes in the link's upload and download do that. In ePTP mode you don't have any control over it. In TDD mode you can fix a schedule and get more consistency in the pings. We hardly use ePTP mode, so that we can use GPS sync on our light PTP links.

For perspective, a 30-frames-per-second video stream is one frame every 33.3 ms, so some spikes to 25 ms shouldn't hurt. But if you're after smoothness above all else, try 75/25 or 50/50 in TDD mode with 2.5 ms frames. Your average time will go up, but it should be very consistent as long as you don't have retransmissions happening.
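To see why a fixed TDD schedule trades average latency for consistency, here's a rough worked example in Python. The numbers are idealized for illustration, not Cambium's actual frame layout:

```python
# Idealized TDD frame arithmetic (not the actual ePMP frame layout).
# With a fixed frame, a packet arriving at a random moment waits for
# the next transmit slot in its direction.

FRAME_MS = 2.5     # frame period
DL_RATIO = 0.75    # 75/25 downlink/uplink split

avg_wait = FRAME_MS / 2                 # average wait for the next slot
worst_wait_ul = FRAME_MS * DL_RATIO     # uplink packet just missing its slot

print(f"average added wait per direction: ~{avg_wait:.2f} ms")
print(f"worst-case uplink wait:           ~{worst_wait_ul:.2f} ms")
print(f"round trip adds ~{2 * avg_wait:.1f} ms on average, but the wait "
      f"is bounded by the frame period, hence the consistency")
```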

Retransmission is about 10% on the 40 MHz channel. It was lower (3%) on the 20 MHz channel, but I need the extra bandwidth to reduce ping spikes.

ePTP doesn't expose those settings: you can't limit the MCS, you can't set the frame, you can't change anything.

I'm sure an MCS limit could help with the ping spikes.


I tried TDD PTP with 75/25 and with the Flexible frame, but the ping is 8-13 ms, and unfortunately that's too high for me (there are also other hops)...

10% is too high, in my opinion, for a backhaul link. If I were in your shoes I'd be moving channels and keeping the TDD frame. You're only 3 ms higher than the numbers you posted before, and TDD would hold better stability and give you the option to force the modulation down.

The load you posted, and the climb in pings relative to throughput, is what I would expect in a 20 MHz channel. The 40 MHz width will still climb because of your payload: your frames are getting more utilized. Defined, stable latency is one of the intentional purposes of TDD mode. ePTP mode doesn't have as tight control, from my understanding of it, but it's still better than a Wi-Fi-based system.

Another option you could attempt, if you're dead set on not moving your backhauls to TDD, is using a second set of radios to make your link full duplex, leaving the ePTP frames fully loaded in one direction. Routers can direct your traffic down one path and up the other. The downside: you've got two paths, and if either one fails, you're down hard.

Another method would be to ring around your backhaul path and bridge the furthest link with OSPF or RSTP, doubling your throughput that way while adding redundancy. (Much better, but only possible if you can ring your links.)

Has a customer noticed this, or are you just seeing that your pings are a little elevated at night?

I'd be more worried about the experience a customer has than about the equipment not responding to pings quickly. (Ping through the link instead of pinging the link itself.)

Thank you for your suggestion!

I never thought a double link could lower ping times, and I haven't tried that. If you use a link only to transmit data, even the TCP ACKs return on the other link, so you should get a super stable ping... I should try it!

About this link: I managed to find a better channel, and now I'm seeing 2% retransmission on the 40 MHz channel.
Ping times are more stable even at 70% load (I generated traffic to test it).

I have a customer I migrated from MikroTik (it was Nv2) to Cambium ePMP.

He's a hardcore gamer who constantly checks ping times and jitter to gaming servers, and he's complaining that ping times are higher than on the "old" antenna system.

With MikroTik he had spikes from 1 ms up to 20 ms, but since the AP wasn't heavily loaded, he had an average ping time of 2-3 ms to the AP.

Now on Cambium ePMP he gets 8-10 ms to the AP, so I wanted to reduce the backhaul ping times to try to make him happy.

I've now reduced the jitter on the backhaul, but he's still complaining about ping times. I don't think there's anything else I can do, and honestly I can't lose my mind over a single customer who noticed a ping increase, but he's asking me to put the old antenna back, which sounds very strange to me: he's the only customer who prefers the "old" technology!

Customers are so insatiable :-) 

I had a client complaining about a 3 ms increase on his connection. I did everything I could think of to reduce the ping times; all I had done was move him from a Canopy FSK 900 MHz system to the ePMP 1000 system before the complaints started, and the average latency to my edge is the same, even though the docs quote 7 ms for the FSK hardware. I even run an iperf server at each tower to allow testing of each location from basically anywhere in the network.

The problem was not on my network (though I am also not saying your issue is exactly the same) but on the transit network upstream of mine. I still can't prove it to this client, but after profiling his connection many times the evidence is fairly irrefutable. I do experience the sudden, hard-to-catch-in-the-act slowdowns that happen in every network, but they happen so fast that even live netflows have a hard time seeing them; the only indicator they even happened is my port queues sitting near full and slowly catching up.
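If you want to script that kind of per-tower testing, a minimal sketch along these lines works (it assumes iperf3 is installed and a server is already running at each tower; the names and IPs are placeholders):

```python
#!/usr/bin/env python3
# Minimal per-tower throughput check. Assumes `iperf3 -s` is already
# running at each address; tower names and IPs are placeholders.
import json
import subprocess

TOWERS = {"tower-a": "10.0.1.1", "tower-b": "10.0.2.1"}

for name, ip in TOWERS.items():
    result = subprocess.run(
        ["iperf3", "-c", ip, "-t", "5", "-J"],  # 5-second test, JSON output
        capture_output=True, text=True,
    )
    if result.returncode != 0:
        print(f"{name}: test failed")
        continue
    data = json.loads(result.stdout)
    mbps = data["end"]["sum_received"]["bits_per_second"] / 1e6
    print(f"{name}: {mbps:.1f} Mbps")
```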

Think of this as an opportunity: you can charge this client more for forcing you to support unsupported equipment on a link. If you don't own the tower, you can explain that you upgraded to be able to supply more bandwidth in more packages, that keeping the older system will cost you, the operator, more for that location, and that as the sole client requiring this system, the burden will be passed on to him. A minor difference in latency will always give way to a personal wallet when the monthly numbers start to get stupid.

If you own the tower, then the only real argument you can make is that the hardware has no warranty, and that if it fails, the client can expect issues as the equipment ages.

I still have my FSK 900 system, as there is no real reason to move the SCADA equipment off it, but for the majority of new deployments I only offer the current system on my ePMP network, unless it's a new SCADA site where it doesn't make sense to tie up a slot on the ePMP network for such low bandwidth requirements.


I really appreciate your feedback.

I have powerful routers on each tower (MikroTik CCRs), so I can always check whether there is a bandwidth issue or jitter (they have a bandwidth test tool, traceroute, ping, etc.).

In my situation the client noticed the ping increase and jitter during online gaming, and unfortunately the issue is caused mainly by this ePTP link I was talking about.

Before I switched him to ePMP he didn't notice, because the old AP was lightly loaded and had better ping performance than ePMP. He wanted an upgrade from the 10 Mbps we were offering on MikroTik to the 30 Mbps we offer on Cambium.

He has a rock-stable 30 Mbps profile, but he's sad about the "lost milliseconds" :-)

I have to remove the old equipment in this case: I don't own the tower and only had a small number of users on it, so I can't leave just him there, and I couldn't offer 30 Mbps on it anyway.

I talked with him and tried to explain the reason, and he seemed to understand. But after all this, I have a request for Cambium:

Why don't you offer an option to reduce latency on TDD, even at the cost of capacity, or by disabling GPS sync?

In some situations it could be very useful.

For example, on the PMP 450i (we have some towers for high-end business profiles) I see an average ping of 6 ms from the SM to the tower, with only 1 ms of jitter. That AP has a few users, but on ePMP you can't reach those results even with Flexible, 2.5 ms frames, one user, and all-MCS15 traffic.

Keep in mind the silicon used to make these different products. The Atheros-style chipsets in the ePMP hardware were not designed for this kind of abuse (and yes, they are being abused). The PMP 450 series uses a chipset specialized for this kind of work and load, with lower millisecond ratings (the PMP 450 is capable of a 10 km link at 6 ms round trip; the ePMP is 7 ms one way, or 15 ms averaged, as there is a slight delay in the turnaround for some reason). The older PTP 400 series used something similar to an ASIC to move packets without involving the CPU any more than required. The old PMP 100 was similar but not quite; it's closer to the Wi-Fi radio idea, mostly limited in the number of frequencies used to build the channel. The PMP 450 and ePMP use much larger channels and better encoding schemes to pass the bandwidth.

I am sure the Cambium guys will correct this, as it is their expertise, and I haven't touched the inner workings since spending time in Schaumburg with the old crew, five years before the split from Motorola. Most of my new-hardware info is what's publicly available plus what I have picked up from various off-forum sources, so some of it is probably not quite correct.

In my experience, if you have more than 100 STAs on an AP you will notice an increase in latency, as the AP has hit 50% processing power. Since these are basically the same hardware as a home Wi-Fi router, just with different software, at 50% (though not topped out) they need more time to do things, which adds latency. Others here will have different experiences with this line, but this is what I have noticed, and I now strive to stay below the 45% mark for the number of STAs per AP.

I have found dual-AP antennas designed to minimize tower costs: KP Performance in Alberta makes a dual 60-degree sector that allows two APs to share an antenna backplane with separate radiating elements. They even make dual-frequency sectors, but those only give 90-degree coverage. My loaded sectors are approaching the point where I need these antennas just to keep per-AP use down. If you use a CMM or a PacketFlux timing system, it's fairly painless to add APs in this manner, and it doubles the capacity available.

Not that this is necessarily the way you should go, but it is one way to keep packet-processing latency down, and it also means a higher per-tower subscriber ratio. Depending on where you are in the world (I am mostly rural, with some towns covered), dual-radio sectors start to look good on the side of the tower where STA density is greater. Make sure you plan your frequencies to account for front-to-back reuse. The dual-sector APs can be on the same channel if the SSIDs are different and the timing settings are 100% identical; use of a common timing system is strongly suggested.

You can reduce the ping on the ePMP with 2.5 ms frames. You lose about 10% of your speed in exchange for those few milliseconds. The latency isn't caused by sync; it's caused by the MAC. The ePMP MAC works similarly in principle to other TDD systems. The overall system gains are enormous.
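That ~10% figure is what you'd expect if each frame carries a fixed overhead (guard intervals, preambles, scheduling), so halving the frame period doubles the overhead's share of airtime. A back-of-the-envelope sketch, with the overhead value assumed for illustration rather than taken from any Cambium spec:

```python
# Back-of-the-envelope: why halving the frame period costs throughput.
# OVERHEAD_MS is an assumed fixed per-frame cost, NOT a Cambium figure.
OVERHEAD_MS = 0.45

eff = {f: (f - OVERHEAD_MS) / f for f in (5.0, 2.5)}
for frame_ms, e in eff.items():
    print(f"{frame_ms} ms frame: ~{e:.0%} of airtime carries payload")

loss = 1 - eff[2.5] / eff[5.0]
print(f"relative throughput cost of 2.5 ms frames: ~{loss:.0%}")
```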

With gamers complaining about pings, we tell them that under 100 ms is our network goal and leave it at that. Realistically, 3 ms has zero impact on their experience. We use such a wide number to keep them from bothering us about 3 ms, a span of time 99% of them have zero concept of.

If by MAC you are referring to the digital converter used to provide the input to the radio-frequency encoder, then I would agree that it is a serious source of additional latency. These converters work similarly to the RAMDAC in your video card when it's not using a digital connection: since radio is not digital in how the ePMP system uses it, a converter is required, and these converters are set to run at a speed that provides stable, error-free operation rather than full potential. The latency adder I found to be a more serious offender is the multiple-access scheduler, which is software-based and runs on the CPU and in RAM. Pinging an AP with no SMs yields very low, sub-10 ms pings, but adding one SM pushed the average to 12 ms, and at 10 SMs the AP averaged 19 ms from my desk.

I just put up a new link, and all it has on it is a c3825 router. Running RIP and IS-IS (yes, we still use RIP), pings average 22 ms on an 8-mile link. Adding a fairly large data stream pushed the link to an average of 38 ms while the stream was running. A link to another tower averages 40 ms, with round trips as low as 14 ms and as high as 72 ms; average data over that link is a continuous 20 Mbps minimum.

The best way I have found to deal with AP-to-SM latency issues is to check the load on the AP and, if it's over 100 SMs, subdivide the sector. Having a policy of 100 ms or less is a good way to discourage complaints that are in some cases less valid than they are made out to be, and to prevent complaints that are totally invalid. E.g., a client's own VoIP sessions drop a lot of packets, and the client calls in with a speed-test screenshot to argue that increased latency is the cause; that claim can be invalidated if the ping time is under 100 ms. For the record, VoIP can handle quite large latency as long as jitter is held close to constant.

MAC = media access control

It's the portion of the link layer responsible for organizing the information streams entering and leaving the radio's wireless interface. Coding happens when the MAC instructs the transceiver to do so.

The 100 ms standard stems from the many aspects we can't control. Generally customers reach our fiber exchange in 40 ms or less; after that, we can't control what happens anymore. 0% packet loss is the only acceptable loss standard for us.

ePMP gives VoIP preferred network time (provided the AP is configured correctly) and should prevent dropped packets. Latency swings on VoIP can be handled with jitter buffers, so large swings in jitter aren't a problem. With the right tweaks it can even work over satellite (Viasat offers VoIP now).
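For anyone curious how a jitter buffer absorbs those swings, here's a toy sketch of a fixed-delay playout buffer (the arrival times are made-up example data; real implementations adapt the depth dynamically):

```python
# Toy fixed-delay jitter buffer: packets are played out on a fixed
# schedule BUFFER_MS after they were sent, so variable network delay
# disappears as long as it stays under the buffer depth.
BUFFER_MS = 60  # playout delay; must exceed the worst expected jitter

# (sent_ms, arrived_ms) for a 20 ms-interval voice stream, made-up data
packets = [(0, 35), (20, 42), (40, 110), (60, 78), (80, 95)]

for sent, arrived in packets:
    playout = sent + BUFFER_MS
    if arrived <= playout:
        print(f"packet sent@{sent:3d} ms plays at {playout:3d} ms (on time)")
    else:
        print(f"packet sent@{sent:3d} ms arrived {arrived - playout} ms late -> dropped")
```

A deeper buffer tolerates bigger swings but adds that much latency to every packet, which is the tradeoff.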

Media access control is a low-level protocol that dictates the format of low-level communications. In Ethernet the protocol is Ethernet itself, and it uses the CSMA/CD scheme. Your home Wi-Fi uses a hybrid of Ethernet with additional format data to handle the radio medium. With fibre you have a plethora of formats and protocols to choose from; most stick with Ethernet, though SONET is still used, and as we develop ways of signaling more data per fibre strand, we develop better formats and protocols to handle the medium.

That is why I sought clarification: the MAC layer is still several steps away from the power amplifier, and each step adds a bit of latency. The major sources of latency are fairly easy to locate but almost always hard to reduce by any significant amount, especially when you consider the physical chipset used in this product.

I am actually glad to hear your network's policy is 0%. That is a fairly lofty goal and very hard to actually achieve; there is always some packet loss as a network grows and is utilized up to its acceptable operating ratio. I am not trying to pick a fight with you; you have evinced that you know what you are doing.

Jitter buffers are a nice way to smooth problems over, but try keeping one full on a network that routinely sees 100 Mbps full-duplex links hit 92%. Some loss happens and buffers allow for it; it is inevitable, but we can control it to keep losses to an absolute minimum. I have yet to see an ISP network with less than 0.1% known packet loss, mine included.

For my network, I don't worry about what I can't control. I control everything up to my edge, and I have a contract for upstream service backed by a service-level agreement that I track and enforce. I ensure serviceability by keeping enough bandwidth from multiple upstream providers and by peering with networks that meet our very selective peering policy. This ensures my clients have no reason to go to the competition. We strive for 100% contentment of our clients; a happy client is a long-term client. I think 100 ms to get off my network is still reasonable, though we are always looking for ways to reduce it to the bare minimum, so that our clients not only have a high-speed connection but one that feels fast too.

We can't even begin to compare our networks, as they are very different: from our choice of hardware and routing protocols, to how we time our APs and select bandwidth ratios, to the business policy that dictates network policy. It is an apples-to-bananas comparison.