20.2.2 BUG with the NAT table?

Hi,

we upgraded hundreds of SMs to 20.2.2.
On ALL of them with NAT enabled, the NAT table now fills up completely within a few hours. Can anyone confirm?
It seems the session timeout is not respected and the SM keeps those sessions open.

This appeared on every SM after upgrading from 20.2.1 to 20.2.2.

Strange! From 20.2.1 to 20.2.2 the only change in this area was the frequency of logging to flash when there was a Translation Table alloc failure. We will see if we can reproduce it and let you know. In the meantime, could you send me an engineering capture from the SM?

I just verified in the lab. The entries get deleted from the NAT table when they time out.
Could you please share a couple of engineering captures from the SM:

  1. One taken shortly before the sessions are due to time out, and
  2. One taken after the sessions should have timed out and been deleted from the table, but were not.

Thanks for your help!

Hi,

we just sent our files to @Charlie.
We checked tens of SMs this morning and ALL OF THEM show the same behaviour.


Thank you very much!
I’ll follow up with Charlie.

We confirmed the problem. It was caused by another change that was made in 20.2.2 to avoid crashes with DSCP. What happens is that once there is a free entry in the NAT table, later entries never have their timeout decremented. Once the NAT table is 100% full, all the timeouts start decrementing again. You will see a periodic graph of NAT entries increasing, then decreasing, then increasing until full again, and so on.
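For anyone trying to picture why the graph looks periodic, here is a toy Python model of the behaviour described above. It is only a sketch and not the SM firmware: it assumes "later entries" means entries stored after the first free slot, it assumes the aging pass simply stops at that slot, and the names `age_table`, `insert_session` and `simulate`, the table size, the timeout and the "highest free slot first" placement are all made up to keep the example small and deterministic.

```python
from typing import List, Optional

# Each slot holds the remaining TTL in ticks, or None if the slot is free.
Table = List[Optional[int]]


def age_table(table: Table, buggy: bool) -> None:
    """Run one aging pass, decrementing the TTL of the entries it visits."""
    for i, ttl in enumerate(table):
        if ttl is None:
            if buggy:
                break  # ASSUMED bug: the pass stops at the first free slot
            continue   # intended behaviour: skip the free slot, keep going
        # Expire the entry when its TTL runs out, otherwise count it down.
        table[i] = ttl - 1 if ttl > 1 else None


def insert_session(table: Table, timeout: int) -> bool:
    """Put a new session in the highest free slot; False if the table is full."""
    for i in range(len(table) - 1, -1, -1):
        if table[i] is None:
            table[i] = timeout
            return True
    return False  # allocation failure, the new session is dropped


def simulate(size: int, timeout: int, ticks: int, buggy: bool) -> List[int]:
    """Return the table occupancy after each tick, one new session per tick."""
    table: Table = [None] * size
    history = []
    for _ in range(ticks):
        age_table(table, buggy)
        insert_session(table, timeout)
        history.append(sum(ttl is not None for ttl in table))
    return history


if __name__ == "__main__":
    print("fixed:", simulate(size=8, timeout=3, ticks=22, buggy=False))
    print("buggy:", simulate(size=8, timeout=3, ticks=22, buggy=True))
```

With correct aging the occupancy settles at the number of sessions that can be alive within one timeout. With the assumed bug the table climbs until it is full, only then starts aging, empties, and climbs again, which is the sawtooth pattern described above.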

If you have NAT SMs you should not run 20.2.2. We will fix this ASAP.

If you have a MicroPoP Connectorized AP you need to run 20.2.2. Everything else should stay on 20.2.1.

Thanks to @MW_WISP for finding and reporting this so quickly.


Do we know if this got fixed in the 21.* release?

Yes, this issue (internal tracking CPY-17395) was fixed in release 20.2.2.1. The fix was then carried into 20.3 and every release after, including 21.X.
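If you need to check a large fleet, the release information in this thread boils down to "NAT enabled and running exactly 20.2.2". A minimal sketch in Python, assuming the version is available as a plain dotted string such as "20.2.2" (the helper names `parse_version` and `nat_aging_bug_affected` are hypothetical, and how you collect the version and NAT setting from each SM is up to your own tooling):

```python
from typing import Tuple

# The only release this thread reports as affected.
AFFECTED_RELEASE = (20, 2, 2)


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted numeric version string like '20.2.2.1' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def nat_aging_bug_affected(version: str, nat_enabled: bool) -> bool:
    """True if an SM matches the affected case: NAT enabled on exactly 20.2.2."""
    return nat_enabled and parse_version(version) == AFFECTED_RELEASE


if __name__ == "__main__":
    for version in ("20.2.1", "20.2.2", "20.2.2.1", "20.3"):
        print(version, nat_aging_bug_affected(version, nat_enabled=True))
```

Comparing for equality rather than "less than the fixed release" is deliberate, since 20.2.1 and earlier did not have the DSCP change that introduced the problem.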
