Out Of Resources Count

I have a few customers that seem to lose their connection, though we maintain connectivity to their radio. They are NATed and can ping the radio but not the Internet when it happens. It usually restores itself in a few minutes. I noticed today on the NAT stats page that there were Out of Resources counts for both public NAT and private NAT.

What do those out of resources counts usually mean?

Just now I adjusted the protocol filters and set the broadcast/multicast rate to 1 to see if it helps. About 5 minutes after the reboot, I have the following:

Private NAT:
Out Of Resources Count : 1980

Public NAT:
Out Of Resources Count : 784

I’m running 9.0 on this unit, and it’s a P10.

Thanks.

Are these customers running P2P or something else that generates a lot of flows? I’ve seen a number of NAT or connection-tracking implementations that do exactly this when there are a ton of connections and they’re trying to keep track of them all. For example, Linux’s ip_conntrack remembers stale TCP connections for a week or so.
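
For what it’s worth, if there’s a Linux box anywhere in the path, you can get a rough feel for how many flows a single host is generating by tallying its conntrack table. A minimal Python sketch, assuming ip_conntrack is loaded (the /proc filename varies by kernel; newer ones use nf_conntrack):

```python
# Tally tracked connections per internal source IP to spot flow-heavy hosts.
# Assumes a Linux router with connection tracking enabled; the path is
# /proc/net/nf_conntrack on newer kernels.
from collections import Counter

counts = Counter()
with open("/proc/net/ip_conntrack") as f:
    for line in f:
        for field in line.split():
            if field.startswith("src="):  # first src= is the original direction
                counts[field[4:]] += 1
                break

for ip, n in counts.most_common(10):
    print(f"{ip:15}  {n} tracked connections")
```

A single host with hundreds of entries here is a strong hint it’s the one filling the NAT table.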

craigeb78 wrote:

What do those out of resources counts usually mean?


I think it’s related to too many NAT sessions being requested by the clients.
Maybe they’re using P2P applications?

I checked under the log section, and it shows the NAT table. I’ve seen this person’s NAT table go from 50 to 1000 used entries pretty rapidly, so I’m assuming this is causing the problem, though I’ve never seen it run out of free entries.

Can anyone help me decipher this output? The Port columns here don’t seem to correspond to actual TCP/UDP ports; they look like incrementing numbers in both columns. On normal hardware I would use this output to try to determine what traffic is causing the overflow, but without accurate port info it doesn’t seem very helpful.

See screenshot.

Very odd that there is no destination port listed. The first pair (10.0.0.2:3418) is the pre-NAT information; the second pair (209.213.172.11:12034) is post-NAT.

Hints that this is P2P are, aside from the sheer number of connections, the pattern in which the IPs appear and where the IPs are. Notice how 208.87.242.119 appears in that list - as chunks of data are completed, most clients will request new chunks from existing peers if those peers have a good upload rate. The other hint is reverse DNS or the netblock. If they all turn out to be residential/dynamic IPs, it’s a pretty safe bet that someone’s running P2P.
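
If you want to test the reverse DNS angle quickly, a short Python sketch like this works (the addresses below are placeholders; substitute the destination IPs from your NAT table page):

```python
# Look up PTR records for destination IPs pulled from the NAT table.
# Names containing 'dsl', 'cable', 'dyn', 'pool', etc. suggest residential
# peers, which fits the P2P theory. The IPs here are placeholders.
import socket

suspect_ips = ["208.87.242.119", "209.213.172.11"]

for ip in suspect_ips:
    try:
        name = socket.gethostbyaddr(ip)[0]
    except socket.herror:
        name = "(no PTR record)"
    print(f"{ip:16} {name}")
```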

Do you think that reducing the TCP/UDP garbage timeouts will help?

My TCP garbage timeout is at 120 minutes. My UDP is at 4 minutes.

I would think the Canopy NAT implementation should be able to handle anything a Linksys NAT implementation could handle.

Thanks.

Try 10 minutes on TCP - 120 minutes is a pretty long time to leave a TCP connection open for an application with no keep-alive. I might go as low as 2 minutes.
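
To put some numbers on it: in steady state, the table holds roughly new-flows-per-minute times the timeout (Little’s law). A back-of-the-envelope sketch, with made-up flow rates and a 1000-entry table assumed from the size mentioned earlier in the thread:

```python
# Steady-state NAT table occupancy ~= flow arrival rate * entry timeout
# (Little's law). The flow rates are hypothetical; 1000 entries is assumed
# from the table size mentioned earlier in the thread.
TABLE_SIZE = 1000

for timeout_min in (120, 10, 2):      # TCP garbage timeout
    for flows_per_min in (5, 20):     # hypothetical new flows per minute
        entries = flows_per_min * timeout_min
        status = "OVERFLOW" if entries > TABLE_SIZE else "ok"
        print(f"timeout={timeout_min:3} min, {flows_per_min:2} flows/min "
              f"-> ~{entries:5} entries ({status})")
```

Even a modest 20 new flows per minute keeps about 2400 entries alive at a 120-minute timeout, but only about 200 at 10 minutes.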

Unfortunately, the Logs page doesn’t show the NAT destination port, which is in fact the most important information. :(
Anyway, the number of distinct destination IPs suggests that he’s running P2P (or some form of virus/worm)…

craigeb78 wrote:
Do you think that reducing the TCP/UDP garbage timeouts will help?

My TCP garbage timeout is at 120 minutes. My UDP is at 4 minutes.

I would think the Canopy NAT implementation should be able to handle anything a Linksys NAT implementation could handle.

Thanks.


Yes, lowering those values will cause the connections to time out more quickly.


Another thing worth noting: I've seen this behavior when the client computer has a virus/worm infection. Pre-SP2 machines, with no TCP connection limit, will hammer something until the machine itself runs out of resources.