8.2.7 = UNSTABLE / MAJOR MEMORY LEAK

We have been encountering issues since upgrading to Canopy OS 8.2.7.

We started seeing issues whereby customers were complaining of very slow throughput. After investigating the matter, we discovered a possible bug (memory leak) that renders the AP almost unresponsive. It is still pingable, but depending on how long the issue has been draining the resources on the AP, you are presented with one of the following scenarios when you log into the AP (directly via web browser, regardless of browser type):

1. You are able to log in and all appears as normal. You see the SM count on the main page, for example 75, but when you click on the “SESSION STATUS” tab, you are presented with a BLANK page as if no sessions are active. HOWEVER, when you click on the “REMOTE SUBSCRIBER” tab, you are presented with a list of all 75 SMs that are actively registered to the AP.

2. If scenario #1 has been ongoing for a while, you are able to log into the AP, but when you click on any of the links in the AP’s menu, you are presented with a blank HTML page with the words “OUT OF RESOURCES” at the top.
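For anyone who wants to check a lot of APs for the two scenarios above, here is a rough Python sketch of the decision logic. It is only an illustration: the function names are made up, it assumes you have already fetched the pages and read the SM count yourself by whatever means you use, and it assumes a "blank" Session Status page contains no visible text at all. Real pages may still carry menu markup, so the blank check would need adjusting.

```python
import re


def visible_text(html):
    """Strip scripts, styles, and tags; return the text a user would see."""
    no_scripts = re.sub(r"(?is)<(script|style)\b.*?</\1>", " ", html)
    no_tags = re.sub(r"(?s)<[^>]*>", " ", no_scripts)
    return " ".join(no_tags.split())


def classify_ap_state(sm_count, session_status_html, menu_page_html):
    """Map the symptoms described in this thread to a label.

    sm_count            -- SM count shown on the AP's main page (assumed known)
    session_status_html -- body of the Session Status page (assumed fetched)
    menu_page_html      -- body of any other page reached from the menu
    """
    # Scenario 2: any menu page comes back with "OUT OF RESOURCES".
    if "OUT OF RESOURCES" in visible_text(menu_page_html).upper():
        return "out_of_resources"
    # Scenario 1: SMs are registered but Session Status shows nothing.
    if sm_count > 0 and not visible_text(session_status_html):
        return "blank_session_status"
    return "normal"


if __name__ == "__main__":
    print(classify_ap_state(75, "<html><body></body></html>", "<p>Home</p>"))
```

Scenario 2 is checked first because, per the description, it is the later stage of the same problem.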

We have also noted “out of heap” errors in the logs of the APs:

17:22:24 UT : 08/09/08 : File src/osportucos_ii.c : Line 58 calloc1 failed, out of heap, size 114688
17:22:24 UT : 08/09/08 : File webportglu.c : Line 99 Error writing to file system. size: 110260, rc: 0, error: -20
17:22:34 UT : 08/09/08 : File src/osportucos_ii.c : Line 58 calloc1 failed, out of heap, size 114688
17:22:34 UT : 08/09/08 : File webportglu.c : Line 99 Error writing to file system. size: 110260, rc: 0, error: -20
17:22:39 UT : 08/09/08 : File src/osportucos_ii.c : Line 58 calloc1 failed, out of heap, size 114688
17:22:39 UT : 08/09/08 : File webportglu.c : Line 99 Error writing to file system. size: 110261, rc: 0, error: -20
17:23:00 UT : 08/09/08 : File src/osportucos_ii.c : Line 58 calloc1 failed, out of heap, size 114688
17:23:00 UT : 08/09/08 : File webportglu.c : Line 99 Error writing to file system. size: 110260, rc: 0, error: -20
17:23:44 UT : 08/09/08 : File src/osportucos_ii.c : Line 58 calloc1 failed, out of heap, size 114688
17:23:44 UT : 08/09/08 : File webportglu.c : Line 99 Error writing to file system. size: 110260, rc: 0, error: -20
17:24:03 UT : 08/09/08 : File src/osportucos_ii.c : Line 58 calloc1 failed, out of heap, size 114688
17:24:03 UT : 08/09/08 : File webportglu.c : Line 99 Error writing to file system. size: 110260, rc: 0, error: -20
17:32:57 UT : 08/09/08 : File src/osportucos_ii.c : Line 58 calloc1 failed, out of heap, size 114688
17:32:57 UT : 08/09/08 : File webportglu.c : Line 99 Error writing to file system. size: 110261, rc: 0, error: -20
17:33:01 UT : 08/09/08 : File src/osportucos_ii.c : Line 58 calloc1 failed, out of heap, size 114688
17:33:01 UT : 08/09/08 : File webportglu.c : Line 99 Error writing to file system. size: 110260, rc: 0, error: -20
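If you want to check your own AP logs for the same pattern, a small Python sketch along these lines can count the "out of heap" entries. It assumes you have already copied the event log text out of the AP, and that your log lines follow the same layout as the ones quoted above. The function name and the returned dictionary are just illustrative. Dates are kept as plain strings because the log does not say whether 08/09/08 is month/day or day/month.

```python
import re
from collections import Counter

# Layout taken from the log lines quoted in this post:
# "HH:MM:SS UT : NN/NN/NN : File <path> : Line <n> <message>"
LOG_LINE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}) UT : (?P<date>\d{2}/\d{2}/\d{2}) : "
    r"File (?P<file>\S+) : Line (?P<line>\d+) (?P<message>.*)$"
)


def summarize_heap_failures(log_text):
    """Count 'out of heap' entries and tally the requested allocation sizes."""
    count = 0
    sizes = Counter()
    first_seen = None
    last_seen = None
    for raw in log_text.splitlines():
        match = LOG_LINE.match(raw.strip())
        if not match or "out of heap" not in match.group("message"):
            continue
        count += 1
        size = re.search(r"size (\d+)", match.group("message"))
        if size:
            sizes[int(size.group(1))] += 1
        stamp = (match.group("date"), match.group("time"))
        if first_seen is None:
            first_seen = stamp
        last_seen = stamp
    return {"count": count, "sizes": dict(sizes),
            "first": first_seen, "last": last_seen}


if __name__ == "__main__":
    sample = (
        "17:22:24 UT : 08/09/08 : File src/osportucos_ii.c : Line 58 "
        "calloc1 failed, out of heap, size 114688\n"
        "17:22:24 UT : 08/09/08 : File webportglu.c : Line 99 "
        "Error writing to file system. size: 110260, rc: 0, error: -20\n"
    )
    print(summarize_heap_failures(sample))
```

A steadily growing count between reboots would be consistent with the leak described here, though it does not prove the cause.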

We have more than 100 APs in our network and have been working with Canopy since 2003. We have seen this on close to 20 different APs in our network!

We have escalated this issue to Canopy Support, but they are very slow to admit that this is in fact a SERIOUS BUG in 8.2.7.

IF YOU UPGRADED TO 8.2.7, THEN PLEASE CHECK YOUR APs, ETC. CAREFULLY FOR THE SCENARIOS DESCRIBED ABOVE. BACK to 8.2.4 for us!

We have seen the same thing on several 8.2.7 APs as well. Sometimes a remote reboot is required to even access the AP. You are not alone.

Are you seeing this on both P9 and P10 or is it limited to a particular hardware series?

Also, is this bug only affecting APs or even SMs?

This is affecting both P9 and P10 APs

I haven’t seen it (yet) on any of my APs but thanks for the heads up!

Are you seeing it on a particular frequency band or is it across the board?

We have seen this across the board (2.4 GHz and 5.2 GHz).

I haven’t had any problems. Were you previously on any 8.* release?

Thanks
Vince

8.2.4

I have not seen any issue with this on my 5.7 gear…

I’ve had most of my 900MHz system on 8.2.7 for a few weeks now. I have not experienced this problem…(yet)?

No complaints here.

Yep, I have a single AP site on 8.2.7 for the last few weeks or so. Haven’t seen any issues yet. Still very apprehensive to push it to all the others…

Please keep us up to date on any new developments, netmeister.

We did not see this until recently. The firmware was pushed to these AP units not long after the 8.2.7 release date. The first few weeks had no issue.

We had one AP that acted up this week. I received six complaints from customers within half an hour Monday morning. I rebooted the AP and the SMs were back immediately.

For quite a while now, I have had Prizm reboot all our APs every three days at 5:00 AM. This was done initially to clear out all the re-regs, LUIDs and stats for each SM.
I have never received a complaint from any customers, so I left the automatic reboot active and have been doing so for the last six months.

Well, this one AP that acted up Monday is one that I had replaced a couple of weeks ago and never included in the automatic reboot schedule.

This leads me to believe that periodic reboots may mitigate this issue with the APs until Moto gives us another release.

Reay

Any more feedback on this release? Especially from anyone running a 900MHz system?

gcampbell wrote:
Any more feedback on this release? Especially from anyone running a 900MHz system?


It's been solid for me on both my 900MHz and 2.4GHz systems.

Motorola found the memory leak is caused by the LUID select (when you log in to a customer radio via the AP session page, for instance), so just don’t do that any more than you have to until the next release. Ya, eh?

From what we’ve seen (thousands of SMs on 900 MHz, 2.4 GHz and 5.7 GHz running 8.2.7), this is the best release yet. We don’t log into SMs using LUID, so we haven’t seen this issue.