ePMP 2000 self reboot

Hi! I am running a tower with four ePMP 2000s, and one of them is constantly rebooting.

The last time this happened was a week ago (it was rebooting every 30 minutes). After a power cycle it ran fine for a week, but now I see the issue again. Here is the crash log (firmware is 3.5.1):

O.209>debug crashlog
Time: 1546762557.452708
Modules:        tdd_umac@c0d27000+133c22        tdd_pppoe_ia@c0b63000+30b2      tdd_link_test@c0b54000+100b     tdd_dev@c0afe000+40e92  tdd_spectral@c0a8d000+9da3      tdd_dfs@c0941000+1311e4       tdd_hal@c086d000+a9a00  tdd_acm@c07d5000+1e26   tdd_unblockSA@c07c6000+2b5      tdd_pps@c07b7000+314f   gpslic@c07a6000+46f     tdd_cdf@c0799000+111ctdd_adf@c078a000+26c5    tdd_asf@c077a000+1b93   athrs_gmac@c0734000+9d6f        tdd_netlink_socket@c0456000+20b macvlan@c0449000+1140   nf_conntrack_netlink@c0439000+2820   ebt_ulog@c0428000+1080   ebt_ip@c041b000+520     ebt_arp@c040f000+650    ebt_vlan@c0403000+650   ebt_pkttype@c03f6000+260        ebt_limit@c03ea000+400  ebt_802_3@c03de000+2f0ebtable_nat@c03cc000+3a0        ebtable_filter@c03ba000+3b0     ebtable_broute@c03a8000+300     ebtables@c0399000+3ac5  nfnetlink@c0388000+7df  nf_conntrack_tftp@c036f000+9b0xt_HL@c0355000+560      xt_hl@c0349000+3e0      ipt_ECN@c033d000+580    xt_CLASSIFY@c0331000+240        xt_tcpmss@c0325000+430  xt_statistic@c0319000+380       xt_DSCP@c030d000+5d0  xt_dscp@c0301000+440    xt_quota@c02f5000+340   xt_pkttype@c02e9000+2a0 xt_physdev@c02dd000+590 xt_owner@c02d1000+300   ipt_REDIRECT@c02c5000+2f0       ipt_NETMAP@c02b9000+2f0       ipt_MASQUERADE@c02ad000+440     iptable_nat@c02a1000+9b8        nf_nat@c0292000+2db0    xt_CONNMARK@c0282000+360        xt_recent@c0275000+1600 xt_helper@c0266000+390        xt_conntrack@c025a000+830       xt_connmark@c024d000+2e0        xt_connbytes@c0241000+530       xt_NOTRACK@c0235000+270 iptable_raw@c0229000+2e0        xt_state@c021d000+350 nf_conntrack_ipv4@c0210000+1fb2 nf_defrag_ipv4@c0200000+306     nf_conntrack@c01e9000+a219      pppoe@c01cc000+20c0     pppox@c01bd000+59a      ipt_REJECT@c01b0000+710       xt_TCPMSS@c01a3000+a50  ipt_LOG@c0195000+1110   xt_multiport@c0187000+750       xt_mac@c017b000+2a0     xt_limit@c016f000+440   iptable_mangle@c0163000+430  iptable_filter@c0156000+350      ip_tables@c0147000+2255 xt_tcpudp@c0138000+730  
x_tables@c012a000+278e  ppp_async@c0119000+1960 ppp_generic@c0106000+4b55       slhc@c00f1000+112b    ts_fsm@c00e4000+a80     ts_bm@c00d7000+600      ts_kmp@c00cb000+590     crc_ccitt@c00bb000+42b  cambium_iprst@c00a8000+d00      leds_gpio@c0097000+5c0  button_hotplug@c008a000+a70   gpio_buttons@c007d000+880       input_polldev@c0070000+643      input_core@c005e000+469e
viceId1 0x18
<4>16MB Flash Detected
<4>Set ART offset to 0xFF0000
<5>6 cmdlinepart partitions found on MTD device ath-nor0
<5>Creating 6 MTD partitions on "ath-nor0":
<5>0x000000000000-0x000000040000 : "u-boot"
<5>0x000000040000-0x000000050000 : "u-boot-env"
<5>0x000000050000-0x0000000b0000 : "config"
<5>0x0000000b0000-0x000000850000 : "uImageI"
<5>0x000000850000-0x000000ff0000 : "uImage"
<5>0x000000ff0000-0x000001000000 : "ART"
<4>*** ar8216_init ***
<6>TCP westwood registered
<6>NET: Registered protocol family 17
<5>Bridge firewalling registered
<6>802.1Q VLAN Support v1.8 Ben Greear <greearb@candelatech.com>
<6>All bugs added by David S. Miller <davem@redhat.com>
<4>athwdt_init: Registering WDT success
<4>ath_otp_init: Registering OTP success
<4>ath_clksw_init: Registering Clock Switch Interface success
<4>apPx_spi_init: initialize Cambium New SPI Driver....
<6>Boot loader Board SKU<0xA> Linux Board SKU<0xA>
<6>Using Board SKU<0xA>...
<4>uC Firmware.... 0x20
<4>Entering ath_hw_config_init()
<4>GPIO :22 XLNA MUX with:38
<4>Init Hawkeye GPIOs
<4>Configure LO with GPIOs LO_STROBE:16 SPI_CLK:15 SPI_MOSI:13
<4>[hawk_i2s_enable] enter
<4>Configure FPGA with GPIOs FPGA_SPI_CS:4 CRESET_B:12 SPI_CLK:15 SPI_MOSI:13 SPI_MISO:18
<4>Open file:/etc/fpga_bin/MACHX02.sea /etc/fpga_bin/MACHX02.sed
<4>Checking FPGA ID... OK
<6>FPGA already has latest firmware. Continue boot process
<4>Wrong Beam Steer SKU:0 Turn ON sector antenna OFF BSA
<4>apPx_rst_button_init: platform_device_regsiter: hawkeye_keys_device  status = 0
<6>Freeing unused kernel memory: 25172k freed
<6>Button Hotplug driver version 0.3.1
<4>
<4>gpio_keys_probe: Initialize RST_BUTTON Driver....GPIO(23)
<4>gpio_keys_probe:gpio_keys_isr registered to IRQ = 55
<6>input: gpio-keys as /devices/platform/gpio-keys/input/input0
<4>GPS getVerCnt = 2
<6>PPP generic driver version 2.4.2
<6>ip_tables: (C) 2000-2006 Netfilter Core Team
<6>NET: Registered protocol family 24
<4>nf_conntrack version 0.5.0 (1982 buckets, 7928 max)
<4>CONFIG_NF_CT_ACCT is deprecated and will be removed soon. Please use
<4>nf_conntrack.acct=1 kernel parameter, acct=1 nf_conntrack module option or
<4>sysctl net.netfilter.nf_conntrack_acct=1 to enable it.
<4>nf_nat_ftp: Unknown symbol nf_nat_ftp_hook
<4>Netfilter messages via NETLINK v0.30.
<6>Ebtables v2.0 registered
<4>ctnetlink v0.93: registering with nfnetlink.
<4>tdd_netlink_socket: module license 'Proprietary' taints kernel.
<4>Disabling lock debugging due to kernel taint
<4>ATHR_GMAC: Length per segment 1722
<4>ATHR_GMAC: fifo cfg 3 01f00140
<4>ATHR_GMAC: mac number:0
<4>ATHR_GMAC: RX TASKLET - Pkts per Intr:64
<4>ATHR_GMAC: Mac address for unit 0:bfff0000
<4>ATHR_GMAC: 00:04:56:d5:07:65
<4>ATHR_GMAC: Max segments per packet :   1
<4>ATHR_GMAC: Max tx descriptor count :   128
<4>ATHR_GMAC: Max rx descriptor count :   128
<4>ATHR_GMAC: Mac capability flags    :   2381
<4>athr_gmac_ring_alloc Allocated 2048 at 0x859f4000
<4>athr_gmac_ring_alloc Allocated 2048 at 0x859f4800
<4>SCORPION  ----> AR8035 PHY
<4>Setting Drop CRC Errors, Pause Frames and Length Error frames
<6>camb_mii_bus: probed
<4>mac:0 Registering AR8035...
<4>Setting PHY...0
<4>Wait for Autoneg to complete
<4>AR8035_PHY: Port 0, Neg Success
<4>AR8035_PHY: unit 0 phy addr 0 checking for the az feature of 8035...
<4>Disabling 802.3az feature...
<4>Restart auto-negotiation
<4>ATHR_GMAC: UNKNOWN intr eth0 isr 0x80000010 imr 0xd8
<6>device eth0 entered promiscuous mode
<4>TXFCTL enabled in Mac:0
<6>tdd_hal: 0.9.17.1 (AR9380, DEBUG, WRITE_EEPROM, 11D)
<6>tdd_dfs: Version 2.0.0
<6>Copyright (c) 2005-2006 Atheros Communications, Inc. All Rights Reserved
<6>tdd_spectral: Version 2.0.0
<6>Copyright (c) 2005-2009 Atheros Communications, Inc. All Rights Reserved
<6>SPECTRAL module built on Dec  5 2017 13:38:20
<6>tdd_dev: Copyright (c) 2001-2007 Atheros Communications, Inc, All Rights Reserved
<6>ath_ahb: 9.2.0_U10.5.13 (Atheros/multi-bss)
<4>AH_CAL_IN_FLASH_AHB defined
<4>__ath_attach: Set global_scn[0]
<4>TxBuf flow control is disabled
<4>hal_conf_parm.calInFlash 1
<4>Bootstrap clock 40MHz
<4>Enterprise mode: 0x40000000
<4>ar9300RadioAttach: Need analog access recipe!!
<4>Restoring Cal data from Flash
<4>Restoring Cal data from Second Radio in Flash
<4>Allow 5.9 channels: cal peer[7]=5950
<7>dfs_attach: use DFS enhancements
<7>dfs_init_radar_filters: Unknown dfs domain 0
<4>ath_get_caps[6044] rx chainmask mismatch actual 3 sc_chainmak 0
<4>ath_get_caps[6019] tx chainmask mismatch actual 3 sc_chainmak 0
<4>tdd_classifier_init: ic=85ac02c0 classifier=85ad842c
<4>ath_descdma_setup: tx DMA: 1024 buffers 1 desc/buf 128 desc_len
<4>ath_descdma_setup: tx DMA map: 85b40000 (135168) -> 5b40000 (135168)
<4>ath_descdma_setup: success, name = tx, nbuf = 1024
<4>ath_descdma_setup: beacon DMA: 8 buffers 1 desc/buf 128 desc_len
<4>ath_descdma_setup: beacon DMA map: a1481000 (4096) -> 1481000 (4096)
<4>ath_descdma_setup: success, name = beacon, nbuf = 8
<4>ath_rx_edma_init: cachelsz 32 rxbufsize 1816
<6>wifi0: Atheros 9557: mem=0xb8100000, irq=2
<4>TXFCTL enabled in Mac:0
<7>dfs_init_radar_filters: Unknown dfs domain 0
<4>camb_debug_print_setup: type=63, counter=3
<4>wlan_vap_create : enter. devhandle=0x85ac02c0, opmode=IEEE80211_M_HOSTAP, flags=0x1
<4>tdd_control_init
<4>tdd_calc_cell_prop_delay: Cell_Radius_Size = 65500 m, prop_delay = 218484 ns
<4>tdd_calc_cell_prop_delay: Cell_Radius_Size = 96540 m, prop_delay = 322022 ns
<4>tdd_calc_frame_params: FrameD=5000 DL=2241 TTRG=478 UL=2241 RTTG=40 CW=599 max_inar_duration=152000 mcs=96540 STXOP_MUL=2
<4>latency control is ON, latency value is 150000000
<4>wlan_vap_create : exit. devhandle=0x85ac02c0, opmode=IEEE80211_M_HOSTAP, flags=0x1.
<4>_register_mc_bypass_hndl: Register MultiCast bypass [eth0] -> [ath0]
<4>ATHR_GMAC:__set_mc_bypass_ath_dev: Setting MultiCast bypass [eth0] -> [ath0]
<4>ATH_MAC_TIMER: enet unit:0 is up...
<4>RGMii 100Mbps full duplex
<4>ATH_MAC_TIMER: done cfg2 0x7115 ifctl 0x10000 miictrl
<6>br-lan: port 1(eth0) entering forwarding state
<4>SWB: CAC time = 67000 MILLIseconds
<4>SWB: NOP time = 1800000 MILLIseconds
<4>50, 30, 20
<4>SWB: validated channel = 186 (5930MHz)
<4>SWB: Primary channel = 186 (5930 MHz)
<4>Set TDD_CLOUD_MGMT: 1
<4>Setting Max Stations:60
<4>unable to setup uplink factorization for GPS mode
<6>PPS Stabilizer is active
<6>Clock drift detector is active
<6>Clock drift compensator is active
<6>Frame Duration Controller is active
<6>1PPS Sync Manager is active
<6>Sync Multiplexor is active
<3>GPS Sync Lost. (22:00:06:449388)
<4>SSID lengh is ZERO do not start AP if no ssid is set
<6>device ath0 entered promiscuous mode
<6>br-lan: port 2(ath0) entering forwarding state
<4> ieee80211_ioctl_siwmode: imr.ifm_active=0x450280, new mode=0x3, valid=1
<4>
<4>Checking for node leak: OK
<6>br-lan: port 2(ath0) entering disabled state
<4>_hbs_starting_tmr: Invalid BSA SKU 0x0
<4>try to stop bss
<4>done
<4>Current channel: 186 (5930 MHz)
<4>try to start bss...
<4>Setting Max Stations:60
<6>TDD Sync Enable called without disable
<4>Flags 0, Scheduler Mode 0
<7>dfs_re_init_radar_filters: Unknown dfs domain 0
<7>dfs_correction_4radar_filters: Unknown dfs domain 0
<4>done
<4>interface is UP
<6>br-lan: port 2(ath0) entering forwarding state
<4>Wait for Autoneg to complete
<4>AR8035_PHY: Port 0, Neg Success
<4>AR8035_PHY: unit 0 phy addr 0 ATH_MAC_TIMER: unit 0: phy 0 not up carrier 1
<6>br-lan: port 1(eth0) entering disabled state
<3>GPS Sync Restored. (22:00:11:774130)
<4>TSF alignment: 1
<4>ATH_MAC_TIMER: enet unit:0 is up...
<4>RGMii 100Mbps full duplex
<4>ATH_MAC_TIMER: done cfg2 0x7115 ifctl 0x10000 miictrl
<6>br-lan: port 1(eth0) entering forwarding state
<4>**** drop_caches_sysctl_handler: all done timer added ...****
<4>[  183.110000] tdd_ratectrl_stat_add: suspicious ARQ statistic, ARQ fn = 33394, RA last FN = 0
<4>[  214.340000] SM[44:d9:e7:de:38:79] aid=23 disassociated. Reason: COMMUNICATION LOST
<4>[  252.450000] SM[78:8a:20:a2:cd:74] aid=11 disassociated. Reason: COMMUNICATION LOST
<7>[  252.460000]       CHECKSUM fail $GPGSV,3,3,12,03,24,273,30,31,20,131,36,10,17,60,27,69,31,249,24*6E    chksum = 4C chksumstr = 4C  len= 68
<4>[  387.850000] SM[78:8a:20:a2:cd:74] aid=11 disassociated. Reason: COMMUNICATION LOST
<4>[  749.720000] SM[78:8a:20:a2:cd:74] aid=11 disassociated. Reason: COMMUNICATION LOST
<4>[  918.890000] SM[78:8a:20:a2:cd:74] aid=11 disassociated. Reason: COMMUNICATION LOST
<4>[ 1037.140000] SM[78:8a:20:a2:cd:74] aid=11 disassociated. Reason: COMMUNICATION LOST
<4>[ 1141.940000] SM[78:8a:20:a2:cd:74] aid=11 disassociated. Reason: COMMUNICATION LOST
<4>[ 1216.510000] SM[78:8a:20:a2:cd:74] aid=11 disassociated. Reason: COMMUNICATION LOST
<4>[ 1602.790000] SM[78:8a:20:a2:cd:74] aid=11 disassociated. Reason: COMMUNICATION LOST
<4>[ 1935.400000] SM[78:8a:20:a2:cd:74] aid=11 disassociated. Reason: COMMUNICATION LOST
<4>[ 2183.760000] SM[78:8a:20:a2:cd:74] aid=11 disassociated. Reason: COMMUNICATION LOST
<4>[ 2309.180000] SM[78:8a:20:a2:cd:74] aid=11 disassociated. Reason: COMMUNICATION LOST
<4>[ 2587.650000] SM[78:8a:20:a2:cd:74] aid=11 disassociated. Reason: COMMUNICATION LOST
<4>[ 3149.090000] SM[78:8a:20:a2:cd:74] aid=11 disassociated. Reason: COMMUNICATION LOST
<4>[ 3230.190000] SM[78:8a:20:a2:cd:74] aid=11 disassociated. Reason: COMMUNICATION LOST
<4>[ 3386.300000] SM[78:8a:20:a2:cd:74] aid=11 disassociated. Reason: COMMUNICATION LOST
<4>[ 3788.390000] SM[78:8a:20:a2:cd:74] aid=11 disassociated. Reason: COMMUNICATION LOST
<4>[ 4028.320000] SM[78:8a:20:a2:cd:74] aid=11 disassociated. Reason: COMMUNICATION LOST
<4>[ 5167.130000] SM[78:8a:20:a2:cd:74] aid=11 disassociated. Reason: COMMUNICATION LOST
<4>[ 5423.990000] SM[78:8a:20:a2:cd:74] aid=11 disassociated. Reason: COMMUNICATION LOST
<4>[ 5585.860000] SM[78:8a:20:a2:cd:74] aid=11 disassociated. Reason: COMMUNICATION LOST
<4>[ 6331.090000] SM[78:8a:20:a2:cd:74] aid=11 disassociated. Reason: COMMUNICATION LOST
<4>[ 6709.330000] SM[78:8a:20:a2:cd:74] aid=11 disassociated. Reason: COMMUNICATION LOST
<4>[ 7500.710000] SM[78:8a:20:a2:cd:74] aid=11 disassociated. Reason: COMMUNICATION LOST
<4>[ 7641.620000] SM[78:8a:20:a2:cd:74] aid=11 disassociated. Reason: COMMUNICATION LOST
<4>[ 7713.750000] SM[78:8a:20:a2:cd:74] aid=11 disassociated. Reason: COMMUNICATION LOST
<4>[ 8197.060000] SM[78:8a:20:a2:cd:74] aid=11 disassociated. Reason: COMMUNICATION LOST
<4>[ 8339.790000] SM[78:8a:20:a2:cd:74] aid=11 disassociated. Reason: COMMUNICATION LOST
<4>[ 9311.910000] SM[78:8a:20:a2:cd:74] aid=11 disassociated. Reason: COMMUNICATION LOST
<4>[ 9738.310000] SM[78:8a:20:a2:cd:74] aid=11 disassociated. Reason: COMMUNICATION LOST
<4>[10490.070000] SM[78:8a:20:a2:cd:74] aid=11 disassociated. Reason: COMMUNICATION LOST
<4>[10936.650000] SM[78:8a:20:a2:cd:74] aid=11 disassociated. Reason: COMMUNICATION LOST
<4>[11099.340000] SM[78:8a:20:a2:cd:74] aid=11 disassociated. Reason: COMMUNICATION LOST
<4>[11183.040000] SM[78:8a:20:a2:cd:74] aid=11 disassociated. Reason: COMMUNICATION LOST
<4>[11764.280000] Unhandled kernel unaligned access[#1]:
<4>[11764.280000] Cpu 0
<4>[11764.280000] $ 0   : 00000000 00000000 85ad02c0 00000006
<4>[11764.280000] $ 4   : 00000000 4f48b8aa 00000000 00000008
<4>[11764.280000] $ 8   : 00000000 8017bcd0 00000001 00000000
<4>[11764.280000] $12   : 00000000 0000003b 00000000 0ee6b286
<4>[11764.280000] $16   : 00000012 00000000 0000001a 85ac02c0
<4>[11764.280000] $20   : 85ace5bc 00000000 00000000 00000000
<4>[11764.280000] $24   : 00000000 80086540
<4>[11764.280000] $28   : 802c2000 802c3c70 00000003 c0d370c0
<4>[11764.280000] Hi    : 0000000b
<4>[11764.280000] Lo    : 00000012
<4>[11764.280000] epc   : c0d36ef8 tdd_maptx_get_max_gpf_duration+0x104/0x3f8 [tdd_umac]
<4>[11764.280000]     Tainted: P
<4>[11764.280000] ra    : c0d370c0 tdd_maptx_get_max_gpf_duration+0x2cc/0x3f8 [tdd_umac]
<4>[11764.280000] Status: 1100ff03    KERNEL EXL IE
<4>[11764.280000] Cause : 00800010
<4>[11764.280000] BadVA : 4f48b9be
<4>[11764.280000] PrId  : 00019750 (MIPS 74Kc)
<4>[11764.280000] Modules linked in: tdd_umac tdd_pppoe_ia(P) tdd_link_test(P) tdd_dev(P) tdd_spectral(P) tdd_dfs(P) tdd_hal(P) tdd_acm(P) tdd_unblockSA(P) tdd_pps(P) gpslic tdd_cdf tdd_adf tdd_asf(P) athrs_gmac tdd_netlink_socket(P) macvlan nf_conntrack_netlink ebt_ulog ebt_ip ebt_arp ebt_vlan ebt_pkttype ebt_limit ebt_802_3 ebtable_nat ebtable_filter ebtable_broute ebtables nfnetlink nf_conntrack_tftp xt_HL xt_hl ipt_ECN xt_CLASSIFY xt_tcpmss xt_statistic xt_DSCP xt_dscp xt_quota xt_pkttype xt_physdev xt_owner ipt_REDIRECT ipt_NETMAP ipt_MASQUERADE iptable_nat nf_nat xt_CONNMARK xt_recent xt_helper xt_conntrack xt_connmark xt_connbytes xt_NOTRACK iptable_raw xt_state nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack pppoe pppox ipt_REJECT xt_TCPMSS ipt_LOG xt_multiport xt_mac xt_limit iptable_mangle iptable_filter ip_tables xt_tcpudp x_tables ppp_async ppp_generic slhc ts_fsm ts_bm ts_kmp crc_ccitt cambium_iprst leds_gpio button_hotplug gpio_buttons input_polldev input_core [last unloaded: nf_nat_tftp]
<4>[11764.280000] Process swapper (pid: 0, threadinfo=802c2000, task=802c4b40, tls=00000000)
<4>[11764.280000] Stack : 85ace4bc 878d2000 000000e0 0000001c 85ace4bc c0d57a20 0023e6d0 c078a4d4
<4>[11764.280000]         802c3ca0 103723ca 0000001c 8009f614 00000fa0 85ac02c0 001243c8 878d2000
<4>[11764.280000]         85ac02c0 00000ab3 00000000 00000000 00000000 c0d290ec 878d2000 85ae02c0
<4>[11764.280000]         802c9814 c0d4ebfc 85ac02c0 00000000 85ae02c0 802c9814 85ac02c0 c0d2a8c8
<4>[11764.280000]         00002df4 10b74ff4 00000001 802c3d08 004c4b40 00386968 001243c8 00386968
<4>[11764.280000]         ...
<4>[11764.280000] Call Trace:
<4>[11764.280000] [<c0d36ef8>] tdd_maptx_get_max_gpf_duration+0x104/0x3f8 [tdd_umac]
<4>[11764.280000] [<c0d290ec>] tdd_scheduler_init_ap+0xf4/0x158 [tdd_umac]
<4>[11764.280000] [<c0d2a8c8>] ath_tdd_scheduler_start_ap+0x1a8/0x17c4 [tdd_umac]
<4>[11764.280000] [<c0b07c40>] ath_tdd_sta_ul_stop_timer_handler+0xd0/0x47c [tdd_dev]
<4>[11764.280000]
<4>[11764.280000]
<4>[11764.280000] Code: 904643f8  10a00005  0064b80b <8ca50114> 24c20001  14a0fffd  304600ff  2cc2000b  14400085
<0>[11764.550000] Kernel panic - not syncing: Fatal exception in interrupt
O.209>
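For anyone who wants to skim a crash log like this one without reading every line, here is a minimal Python sketch. It counts the "disassociated" events per SM MAC address and pulls out the faulting function, its module, and the uptime at the kernel panic. The function name `summarize` and the regular expressions are my own assumptions based on the line formats in this particular log, not a Cambium tool or a documented log format, and the sample string holds only a few lines copied from the log.

```python
import re
from collections import Counter

# A few lines copied from the crash log above; paste the full log here in practice.
SAMPLE = """\
<4>[  214.340000] SM[44:d9:e7:de:38:79] aid=23 disassociated. Reason: COMMUNICATION LOST
<4>[  252.450000] SM[78:8a:20:a2:cd:74] aid=11 disassociated. Reason: COMMUNICATION LOST
<4>[  387.850000] SM[78:8a:20:a2:cd:74] aid=11 disassociated. Reason: COMMUNICATION LOST
<4>[11764.280000] epc   : c0d36ef8 tdd_maptx_get_max_gpf_duration+0x104/0x3f8 [tdd_umac]
<0>[11764.550000] Kernel panic - not syncing: Fatal exception in interrupt
"""

# Patterns are assumptions derived from the lines seen in this log.
DISASSOC = re.compile(r"SM\[([0-9a-f:]{17})\] aid=\d+ disassociated")
EPC = re.compile(r"epc\s*:\s*[0-9a-f]+ (\S+?)\+0x[0-9a-f]+/0x[0-9a-f]+ \[(\w+)\]")
PANIC = re.compile(r"\[\s*(\d+\.\d+)\] Kernel panic")


def summarize(log):
    """Return disassociation counts per SM plus basic facts about the panic."""
    drops = Counter(DISASSOC.findall(log))
    epc = EPC.search(log)
    panic = PANIC.search(log)
    return {
        "drops_per_sm": dict(drops),
        "fault_function": epc.group(1) if epc else None,
        "fault_module": epc.group(2) if epc else None,
        "uptime_at_panic_s": float(panic.group(1)) if panic else None,
    }


if __name__ == "__main__":
    for key, value in summarize(SAMPLE).items():
        print(key, "=", value)
```

Run against the full log, this would show that almost all of the disassociations come from a single SM (78:8a:20:a2:cd:74) and that the panic happened in tdd_maptx_get_max_gpf_duration in the tdd_umac module. Whether that SM is related to the crash is not something the script can tell you.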

Hi,

What firmware version is the AP running?

Thank you.

I believe he said in the original post that it is 3.5.1

Right. I'm sorry, I missed that.

It makes sense to try the latest 3.5.5 FW.

Thank you.

Thank you, I will try it. It has been 2 weeks since I ran its cable into a patch panel, installed a new power supply, and uploaded the 3.5.2 firmware. It was OK after that, but now I see 20-minute uptimes again.

I will try 3.5.5 and let you know.