Opened 5 years ago

Last modified 22 months ago

#13681 reopened defect

ar71xx: wr741nd wifi crash after some time

Reported by: musti@…
Owned by: developers
Priority: high
Milestone: Barrier Breaker 14.07
Component: packages
Version: Trunk
Keywords:
Cc:

Description

The WiFi connection on some wr741nd routers dies after a few days. This happens on fewer than 1% of the 100+ nodes I am running and seems to depend on the location or on the specific hardware unit.

The symptoms are simple: iw wlan0 station dump shows no active connections, either mesh or AP. As far as I was able to determine, the SSID is sometimes visible but clients cannot connect; otherwise it is not visible at all.
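
A rough way to check for this symptom from a script is to count the station entries that iw reports. The sketch below assumes that each associated peer starts a line beginning with "Station" in the iw output; count_stations is a hypothetical helper name, not an existing tool. An empty list alone does not prove a crash, since a healthy node may simply have no clients at that moment.

```shell
#!/bin/sh
# Count associated peers from `iw dev <iface> station dump` output on stdin.
# count_stations is a hypothetical helper, not part of iw or OpenWrt.
count_stations() {
    # grep -c prints 0 (and exits non-zero) when nothing matches,
    # so the exit status is masked to keep callers using `set -e` working.
    grep -c '^Station ' || true
}

# Example on a router (wlan0 as in the report):
#   iw dev wlan0 station dump | count_stations
```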

As requested in #13156, I am attaching debug outputs from a node where wifi has crashed.

The problem is present in AA 12.09 and trunk.

/sys/kernel/debug/ieee80211/phy0/ath9k:
base_eeprom
modal_eeprom
misc
interrupt
recv
xmit
reset
queues
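
For anyone who wants to collect the same information, the following is a minimal sketch that prints every file in the ath9k debugfs directory, framed by header and footer lines like those used in the dumps pasted in comment:8. The helper name dump_ath9k_debug is an assumption for illustration; it is not the tool that produced the attached archives. Debugfs has to be mounted for the default path to exist.

```shell
#!/bin/sh
# Print each regular file in an ath9k debugfs directory with a header and
# an "end" footer. dump_ath9k_debug is a hypothetical helper name.
dump_ath9k_debug() {
    dir="${1:-/sys/kernel/debug/ieee80211/phy0/ath9k}"
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        # The dashes are passed as arguments so the format string does not
        # start with "-", which some printf implementations treat as an option.
        printf '%s %s %s\n' "------------------" "$(basename "$f")" "------------------------------"
        cat "$f" 2>/dev/null
        printf '\n%s %s %s\n\n' "------------------" "end" "------------------------------"
    done
}

# Example on a router:
#   dump_ath9k_debug > /tmp/ath9k-debug.txt
```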

Attachments (6)

gerbiceva-49-wificrash-dump.tar.gz (2.3 KB) - added by musti@… 5 years ago.
dump of router where wifi has crashed - wr741nd
tdk-13-test-wificrashdump.tar.gz (2.2 KB) - added by musti@… 5 years ago.
wr741nd v4 crash after about 5h of video stream at 2Mbps
debug-info.tar.gz (188 bytes) - added by me@… 5 years ago.
ar71xx wifi debug info after wifi crash
fri-wificrashdump.tar.gz (2.4 KB) - added by musti@… 4 years ago.
Crash report from wr741nd v4
WasabiNet-cacao-2014-03-20-2240-r39928.tar.gz (3.7 KB) - added by ben@… 4 years ago.
Dump of /sys/kernel/debug/ieee80211/phy0/ath9k/ from UBNT Nano Loco M2 running AA r39928
301-aggregation-tx-lockup.patch (1.3 KB) - added by ben@… 4 years ago.
Backport of trunk changeset 41815 to AA

Change History (52)

Changed 5 years ago by musti@…

dump of router where wifi has crashed - wr741nd

comment:1 Changed 5 years ago by mmitar@…

CCing.

Changed 5 years ago by musti@…

wr741nd v4 crash after about 5h of video stream at 2Mbps

comment:2 follow-up: Changed 5 years ago by k@…

CCing.

Changed 5 years ago by me@…

ar71xx wifi debug info after wifi crash

comment:3 in reply to: ↑ 2 Changed 5 years ago by me@…

I confirm this bug on an ar71xx wr720n router. The WiFi SSID stops being broadcast one or two days after a reboot. After a reboot or "iwlist wlan0 scan", the SSID is broadcast again.

root@Xiaofu-Router:~# cat /etc/config/wireless 

config wifi-device 'radio0'
        option type 'mac80211'
        option hwmode '11ng'
        option path 'platform/ar933x_wmac'
        list ht_capab 'SHORT-GI-20'
        list ht_capab 'SHORT-GI-40'
        list ht_capab 'RX-STBC1'
        list ht_capab 'DSSS_CCK-40'
        option txpower '27'
        option noscan '1'
        option country 'US'
        option htmode 'HT40-'
        option disabled '0'
        option channel '8'

config wifi-iface
        option device 'radio0'
        option mode 'ap'
        option ssid 'Xiaofu-wifi'
        option network 'lan'
        option wmm '0'
        option encryption 'psk2+ccmp'
        option key '******'

comment:4 Changed 5 years ago by kisssandoradam@…

This bug still exists on the TP-LINK TL-WR-1043ND routers too. My router is a 1043ND v1.8 and I always have to reboot it to get wifi back, usually 3-4 times a week. It's a disaster: no one can tell me how to fix this or what the underlying problem is, they just say "please try latest trunk". It doesn't help. The bug is not fixed. (It already exists in the official firmware too.) I'm seriously thinking of selling the router and buying another one that has stable wifi (even if it doesn't run custom firmware).

comment:5 Changed 4 years ago by nbd

please try latest trunk or AA

comment:6 Changed 4 years ago by musti@…

The problem appears to be partially resolved; it seems to occur less often on some nodes. However, since this patch (upgrade to latest AA) the wifi performance is worse: loss of 1480-byte packets has increased since the upgrade (https://nodes.wlan-si.net/graphs/25843/month/). This is observed on all the nodes with the upgrade.

comment:7 Changed 4 years ago by Mitar

Are you sure these packets go over WiFi? They are measured from a central location, so the packet loss may be on other links to the node. Please test this independently, or at least compare graphs of only the last WiFi path segment.

(BTW, you should attach the image, as it changes over time.)

comment:8 Changed 4 years ago by xuming_lee@…

Latest AA, router: wr703n, the problem still remains. Debug info:

------------------ ani ------------------------------
            ANI: ENABLED
      ANI RESET: 5
        SPUR UP: 15954
      SPUR DOWN: 15954
 OFDM WS-DET ON: 0
OFDM WS-DET OFF: 0
     MRC-CCK ON: 0
    MRC-CCK OFF: 0
    FIR-STEP UP: 1064
  FIR-STEP DOWN: 1063
 INV LISTENTIME: 0
    OFDM ERRORS: 70010984
     CCK ERRORS: 7919411

------------------ end ------------------------------

------------------ base_eeprom ------------------------------
      EEPROM Version :          2
          RegDomain1 :          0
          RegDomain2 :         31
             TX Mask :          1
             RX Mask :          1
          Allow 5GHz :          0
          Allow 2GHz :          1
   Disable 2GHz HT20 :          0
   Disable 2GHz HT40 :          0
   Disable 5Ghz HT20 :          0
   Disable 5Ghz HT40 :          0
          Big Endian :          0
           RF Silent :          0
           BT option :          0
          Device Cap :          0
         Device Type :          4
  Power Table Offset :          0
        Tuning Caps1 :         96
        Tuning Caps2 :          0
 Enable Tx Temp Comp :          1
 Enable Tx Volt Comp :          0
   Enable fast clock :          1
     Enable doubling :          1
  Internal regulator :          1
        Enable Paprd :          1
     Driver Strength :          0
          Quick Drop :          0
   Chain mask Reduce :          0
   Write enable Gpio :          3
   WLAN Disable Gpio :          0
       WLAN LED Gpio :          8
 Rx Band Select Gpio :        255
             Tx Gain :          1
             Rx Gain :          1
              SW Reg :          0
          MacAddress : 00:03:7f:be:f1:f5

------------------ end ------------------------------

------------------ chanbw ------------------------------
0x00000000

------------------ end ------------------------------

------------------ diag ------------------------------
0x00000000

------------------ end ------------------------------

------------------ diversity ------------------------------
0

------------------ end ------------------------------

------------------ dma ------------------------------
Raw DMA Debug values:

0: 88888888 1: 00000000 2: 12249249 3: 00000000 
4: 00000000 5: 00000000 6: 0007274c 7: 00028000 

Num QCU: chain_st fsp_ok fsp_st DCU: chain_st
 0           0      1      1            0
 1           0      1      1            0
 2           0      1      1            0
 3           0      1      1            0
 4           0      1      1            0
 5           0      1      1            0
 6           0      1      1            0
 7           0      1      1            0
 8           0      0      1            0
 9           0      0      1            0

qcu_stitch state:    0    qcu_fetch state:         0
qcu_complete state:  0    dcu_complete state:      0
dcu_arb state:       0    dcu_fp state:            0
chan_idle_dur:     211    chan_idle_dur_valid:     1
txfifo_valid_0:      0    txfifo_valid_1:          0
txfifo_dcu_num_0:    9    txfifo_dcu_num_1:        3
pcu observe: 0x2880
AR_CR: 0xc

------------------ end ------------------------------

------------------ dump_nfcal ------------------------------
Channel Noise Floor : -95
Chain | privNF | # Readings | NF Readings
 0       -118    5               -118 -118 -118 -118 -118

------------------ end ------------------------------

------------------ gpio_mask ------------------------------
0

------------------ end ------------------------------

------------------ gpio_val ------------------------------
0

------------------ end ------------------------------

------------------ ignore_extcca ------------------------------
N

------------------ end ------------------------------

------------------ interrupt ------------------------------
                 RXLP:    2864985
                 RXHP:          0
              WATHDOG:          0
                RXEOL:          5
                RXORN:          0
                   TX:    2129512
                TXURN:          0
                  MIB:          0
                RXPHY:          0
                RXKCM:          0
                 SWBA:    6719230
                BMISS:          0
                  BNR:          0
                  CST:       1521
                  GTT:        229
                  TIM:          0
               CABEND:          0
             DTIMSYNC:          0
                 DTIM:          0
               TSFOOR:          0
                  MCI:          0
             GENTIMER:          0
                TOTAL:   11661525
SYNC_CAUSE stats:
             Sync-All:          0
              RTC-IRQ:          0
              MAC-IRQ:          0
EEPROM-Illegal-Access:          0
          APB-Timeout:          0
    PCI-Mode-Conflict:          0
          HOST1-Fatal:          0
           HOST1-Perr:          0
       TRCV-FIFO-Perr:          0
          RADM-CPL-EP:          0
  RADM-CPL-DLLP-Abort:          0
   RADM-CPL-TLP-Abort:          0
    RADM-CPL-ECRC-Err:          0
     RADM-CPL-Timeout:          0
    Local-Bus-Timeout:          0
            PM-Access:          0
            MAC-Awake:          0
           MAC-Asleep:          0
     MAC-Sleep-Access:          0

------------------ end ------------------------------

------------------ misc ------------------------------
BSSID: 00:00:00:00:00:00
BSSID-MASK: fd:ff:ff:ff:ff:ff
OPMODE: AP
RXFILTER: 0xc497 UCAST MCAST BCAST BEACON PROBEREQ COMP_BAR PSPOLL MCAST_BCAST_ALL
INTERRUPT-MASK: 0xf0010473 SWBA CST RX RXHP RXLP BB_WATCHDOG
VIF-COUNTS: AP: 2 STA: 0 MESH: 0 WDS: 0 ADHOC: 0 TOTAL: 2 BEACON-VIF: 2

------------------ end ------------------------------

------------------ modal_eeprom ------------------------------
   2GHz modal Header :
 Chain0 Ant. Control :        336
 Chain1 Ant. Control :        336
 Chain2 Ant. Control :        336
 Ant. Common Control :        272
Ant. Common Control2 :     139810
           Ant. Gain :          0
       Switch Settle :         44
    Chain0 xatten1DB :          0
    Chain1 xatten1DB :          0
    Chain2 xatten1DB :          0
Chain0 xatten1Margin :          0
Chain1 xatten1Margin :          0
Chain2 xatten1Margin :          0
          Temp Slope :         40
          Volt Slope :          0
      spur Channels0 :        164
      spur Channels1 :          0
      spur Channels2 :          0
      spur Channels3 :          0
      spur Channels4 :          0
 Chain0 NF Threshold :         -1
 Chain1 NF Threshold :          0
 Chain2 NF Threshold :          0
          Quick Drop :          0
       txEndToXpaOff :          0
      xPA Bias Level :          0
  txFrameToDataStart :         14
       txFrameToPaOn :         14
      txFrameToXpaOn :         14
              txClip :          3
    ADC Desired size :        -30
   5GHz modal Header :
 Chain0 Ant. Control :          0
 Chain1 Ant. Control :          0
 Chain2 Ant. Control :          0
 Ant. Common Control :        272
Ant. Common Control2 :     139810
           Ant. Gain :          0
       Switch Settle :         45
    Chain0 xatten1DB :          0
    Chain1 xatten1DB :          0
    Chain2 xatten1DB :          0
Chain0 xatten1Margin :          0
Chain1 xatten1Margin :          0
Chain2 xatten1Margin :          0
          Temp Slope :         68
          Volt Slope :          0
      spur Channels0 :          0
      spur Channels1 :          0
      spur Channels2 :          0
      spur Channels3 :          0
      spur Channels4 :          0
 Chain0 NF Threshold :         -1
 Chain1 NF Threshold :          0
 Chain2 NF Threshold :          0
          Quick Drop :          0
       txEndToXpaOff :          0
      xPA Bias Level :          0
  txFrameToDataStart :         14
       txFrameToPaOn :         14
      txFrameToXpaOn :         14
              txClip :          3
    ADC Desired size :        -30

------------------ end ------------------------------

------------------ paprd ------------------------------
N

------------------ end ------------------------------

------------------ qlen_be ------------------------------
123

------------------ end ------------------------------

------------------ qlen_bk ------------------------------
123

------------------ end ------------------------------

------------------ qlen_vi ------------------------------
123

------------------ end ------------------------------

------------------ qlen_vo ------------------------------
123

------------------ end ------------------------------

------------------ queues ------------------------------
(VO):  qnum: 0 qdepth:  0 ampdu-depth:  0 pending:   0 stopped: 0
(VI):  qnum: 1 qdepth:  0 ampdu-depth:  0 pending:   0 stopped: 0
(BE):  qnum: 2 qdepth:  0 ampdu-depth:  0 pending:   0 stopped: 0
(BK):  qnum: 3 qdepth:  0 ampdu-depth:  0 pending:   0 stopped: 0
(CAB): qnum: 8 qdepth:  0 ampdu-depth:  0 pending:   0 stopped: 0

------------------ end ------------------------------

------------------ recv ------------------------------
               CRC ERR :     390105
       DECRYPT CRC ERR :          0
               PHY ERR :      24778
               MIC ERR :          0
     PRE-DELIM CRC ERR :          0
    POST-DELIM CRC ERR :      67897
      DECRYPT BUSY ERR :          0
         RX-LENGTH-ERR :          0
            RX-OOM-ERR :          0
           RX-RATE-ERR :          0
     RX-TOO-MANY-FRAGS :          0
          UNDERRUN ERR :          0
            TIMING ERR :          0
            PARITY ERR :          0
              RATE ERR :          0
            LENGTH ERR :          0
             RADAR ERR :          0
           SERVICE ERR :          0
               TOR ERR :          0
       OFDM-TIMING ERR :          0
OFDM-SIGNAL-PARITY ERR :          0
         OFDM-RATE ERR :          0
       OFDM-LENGTH ERR :          0
   OFDM-POWER-DROP ERR :          0
      OFDM-SERVICE ERR :          0
      OFDM-RESTART ERR :       2482
   FALSE-RADAR-EXT ERR :          0
        CCK-TIMING ERR :          0
    CCK-HEADER-CRC ERR :          0
          CCK-RATE ERR :          0
       CCK-SERVICE ERR :          0
       CCK-RESTART ERR :      22296
        CCK-LENGTH ERR :          0
    CCK-POWER-DROP ERR :          0
            HT-CRC ERR :          0
         HT-LENGTH ERR :          0
           HT-RATE ERR :          0
           RX-Pkts-All :    2938963
          RX-Bytes-All :  903117797
            RX-Beacons :    2264704
              RX-Frags :      25936
           RX-Spectral :          0

------------------ end ------------------------------

------------------ reset ------------------------------
    Baseband Hang:  0
Baseband Watchdog:  0
   Fatal HW Error:  0
      TX HW error:  0
     TX Path Hang:  0
      PLL RX Hang:  0
        MCI Reset:  0

------------------ end ------------------------------

------------------ rx_chainmask ------------------------------
0x00000001

------------------ end ------------------------------

------------------ spectral_count ------------------------------
8

------------------ end ------------------------------

------------------ spectral_fft_period ------------------------------
15

------------------ end ------------------------------

------------------ spectral_period ------------------------------
255

------------------ end ------------------------------

------------------ spectral_scan0 ------------------------------

------------------ end ------------------------------

------------------ spectral_scan_ctl ------------------------------
disable
------------------ end ------------------------------

------------------ spectral_short_repeat ------------------------------
1

------------------ end ------------------------------

------------------ tx_chainmask ------------------------------
0x00000001

------------------ end ------------------------------

------------------ xmit ------------------------------
                            BE         BK        VI        VO

MPDUs Queued:             2140         14        30    146404
MPDUs Completed:          3878         51       122    145804
MPDUs XRetried:             87          0         0      1017
Aggregates:               1672          0        26         0
AMPDUs Queued HW:            0          0         0         0
AMPDUs Queued SW:       303552         82      2728       417
AMPDUs Completed:       301587         45      2636         0
AMPDUs Retried:           1083          0        11         0
AMPDUs XRetried:           140          0         0         0
TXERR Filtered:             23          0         0         0
FIFO Underrun:               0          0         0         0
TXOP Exceeded:               0          0         0         0
TXTIMER Expiry:              0          0         0         0
DESC CFG Error:              0          0         0         0
DATA Underrun:               0          0         0         0
DELIM Underrun:              0          0         0         0
TX-Pkts-All:            305692         96      2758    146821
TX-Bytes-All:        269254863      19219   3460387  24237487
HW-put-tx-buf:          303054         96      2723    146667
HW-tx-start:                 0          0         0         0
HW-tx-proc-desc:        303054         96      2723    146821
TX-Failed:                   0          0         0         0

------------------ end ------------------------------

Debug output again after a while:

------------------ ani ------------------------------
            ANI: ENABLED
      ANI RESET: 5
        SPUR UP: 16050
      SPUR DOWN: 16050
 OFDM WS-DET ON: 0
OFDM WS-DET OFF: 0
     MRC-CCK ON: 0
    MRC-CCK OFF: 0
    FIR-STEP UP: 1160
  FIR-STEP DOWN: 1159
 INV LISTENTIME: 0
    OFDM ERRORS: 70860486
     CCK ERRORS: 7921046

------------------ end ------------------------------

------------------ base_eeprom ------------------------------
      EEPROM Version :          2
          RegDomain1 :          0
          RegDomain2 :         31
             TX Mask :          1
             RX Mask :          1
          Allow 5GHz :          0
          Allow 2GHz :          1
   Disable 2GHz HT20 :          0
   Disable 2GHz HT40 :          0
   Disable 5Ghz HT20 :          0
   Disable 5Ghz HT40 :          0
          Big Endian :          0
           RF Silent :          0
           BT option :          0
          Device Cap :          0
         Device Type :          4
  Power Table Offset :          0
        Tuning Caps1 :         96
        Tuning Caps2 :          0
 Enable Tx Temp Comp :          1
 Enable Tx Volt Comp :          0
   Enable fast clock :          1
     Enable doubling :          1
  Internal regulator :          1
        Enable Paprd :          1
     Driver Strength :          0
          Quick Drop :          0
   Chain mask Reduce :          0
   Write enable Gpio :          3
   WLAN Disable Gpio :          0
       WLAN LED Gpio :          8
 Rx Band Select Gpio :        255
             Tx Gain :          1
             Rx Gain :          1
              SW Reg :          0
          MacAddress : 00:03:7f:be:f1:f5

------------------ end ------------------------------

------------------ chanbw ------------------------------
0x00000000

------------------ end ------------------------------

------------------ diag ------------------------------
0x00000000

------------------ end ------------------------------

------------------ diversity ------------------------------
0

------------------ end ------------------------------

------------------ dma ------------------------------
Raw DMA Debug values:

0: 88888888 1: 00000000 2: 12249249 3: 00000000 
4: 00000000 5: 00000000 6: 00072448 7: 00028000 

Num QCU: chain_st fsp_ok fsp_st DCU: chain_st
 0           0      1      1            0
 1           0      1      1            0
 2           0      1      1            0
 3           0      1      1            0
 4           0      1      1            0
 5           0      1      1            0
 6           0      1      1            0
 7           0      1      1            0
 8           0      0      1            0
 9           0      0      1            0

qcu_stitch state:    0    qcu_fetch state:         0
qcu_complete state:  0    dcu_complete state:      0
dcu_arb state:       0    dcu_fp state:            0
chan_idle_dur:      18    chan_idle_dur_valid:     1
txfifo_valid_0:      0    txfifo_valid_1:          0
txfifo_dcu_num_0:    9    txfifo_dcu_num_1:        3
pcu observe: 0x2880
AR_CR: 0xc

------------------ end ------------------------------

------------------ dump_nfcal ------------------------------
Channel Noise Floor : -95
Chain | privNF | # Readings | NF Readings
 0       -118    5               -118 -118 -118 -118 -118

------------------ end ------------------------------

------------------ gpio_mask ------------------------------
0

------------------ end ------------------------------

------------------ gpio_val ------------------------------
0

------------------ end ------------------------------

------------------ ignore_extcca ------------------------------
N

------------------ end ------------------------------

------------------ interrupt ------------------------------
                 RXLP:    2865255
                 RXHP:          0
              WATHDOG:          0
                RXEOL:          5
                RXORN:          0
                   TX:    2139446
                TXURN:          0
                  MIB:          0
                RXPHY:          0
                RXKCM:          0
                 SWBA:    6758956
                BMISS:          0
                  BNR:          0
                  CST:       1521
                  GTT:        229
                  TIM:          0
               CABEND:          0
             DTIMSYNC:          0
                 DTIM:          0
               TSFOOR:          0
                  MCI:          0
             GENTIMER:          0
                TOTAL:   11711453
SYNC_CAUSE stats:
             Sync-All:          0
              RTC-IRQ:          0
              MAC-IRQ:          0
EEPROM-Illegal-Access:          0
          APB-Timeout:          0
    PCI-Mode-Conflict:          0
          HOST1-Fatal:          0
           HOST1-Perr:          0
       TRCV-FIFO-Perr:          0
          RADM-CPL-EP:          0
  RADM-CPL-DLLP-Abort:          0
   RADM-CPL-TLP-Abort:          0
    RADM-CPL-ECRC-Err:          0
     RADM-CPL-Timeout:          0
    Local-Bus-Timeout:          0
            PM-Access:          0
            MAC-Awake:          0
           MAC-Asleep:          0
     MAC-Sleep-Access:          0

------------------ end ------------------------------

------------------ misc ------------------------------
BSSID: 00:00:00:00:00:00
BSSID-MASK: fd:ff:ff:ff:ff:ff
OPMODE: AP
RXFILTER: 0xc497 UCAST MCAST BCAST BEACON PROBEREQ COMP_BAR PSPOLL MCAST_BCAST_ALL
INTERRUPT-MASK: 0xf0010473 SWBA CST RX RXHP RXLP BB_WATCHDOG
VIF-COUNTS: AP: 2 STA: 0 MESH: 0 WDS: 0 ADHOC: 0 TOTAL: 2 BEACON-VIF: 2

------------------ end ------------------------------

------------------ modal_eeprom ------------------------------
   2GHz modal Header :
 Chain0 Ant. Control :        336
 Chain1 Ant. Control :        336
 Chain2 Ant. Control :        336
 Ant. Common Control :        272
Ant. Common Control2 :     139810
           Ant. Gain :          0
       Switch Settle :         44
    Chain0 xatten1DB :          0
    Chain1 xatten1DB :          0
    Chain2 xatten1DB :          0
Chain0 xatten1Margin :          0
Chain1 xatten1Margin :          0
Chain2 xatten1Margin :          0
          Temp Slope :         40
          Volt Slope :          0
      spur Channels0 :        164
      spur Channels1 :          0
      spur Channels2 :          0
      spur Channels3 :          0
      spur Channels4 :          0
 Chain0 NF Threshold :         -1
 Chain1 NF Threshold :          0
 Chain2 NF Threshold :          0
          Quick Drop :          0
       txEndToXpaOff :          0
      xPA Bias Level :          0
  txFrameToDataStart :         14
       txFrameToPaOn :         14
      txFrameToXpaOn :         14
              txClip :          3
    ADC Desired size :        -30
   5GHz modal Header :
 Chain0 Ant. Control :          0
 Chain1 Ant. Control :          0
 Chain2 Ant. Control :          0
 Ant. Common Control :        272
Ant. Common Control2 :     139810
           Ant. Gain :          0
       Switch Settle :         45
    Chain0 xatten1DB :          0
    Chain1 xatten1DB :          0
    Chain2 xatten1DB :          0
Chain0 xatten1Margin :          0
Chain1 xatten1Margin :          0
Chain2 xatten1Margin :          0
          Temp Slope :         68
          Volt Slope :          0
      spur Channels0 :          0
      spur Channels1 :          0
      spur Channels2 :          0
      spur Channels3 :          0
      spur Channels4 :          0
 Chain0 NF Threshold :         -1
 Chain1 NF Threshold :          0
 Chain2 NF Threshold :          0
          Quick Drop :          0
       txEndToXpaOff :          0
      xPA Bias Level :          0
  txFrameToDataStart :         14
       txFrameToPaOn :         14
      txFrameToXpaOn :         14
              txClip :          3
    ADC Desired size :        -30

------------------ end ------------------------------

------------------ paprd ------------------------------
N

------------------ end ------------------------------

------------------ qlen_be ------------------------------
123

------------------ end ------------------------------

------------------ qlen_bk ------------------------------
123

------------------ end ------------------------------

------------------ qlen_vi ------------------------------
123

------------------ end ------------------------------

------------------ qlen_vo ------------------------------
123

------------------ end ------------------------------

------------------ queues ------------------------------
(VO):  qnum: 0 qdepth:  0 ampdu-depth:  0 pending:   0 stopped: 0
(VI):  qnum: 1 qdepth:  0 ampdu-depth:  0 pending:   0 stopped: 0
(BE):  qnum: 2 qdepth:  0 ampdu-depth:  0 pending:   0 stopped: 0
(BK):  qnum: 3 qdepth:  0 ampdu-depth:  0 pending:   0 stopped: 0
(CAB): qnum: 8 qdepth:  0 ampdu-depth:  0 pending:   0 stopped: 0

------------------ end ------------------------------

------------------ recv ------------------------------
               CRC ERR :     390375
       DECRYPT CRC ERR :          0
               PHY ERR :      24788
               MIC ERR :          0
     PRE-DELIM CRC ERR :          0
    POST-DELIM CRC ERR :      67897
      DECRYPT BUSY ERR :          0
         RX-LENGTH-ERR :          0
            RX-OOM-ERR :          0
           RX-RATE-ERR :          0
     RX-TOO-MANY-FRAGS :          0
          UNDERRUN ERR :          0
            TIMING ERR :          0
            PARITY ERR :          0
              RATE ERR :          0
            LENGTH ERR :          0
             RADAR ERR :          0
           SERVICE ERR :          0
               TOR ERR :          0
       OFDM-TIMING ERR :          0
OFDM-SIGNAL-PARITY ERR :          0
         OFDM-RATE ERR :          0
       OFDM-LENGTH ERR :          0
   OFDM-POWER-DROP ERR :          0
      OFDM-SERVICE ERR :          0
      OFDM-RESTART ERR :       2492
   FALSE-RADAR-EXT ERR :          0
        CCK-TIMING ERR :          0
    CCK-HEADER-CRC ERR :          0
          CCK-RATE ERR :          0
       CCK-SERVICE ERR :          0
       CCK-RESTART ERR :      22296
        CCK-LENGTH ERR :          0
    CCK-POWER-DROP ERR :          0
            HT-CRC ERR :          0
         HT-LENGTH ERR :          0
           HT-RATE ERR :          0
           RX-Pkts-All :    2939391
          RX-Bytes-All :  903629692
            RX-Beacons :    2264715
              RX-Frags :      26073
           RX-Spectral :          0

------------------ end ------------------------------

------------------ reset ------------------------------
    Baseband Hang:  0
Baseband Watchdog:  0
   Fatal HW Error:  0
      TX HW error:  0
     TX Path Hang:  0
      PLL RX Hang:  0
        MCI Reset:  0

------------------ end ------------------------------

------------------ rx_chainmask ------------------------------
0x00000001

------------------ end ------------------------------

------------------ spectral_count ------------------------------
8

------------------ end ------------------------------

------------------ spectral_fft_period ------------------------------
15

------------------ end ------------------------------

------------------ spectral_period ------------------------------
255

------------------ end ------------------------------

------------------ spectral_scan0 ------------------------------

------------------ end ------------------------------

------------------ spectral_scan_ctl ------------------------------
disable
------------------ end ------------------------------

------------------ spectral_short_repeat ------------------------------
1

------------------ end ------------------------------

------------------ tx_chainmask ------------------------------
0x00000001

------------------ end ------------------------------

------------------ xmit ------------------------------
                            BE         BK        VI        VO

MPDUs Queued:             2140         14        30    146407
MPDUs Completed:          3878         51       122    145804
MPDUs XRetried:             87          0         0      1020
Aggregates:               1672          0        26         0
AMPDUs Queued HW:            0          0         0         0
AMPDUs Queued SW:       303552         82      2728       417
AMPDUs Completed:       301587         45      2636         0
AMPDUs Retried:           1083          0        11         0
AMPDUs XRetried:           140          0         0         0
TXERR Filtered:             23          0         0         0
FIFO Underrun:               0          0         0         0
TXOP Exceeded:               0          0         0         0
TXTIMER Expiry:              0          0         0         0
DESC CFG Error:              0          0         0         0
DATA Underrun:               0          0         0         0
DELIM Underrun:              0          0         0         0
TX-Pkts-All:            305692         96      2758    146824
TX-Bytes-All:        269254863      19219   3460387  24237565
HW-put-tx-buf:          303054         96      2723    146670
HW-tx-start:                 0          0         0         0
HW-tx-proc-desc:        303054         96      2723    146824
TX-Failed:                   0          0         0         0

------------------ end ------------------------------
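
Comparing two such dumps by eye is tedious. The sketch below lists the single-value counters that changed between two saved dumps, assuming each dump uses the "---- name ----" section headers shown above. diff_counters is a hypothetical helper name, and multi-column tables such as xmit are deliberately skipped. For the two dumps above, TX and SWBA in the interrupt section keep advancing, while the BE, BK and VI columns of xmit are identical in both.

```shell
#!/bin/sh
# List counters that differ between two dump files (old first, new second).
# diff_counters is a hypothetical helper; it only handles "NAME: number" lines.
diff_counters() {
    awk '
        # Remember the current section from the "---- name ----" header lines.
        /^-+ [A-Za-z0-9_]+ -+$/ { section = $2; next }
        # Compare only lines that carry exactly one numeric value.
        /^[ \t]*[A-Za-z][^:]*:[ \t]*[0-9]+[ \t]*$/ {
            name = $0
            sub(/:[^:]*$/, "", name)            # drop the ": value" part
            gsub(/^[ \t]+|[ \t]+$/, "", name)   # trim surrounding blanks
            key = section "/" name
            if (pass == 1) { old[key] = $NF; next }
            if ((key in old) && old[key] != $NF)
                printf "%s: %s -> %s (%+d)\n", key, old[key], $NF, $NF - old[key]
        }
    ' pass=1 "$1" pass=2 "$2"
}

# Example:
#   diff_counters dump-first.txt dump-second.txt
```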

comment:9 Changed 4 years ago by musti@…

My tests show that crashes on wr741nd v4 are much less likely; they no longer occur after a few days as before. I need to test for longer to confirm whether this is fixed.

The previously reported higher packet loss may not be caused by this fix.

comment:10 Changed 4 years ago by musti@…

The crashes still occur on wr741nd but less often; they appear to occur only in environments with very high WiFi network density. Please find the log attached.

Changed 4 years ago by musti@…

Crash report from wr741nd v4

comment:11 Changed 4 years ago by anonymous

Any idea how this crash can be detected, so we can at least deploy a watchdog?

comment:12 Changed 4 years ago by nbd

You can grep the queues file for "qdepth: 0 ampdu-depth: 0 pending: 123 stopped: 1"

comment:13 Changed 4 years ago by nbd

I just committed another fix in r38017 that might resolve this issue - please test.

comment:14 Changed 4 years ago by k@…

Could this be backported to AA as well?

comment:15 Changed 4 years ago by nbd

The recent fixes don't apply as-is, but once wifi in trunk is fully stable again, I plan on doing a full backport.

comment:16 Changed 4 years ago by anonymous

I have the same problem on wr1043nd v1

comment:17 Changed 4 years ago by nbd

when you're reporting an issue, please *always* mention the exact openwrt branch revision that you're using. simply saying "i have the same problem on X" is useless to me.

comment:18 Changed 4 years ago by nbd

please try trunk r38249 or later

comment:19 Changed 4 years ago by musti@…

The problem still occurs in r38249

comment:20 Changed 4 years ago by musti@…

This is how our watchdog is done to detect this problem:

#!/bin/sh

FAIL_MATCH="qdepth: 0 ampdu-depth: 0 pending: 123 stopped: 1"
FAIL_FILES="/sys/kernel/debug/ieee80211/phy*/ath9k/queues"
if [ "`grep "${FAIL_MATCH}" ${FAIL_FILES}`" != "" ]; then
  logger "nodewatcher: ath9k freeze detected, rebooting in 30 seconds"
  sleep 30
  logger "nodewatcher: ath9k freeze detected, rebooting"
  reboot
fi

comment:21 Changed 4 years ago by nbd

Committed another fix in r38304, r38305

comment:22 Changed 4 years ago by musti@…

Testing r38305; it will take at least a week to see if there are any improvements. I have a feeling that r38249 has performed better than previous versions: there was a single crash in 7 days.

comment:23 Changed 4 years ago by musti@…

I am happy to report that in the previous 7 days there was no crash, the fix may be assumed to work.

Please backport this to AA, as trunk is too unstable for our use.

comment:24 Changed 4 years ago by JBennett@…

Looking at this issue, too, and just wanted to point out, r38305 is the backport. I'm currently testing AA on an 841ndv7, will report back if I run into the same issue.

comment:25 Changed 4 years ago by ben@…

I've been seeing this ath9k freeze periodically on a UBNT Nanostation M5 acting as a gateway node in a mesh, chipset AR7240. Most recently, it occurred on that node while running AA r38347, after 2 days of uptime. The watchdog script that musti provides does catch this freeze and reboots the device as a workaround.

Just doing "wifi restart" only seems to un-freeze the driver for a few minutes.

root@nsm5-b:~# cat /etc/config/wireless 
config wifi-device  radio0
	option type     mac80211
	option channel  149
	option macaddr	00:XX:XX:XX:XX:XX
	option hwmode	11na
        list 'ht_capab' 'SHORT-GI-40'
        list 'ht_capab' 'TX-STBC'
        list 'ht_capab' 'RX-STBC1'
        list 'ht_capab' 'DSSS_CCK-40'
	option txpower	20
        option 'country' 'US'
        option 'htmode' 'HT40+'
        option beacon_int       1000
        # REMOVE THIS LINE TO ENABLE WIFI:
#       option disabled 0

config wifi-iface wlan0
        option device   radio0
        option network  'mesh'
        option mode 'adhoc'
        option ssid     MyMesh
       option encryption 'none'
root@nsm5-b:~# cat /sys/kernel/debug/ieee80211/phy*/ath9k/queues
(VO):  qnum: 0 qdepth:  0 ampdu-depth:  0 pending:   0 stopped: 0
(VI):  qnum: 1 qdepth:  0 ampdu-depth:  0 pending:   0 stopped: 0
(BE):  qnum: 2 qdepth:  0 ampdu-depth:  0 pending: 124 stopped: 1
(BK):  qnum: 3 qdepth:  0 ampdu-depth:  0 pending:   0 stopped: 0
(CAB): qnum: 8 qdepth:  0 ampdu-depth:  0 pending:   0 stopped: 0

comment:26 Changed 4 years ago by ben@…

Following up that I also see this issue sporadically on UBNT Nanostation Loco M2's running AA r38347. Stuck queue debug quoted below. Note that NSM5's in my previous comment appear to exhibit queue length 124 while NSM2 loco's (and likewise Musti's WR741ND) show queue length 123.

root@WasabiNet-cacao:~# cat /sys/kernel/debug/ieee80211/phy*/ath9k/queues
(VO):  qnum: 0 qdepth:  0 ampdu-depth:  0 pending:   0 stopped: 0
(VI):  qnum: 1 qdepth:  0 ampdu-depth:  0 pending:   0 stopped: 0
(BE):  qnum: 2 qdepth:  0 ampdu-depth:  0 pending: 123 stopped: 1
(BK):  qnum: 3 qdepth:  0 ampdu-depth:  0 pending:   0 stopped: 0
(CAB): qnum: 8 qdepth:  0 ampdu-depth:  0 pending:   0 stopped: 0
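Since the stuck "pending" count differs between devices (123 on some, 124 on others), a check that greps for one literal value can miss the freeze. Below is a minimal sketch of a value-independent check, assuming the queues file keeps the field labels shown in the dumps above. The helper name stuck_queues is hypothetical and not part of ath9k or OpenWrt.

```sh
#!/bin/sh
# Hypothetical helper: read ath9k "queues" text on stdin and print the
# label of every TX queue that is stopped with nothing queued in hardware
# (qdepth 0, ampdu-depth 0) while frames are still pending.
# Exit status is 0 if at least one such queue was found, 1 otherwise.
stuck_queues() {
    awk '
        {
            qdepth = ampdu = pending = stopped = -1
            # Look fields up by label so column spacing does not matter.
            for (i = 1; i < NF; i++) {
                if ($i == "qdepth:")      qdepth  = $(i + 1)
                if ($i == "ampdu-depth:") ampdu   = $(i + 1)
                if ($i == "pending:")     pending = $(i + 1)
                if ($i == "stopped:")     stopped = $(i + 1)
            }
            if (stopped == 1 && qdepth == 0 && ampdu == 0 && pending > 0) {
                print $1
                found = 1
            }
        }
        END { exit(found ? 0 : 1) }
    '
}
```

Usage would be along the lines of: stuck_queues < /sys/kernel/debug/ieee80211/phy0/ath9k/queues. Note that a stopped queue with pending frames may also show up briefly under normal load, so a single positive reading is only a hint.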

comment:27 Changed 4 years ago by nbd

Please test if this also happens on AA with the mac80211 backport package from git://nbd.name/aa-mac80211.git

comment:28 Changed 4 years ago by ben@…

Hi nbd, thank you for the pointer to your updated git repo. I've reflashed the problem devices on my end with AA r39154, and your versions of mac80211 / hostapd. So far so good! I should note the driver freezes for me seem to occur on the order of twice every month, so I need to wait a couple more weeks.

comment:29 Changed 4 years ago by ben@…

Alas, I'm afraid to report that a Nanostation M2 Loco running AA r39154 with the hostapd + mac80211 packages from nbd's repo still appears to have this very intermittent driver freeze. This particular node (which runs on a weekly reboot schedule), has otherwise been fine since reflashing at the time of my previous comment.

This time, the node exhibited stuck queue length 124 instead of 123 as previously. Uptime at the time of failure was only 12-18 hours or so.

root@WasabiNet-cacao:~# cat /sys/kernel/debug/ieee80211/phy*/ath9k/queues
(VO):  qnum: 0 qdepth:  0 ampdu-depth:  0 pending:   0 stopped: 0
(VI):  qnum: 1 qdepth:  0 ampdu-depth:  0 pending:   0 stopped: 0
(BE):  qnum: 2 qdepth:  0 ampdu-depth:  0 pending: 124 stopped: 1
(BK):  qnum: 3 qdepth:  0 ampdu-depth:  0 pending:   0 stopped: 0
(CAB): qnum: 8 qdepth:  0 ampdu-depth:  0 pending:   0 stopped: 0

comment:30 Changed 4 years ago by musti@…

I can state that crashes occur less often with AA r39154 or later. On wr741nd v4 none has been spotted in the past 3 weeks, but I need to wait longer to draw any conclusions.

comment:31 Changed 4 years ago by Dmitry Kanisev

Hello, I have : OpenWrt Attitude Adjustment 12.09-beta | Load: 0.03 0.06 0.15 | Powered by LuCI Trunk (trunk+svn9220)| TP-Link WR741ND

And I have a problem: the WiFi signal disappears a few days after a reboot.
From the comments, I understand that I need to apply the update r39154, right? How do I apply the update?

Regards
Dmitry

comment:32 Changed 4 years ago by anonymous

Kernel Version 3.3.8

comment:33 Changed 4 years ago by Dmitry Kanisev

I just reflashed the router to:
OpenWrt Attitude Adjustment 12.09 | Load: 0.16 0.12 0.05 | Powered by LuCI 0.11.1 Release (0.11.1) | Kernel Version 3.3.8

I hope it will help me with my problem

Dmitry

comment:34 Changed 4 years ago by Dmitry Kanisev

Unfortunately the new firmware did not help. My WiFi signal was lost (disconnected) after 2 days. Any ideas?

comment:35 Changed 4 years ago by nbd

please try trunk r39688 or newer

comment:36 Changed 4 years ago by musti@…

AA r39154 had a single crash in 22 days, which is a significant improvement; however, the problem persists.

Please create a backport to AA for your newest patch, so we can test it again.

comment:37 Changed 4 years ago by nbd

You can find the backport of the latest trunk mac80211 package to AA in my git repo at git://nbd.name/aa-mac80211.git

comment:38 Changed 4 years ago by ben@…

I reflashed a bunch of UBNT Nanostation Loco M2's with AA r39154 and the most recent hostapd + mac80211 backport from nbd's repo, but unfortunately the issue seems to have worsened! Much more frequent ath9k queue lockups.

Quoting from my crash recovery script (adapted from the nodewatcher one mentioned above) which periodically polls /sys/kernel/debug/ieee80211/phy0/ath9k/queues:

(BE):  qnum: 2 qdepth:  0 ampdu-depth:  0 pending: 123 stopped: 1

I will revert the mac80211 package on these radios back to the version from nbd's repo circa Dec 2013 to confirm.

comment:39 Changed 4 years ago by nbd

Please try current AA

comment:40 Changed 4 years ago by ben@…

I re-flashed several of the Loco M2s that see the most traffic with AA r39928. Looks like the stuck queue problem still persists, occurring perhaps a couple times per day during heavy traffic, but the ath9k driver is now recovering after a few mins (which is a recent addition to ath9k, I'm guessing).

Indeed, I saw such a freeze happen in realtime - all wifi VIFs stopped receiving packets for a bit. I'm attaching a tarball of the files retrieved from /sys/kernel/debug/ieee80211/phy0/ath9k/ (minus regdump).

During the freeze, I saw this in queues:

(BE):  qnum: 2 qdepth:  0 ampdu-depth:  0 pending: 123 stopped: 1

Taking a tip from the last comment in #11862, I tried running "iw scan wlan0" a few times, but no apparent effect on un-sticking the freeze, besides possibly incrementing the queue +1. This is also with TKIP disabled, i.e. encryption = psk2+aes.

(BE):  qnum: 2 qdepth:  0 ampdu-depth:  0 pending: 124 stopped: 1

So excellent progress in that the stuck queue doesn't require reboot to recover from, but unfortunately, still pausing all wireless traffic for a few mins. (Or maybe now the behavior I see is solely that related to #11862.) Oddly enough, I did not see the dreaded "Failed to stop TX DMA" dmesg line during this instance, but I have seen it in previous dmesg entries for this revision of AA, on this radio.
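Given that the driver now appears to recover by itself after a few minutes, a watchdog that reboots on the first stuck reading would reboot more often than needed. Here is a rough sketch of a check that only reports a freeze when the stuck state persists across several polls. The function names, the poll count and the interval are all assumptions for illustration, and the pattern is only a guess at what "stuck" looks like based on the queue lines quoted in this ticket.

```sh
#!/bin/sh
# Glob of queues files to inspect (same location as in the dumps above).
QUEUES_FILES="/sys/kernel/debug/ieee80211/phy*/ath9k/queues"

# Hypothetical check: succeed if any queue is stopped with nothing in
# hardware but a non-zero pending count. " +" tolerates column padding.
queue_stuck() {
    grep -Eq 'qdepth: +0 ampdu-depth: +0 pending: +[1-9][0-9]* stopped: 1' \
        $QUEUES_FILES 2>/dev/null
}

# persistently_stuck POLLS INTERVAL CMD...
# Run CMD up to POLLS times, INTERVAL seconds apart. Return 0 only if CMD
# succeeded on every poll; return 1 as soon as one poll comes back clean.
persistently_stuck() {
    polls=$1
    interval=$2
    shift 2
    n=0
    while [ "$n" -lt "$polls" ]; do
        "$@" || return 1
        n=$((n + 1))
        if [ "$n" -lt "$polls" ]; then
            sleep "$interval"
        fi
    done
    return 0
}
```

A cron job could then run something like: persistently_stuck 5 60 queue_stuck && reboot. With those (arbitrary) values the node is rebooted only if the queue still looks stuck after about four minutes, which leaves the driver's own recovery a chance to kick in first.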

Changed 4 years ago by ben@…

Dump of /sys/kernel/debug/ieee80211/phy0/ath9k/ from UBNT Nano Loco M2 running AA r39928

comment:41 Changed 4 years ago by jow

  • Milestone changed from Attitude Adjustment 12.09 to Barrier Breaker 14.07

Milestone Attitude Adjustment 12.09 deleted

comment:42 Changed 4 years ago by nbd

  • Resolution set to fixed
  • Status changed from new to closed

fixed in current versions

comment:43 Changed 4 years ago by ben@…

I've started testing the recent fixes to ath9k out on Nanostation M2's. So far so good!

mac80211 for Attitude Adjustment didn't yet get this changeset (which looks related):
https://dev.openwrt.org/changeset/41815/

I'm attaching my patch to backport this changeset to AA. Place this in package/mac80211/patches in your AA build tree.
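For anyone unsure how to drop the patch in, here is a small sketch. It assumes the mac80211 patch directory sits at package/mac80211/patches relative to the root of the AA build tree; the function name and both arguments are hypothetical.

```sh
#!/bin/sh
# install_mac80211_patch TREE PATCH
# Copy PATCH into the mac80211 patch directory of the build tree at TREE.
# Assumption: the directory is TREE/package/mac80211/patches.
install_mac80211_patch() {
    tree=$1
    patch=$2
    dest="$tree/package/mac80211/patches"
    if [ ! -f "$patch" ]; then
        echo "patch file not found: $patch" >&2
        return 1
    fi
    if [ ! -d "$dest" ]; then
        echo "no such directory: $dest" >&2
        return 1
    fi
    cp "$patch" "$dest/"
}
```

For example: install_mac80211_patch ~/openwrt-aa 301-aggregation-tx-lockup.patch (the tree path is made up). Afterwards the package should be rebuildable from the tree root with "make package/mac80211/clean" followed by "make package/mac80211/compile V=99", before building the image as usual.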

Changed 4 years ago by ben@…

Backport of trunk changeset 41815 to AA

comment:44 Changed 3 years ago by yangyang

  • Resolution fixed deleted
  • Status changed from closed to reopened

I still have this problem on the latest trunk version, confirmed on 4530R and 3420v1.

The SSID is broadcast again after a reboot, but it doesn't last long.

comment:45 Changed 3 years ago by nahumoz@…

I have a router with Atheros AR9132 (to be more exact, buffalo/wzr-hp-g300nh) and I am seeing the same issue. My wifi crashes about once or twice a day. My OpenWrt version: http://downloads.openwrt.org/barrier_breaker/14.07/ar71xx/generic/openwrt-ar71xx-generic-wzr-hp-g300nh-squashfs-sysupgrade.bin

comment:46 Changed 22 months ago by anonymous

How did you guys display that comment #8 output?
