
Opened 7 years ago

Closed 6 years ago

Last modified 4 years ago

#9459 closed defect (fixed)

Kernel bug Wi-Fi brcm47xx on AsusWL500gPv2 (Broadcom)

Reported by: dumghen
Owned by: hauke
Priority: highest
Milestone: Barrier Breaker 14.07
Component: kernel
Version: Trunk
Keywords: broadcom, backfire, kernel, wireless
Cc:

Description

Wireless is enabled in AP mode, without encryption. It worked well for some time.
Here is the dmesg output:

b43-phy0 ERROR: DMA RX buffer too small (len: 7950, buffer: 2352, nr-dropped: 4)
b43-phy0 ERROR: DMA RX buffer too small (len: 37746, buffer: 2352, nr-dropped: 17)
skb_over_panic: text:80cda408 len:2378 put:2378 head:80ec1000 data:80ec1040 tail:0x80ec198a end:0x80ec1980 dev:<NULL>
Kernel bug detected[#1]:
Cpu 0
$ 0   : 00000000 1000f800 0000007c 00000001
$ 4   : 80293498 00002e18 ffffffff 00002e18
$ 8   : 00004000 00000000 00000001 ffffffff
$12   : 0000000f 80253c78 ffffffff 00000009
$16   : 00ec1040 0000092c 80ec1040 81841740
$20   : 80e7ee80 0000003e a0e9e3e0 80cdf978
$24   : 00000002 80166250
$28   : 80ec6000 80ec7db0 00000000 801a3d24
Hi    : 00000000
Lo    : 00000077
epc   : 801a3d24 0x801a3d24
    Not tainted
ra    : 801a3d24 0x801a3d24
Status: 1000f803    KERNEL EXL IE
Cause : 00800024
PrId  : 00029029 (Broadcom BCM3302)
Modules linked in: btusb hci_uart hidp bnep rfcomm sco l2cap bluetooth gspca_sonixj gspca_main usbvideo hid v4l2_common videodev v4l1_compat usb_storage usblp snd_usb_audio snd_usb_lib evdev i2c_dev i2c_core uhci_hcd ohci_hcd nf_nat_tftp nf_conntrack_tftp nf_nat_irc nf_conntrack_irc nf_nat_ftp nf_conntrack_ftp ipt_MASQUERADE iptable_nat nf_nat xt_NOTRACK iptable_raw xt_state nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack ehci_hcd sd_mod ipt_REJECT xt_TCPMSS ipt_LOG xt_comment xt_multiport xt_mac xt_limit iptable_mangle iptable_filter ip_tables xt_tcpudp x_tables ext3 jbd snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_rawmidi snd_seq_device snd_hwdep snd_page_alloc snd soundcore ppp_async vfat fat b43legacy b43 nls_utf8 nls_koi8_r nls_iso8859_2 nls_iso8859_15 nls_iso8859_13 nls_iso8859_1 nls_cp866 nls_cp852 nls_cp850 nls_cp775 nls_cp437 nls_cp1251 nls_cp1250 mac80211 usbcore scsi_mod rfkill nls_base mbcache crc16 crc_ccitt pppoe pppox ppp_generic cfg80211 slhc compat_firmware_class compat input_core arc4 aes_generic deflate ecb cbc switch_robo switch_core diag
Process irq/5-b43 (pid: 1090, threadinfo=80ec6000, task=81a445d0, tls=00000000)
Stack : 00000000 80cda408 0000094a 0000094a 80ec1000 80ec1040 80ec198a 80ec1980
        8026e09c 0000003e a0e9e3e0 80cda408 81a445d0 003d0900 81b7bed0 00000000
        00000000 80001444 80e476e8 1000f801 80cdf978 00010000 00010000 819ee400
        00008000 00010000 80e7ee24 00010000 802ce8c0 00000000 802953b4 80cc72cc
        00010000 802ce8c0 80292f28 8001ce0c 81a445d0 00000000 81b7bed0 00000000
        ...
Call Trace:[<80cda408>] 0x80cda408
[<80cda408>] 0x80cda408
[<80001444>] 0x80001444
[<80cc72cc>] 0x80cc72cc
[<8001ce0c>] 0x8001ce0c
[<8001cef8>] 0x8001cef8
[<80190bf4>] 0x80190bf4
[<80cc7424>] 0x80cc7424
[<80057480>] 0x80057480
[<8000af98>] 0x8000af98
[<80057304>] 0x80057304
[<80057304>] 0x80057304
[<8003db30>] 0x8003db30
[<8000f86c>] 0x8000f86c
[<8003dab4>] 0x8003dab4
[<8000f85c>] 0x8000f85c


Code: afab001c  0c0028ff  afa20020 <0200000d> 08068f4a  00000000  8fbf002c  01201021  03e00008
Disabling lock debugging due to kernel taint
exiting task "irq/5-b43" (1090) is an active IRQ thread (irq 5)
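
A note on reading the skb_over_panic line above: the kernel prints it when skb_put() would move the buffer's tail pointer past its end pointer. The following Python sketch is not part of the original report. The helper name and the keys it returns are made up for illustration, and the field layout is assumed from the single line in this dmesg. It decodes the fields and shows how far the write would have run past the buffer.

```python
import re

# Line copied verbatim from the dmesg in this ticket.
LINE = ("skb_over_panic: text:80cda408 len:2378 put:2378 head:80ec1000 "
        "data:80ec1040 tail:0x80ec198a end:0x80ec1980 dev:<NULL>")


def decode_skb_over_panic(line):
    """Hypothetical helper: decode the pointer fields of an skb_over_panic line.

    Assumes len/put are decimal and head/data/tail/end are hexadecimal,
    with or without a leading 0x, as in the line above.
    """
    fields = dict(re.findall(r"\b(len|put|head|data|tail|end):(\S+)", line))
    put = int(fields["put"])
    head, data, tail, end = (
        int(fields[name], 16) for name in ("head", "data", "tail", "end")
    )
    return {
        "put": put,                    # bytes the driver tried to append
        "headroom": data - head,       # bytes reserved before the data
        "room_after_data": end - data, # bytes available from data to end
        "overrun": tail - end,         # bytes the new tail lies past end
    }


if __name__ == "__main__":
    for key, value in decode_skb_over_panic(LINE).items():
        print(f"{key}: {value}")
```

For the line in this report, the 2378-byte put ends 10 bytes past the end of the buffer (2368 bytes were available after the data pointer).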

Attachments (0)

Change History (19)

comment:1 follow-up: Changed 7 years ago by jow

Please build the kernel with symbol table information and repost the oops.

comment:2 in reply to: ↑ 1 Changed 7 years ago by dumghen

Replying to jow:

Please build the kernel with symbol table information and repost the oops.

Oh, sorry, I forgot to mention that this happened on the Backfire 10.03.1-RC4 build.

comment:3 Changed 7 years ago by hauke

  • Owner changed from developers to hauke
  • Status changed from new to accepted

Could you please try trunk and check whether this error also occurs there? If it does, please build with symbol table information enabled.

comment:4 Changed 7 years ago by hauke

This issue has some duplicates: see #9476, #7438 and the comments on #7366.

Here is a summary of what was reported in the duplicate tickets and comments:

So far we have only received bug reports about this issue for the ASUS WL500GPv2, ASUS WL-520GU and D-Link DIR-320.
It happens after some time (3 hours to 2 days) without any load or anything unusual. In station mode this panic does not occur.

The Call Trace is the following:

Jan  1 04:03:40 OpenWrt user.warn kernel: Call Trace:
Jan  1 04:03:40 OpenWrt user.warn kernel: [<8019bc10>] skb_put+0x74/0x90
Jan  1 04:03:40 OpenWrt user.warn kernel: [<8051a3e8>] b43_dma_rx+0x294/0x378 [b43]
Jan  1 04:03:40 OpenWrt user.warn kernel: [<805072b8>] b43_controller_restart+0x7a8/0x97c [b43]
Jan  1 04:03:40 OpenWrt user.warn kernel: Code: afab001c  0c0028c2  afa20020 <0200000d> 08066f05  00000000  8fbf002c  01201021  03e00008
Jan  1 04:03:40 OpenWrt user.warn kernel: Disabling lock debugging due to kernel taint
Jan  1 04:03:40 OpenWrt user.err kernel: exiting task "irq/5-b43" (763) is an active IRQ thread (irq 5)

After the error occurs, the CPU load rises to 3.0 - 5.0.

There was a discussion about it on the b43 mailing list: http://comments.gmane.org/gmane.linux.drivers.bcm54xx.devel/10674

Sometimes "b43-phy0 ERROR: MAC suspend failed" is shown in the log before this panic.

comment:5 Changed 7 years ago by hauke

Please try trunk with symbol table information enabled and check whether this problem still occurs. Rafał reported ( http://permalink.gmane.org/gmane.linux.drivers.bcm54xx.devel/12001 ) that this should be fixed by http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=c85ce65ecac078ab1a1835c87c4a6319cf74660a . This patch is included in OpenWrt trunk.

comment:6 Changed 7 years ago by Zajec

Sorry, as Larry pointed out, I thought it was a different issue. The recent fix won't help here.

I think Michael may be right: this has something to do with the wireless card firmware. The best option would be to switch to the newest firmware, but this is impossible because of OOM (out of memory) problems (#6907).

My LP-PHY card works fine with the newest firmware, and I didn't reproduce any problems locally. I'll try AP mode in a few days, but I don't expect to reproduce anything here :(

I started some debugging of OOM problems ("Switching to 4.174.64.19 firmware for G-PHY cards?" and "Out of memory problem with newer firmware" threads on b43-dev), but don't have any fix yet. I plan to get back to this after I finish my BCMA work (max 2 weeks I believe).

comment:7 Changed 7 years ago by Zajec

So far we have had problems only with the 478.104 firmware (coming from 4.174.64.19 driver packaged as broadcom-wl-4.178.10.4.tar.bz2). The 410.2160 firmware (from 4.150.10.5, broadcom-wl-4.150.10.5.tar.bz2) was reported as stable.

In #9476 I can see "DMA RX buffer too small" with 410.2160. So it seems this older firmware is not 100% OK as we thought.

I cannot find the b43 debug messages from a WL500gPv2 anywhere. I'd like to see the following line:
Found PHY: Analog X, Type Y, Revision Z
Could someone provide it?

comment:8 Changed 7 years ago by dumghen

Until backfire RC5 is released, I can't try a trunk build because the router is in use. :-(
Sorry...

comment:9 Changed 6 years ago by dumghen

In the new RC5 I'm getting this error in dmesg and the wireless isn't working anymore.
I'm using no encryption with hidden SSID.

b43-phy0 ERROR: MAC suspend failed
b43-phy0 ERROR: MAC suspend failed
b43-phy0 ERROR: MAC suspend failed
b43-phy0 ERROR: MAC suspend failed
b43-phy0 ERROR: MAC suspend failed

comment:10 Changed 6 years ago by nbd

please try the latest version

comment:11 follow-up: Changed 6 years ago by dumghen

The router "broke" after I tried to enable wireless.
I can no longer access the router. Only a re-install helped.
Tried with RC6 and with trunk.

comment:12 in reply to: ↑ 11 Changed 6 years ago by r2d2

Replying to dumghen:

The router "broke" after I tried to enable wireless.
I can no longer access the router. Only a re-install helped.

It seems to me this is fixed in trunk (and already backported to backfire in svn) with #6907.

comment:13 Changed 6 years ago by hauke

The out of memory problem is fixed in trunk and in the backfire branch. This ticket handles another issue. Does this error still occur with recent firmware?

b43-phy0 ERROR: DMA RX buffer too small (len: 7950, buffer: 2352, nr-dropped: 4)
b43-phy0 ERROR: DMA RX buffer too small (len: 37746, buffer: 2352, nr-dropped: 17)
skb_over_panic: text:80cda408 len:2378 put:2378 head:80ec1000 data:80ec104
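
For anyone re-testing, checking a saved log for the messages quoted above can be scripted. The Python sketch below is only an illustration: the function name, the category names and the idea of saving dmesg to a file are assumptions, not something the b43 driver or OpenWrt provides. It counts the b43 error messages seen in this ticket and collects the reported frame length and buffer size from each "DMA RX buffer too small" line.

```python
import re
from collections import Counter

# Message fragments taken from the logs posted in this ticket.
PATTERNS = {
    "dma_rx_too_small": re.compile(
        r"DMA RX buffer too small \(len: (\d+), buffer: (\d+)"
    ),
    "mac_suspend_failed": re.compile(r"MAC suspend failed"),
    "skb_over_panic": re.compile(r"skb_over_panic"),
}


def scan_log(lines):
    """Hypothetical helper: count known b43 error messages in log lines.

    Returns (counts, oversized), where oversized is a list of
    (reported_len, buffer_size) pairs from the DMA RX messages.
    """
    counts = Counter()
    oversized = []
    for line in lines:
        for name, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match is None:
                continue
            counts[name] += 1
            if name == "dma_rx_too_small":
                oversized.append((int(match.group(1)), int(match.group(2))))
    return counts, oversized


# Sample lines copied from this ticket.
SAMPLE = [
    "b43-phy0 ERROR: DMA RX buffer too small (len: 7950, buffer: 2352, nr-dropped: 4)",
    "b43-phy0 ERROR: DMA RX buffer too small (len: 37746, buffer: 2352, nr-dropped: 17)",
    "skb_over_panic: text:80cda408 len:2378 put:2378 head:80ec1000 data:80ec104",
    "b43-phy0 ERROR: MAC suspend failed",
]

if __name__ == "__main__":
    counts, oversized = scan_log(SAMPLE)
    print(dict(counts))
    print(oversized)
```

To check a real log, pass an open file instead of SAMPLE, for example the output of dmesg saved to a file on the router.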

comment:14 Changed 6 years ago by r2d2

My reply referred only to the comment from dumghen. In any case, to help get this tested, I installed trunk snapshot r30850 on a 500gpV2 today for use as a home AP. It has run for 6 hours so far without trouble. I'll write another comment in 2 days with the result. I use psk2 as encryption. I hope this doesn't matter; if it does, please leave a comment.

comment:15 follow-up: Changed 6 years ago by test

test

comment:16 in reply to: ↑ 15 ; follow-up: Changed 6 years ago by r2d2

Replying to test:

test

It's tricky to get a comment accepted without it being denied by the spam protection :(

No problems so far with the 500gpV2 after two days of runtime.

comment:17 in reply to: ↑ 16 Changed 6 years ago by r2d2

Replying to r2d2:

No problems so far with the 500gpV2 after two days of runtime.

Uptime of 12 days without any bug. Seems fine to me with psk2 and r30850. Thanks!

comment:18 Changed 6 years ago by hauke

  • Resolution set to fixed
  • Status changed from accepted to closed

The problem seems to be fixed now.

comment:19 Changed 4 years ago by jow

  • Milestone changed from Attitude Adjustment 12.09 to Barrier Breaker 14.07

Milestone Attitude Adjustment 12.09 deleted
