Opened 10 years ago

Closed 9 years ago

Last modified 4 years ago

#3781 closed defect (fixed)

[r11853] repeated kernel panic on WGT634U

Reported by: gwirth79@…
Owned by: developers
Priority: high
Milestone: Barrier Breaker 14.07
Component: kernel
Version:
Keywords: kernel panic
Cc:

Description

Latest Kamikaze revision 11853 repeatedly panics under network load.

In testing the latest build on a Netgear WGT634U, kernel 2.6.25.10 panics within three minutes under heavy LAN->WAN network load. I have two computers running iperf <http://iperf.sourceforge.net> to generate the traffic. The maximum bitrate seen with the latest kernel is 8 Mbps. Earlier builds, such as r11844, reached 18 Mbps but still panicked.

To minimize possible variables, I removed the radio and the madwifi drivers. The only modules loaded were:

root@OpenWrt:/# lsmod
Module                  Size  Used by    Not tainted
nf_nat_tftp              448  0
nf_conntrack_tftp       2448  1 nf_nat_tftp
nf_nat_irc               928  0
nf_conntrack_irc        2768  1 nf_nat_irc
nf_nat_ftp              1440  0
nf_conntrack_ftp        5120  1 nf_nat_ftp
ppp_async               9728  0
ppp_generic            20096  1 ppp_async
slhc                    5248  1 ppp_generic
crc_ccitt                992  1 ppp_async
switch_robo             4224  0
switch_core             5248  1 switch_robo

I have tried this on three different devices and they all panic. The cause of the panic does not seem to be consistent. I have attached the panic output and my .config file.

Attachments (6)

kernel-panic.txt (5.6 KB) - added by gwirth79@… 10 years ago.
kernel panic info from serial console
gus.config-11853 (39.6 KB) - added by gwirth79@… 10 years ago.
config file for kernels that panic
wgt634u_skb_panics.txt (7.2 KB) - added by anonymous 9 years ago.
panics with kallsyms that look like gwirth79's (r12771)
asusrouter_kernel_panic.txt (2.0 KB) - added by tkoecker@… 8 years ago.
Asus WL-700gE Kernel Panic
asusrouter_kernel_panic_2.txt (2.0 KB) - added by tkoecker@… 8 years ago.
Asus WL-700gE kernel panic 2
asusrouter_kernel_panic_3.txt (2.1 KB) - added by tkoecker@… 8 years ago.
Asus WL-700gE kernel panic 3

Change History (15)

Changed 10 years ago by gwirth79@…

kernel panic info from serial console

Changed 10 years ago by gwirth79@…

config file for kernels that panic

comment:1 Changed 10 years ago by gwirth79@…

Some additional testing seems to indicate the problem is in the packet forwarding. I ran a test like this:

On the server (WAN) side I did:

# nc -l 5002 > /dev/null

Then on the WGT634U I ran:

# cat /dev/zero | nc 192.168.1.200 5002

This ran at full speed for about two hours without a problem.
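For anyone who wants to repeat this kind of test without nc, the same idea can be sketched in Python: one side accepts a connection and throws the data away, the other side sends zero bytes. This is only an illustration, not what was run in the test above. The function names are made up for this sketch, the sender is bounded to a fixed byte count instead of running forever, and the demo runs over loopback, so it does not exercise the router at all. To approximate the test above, the sink would run on the WAN-side machine and the sender on the device under test.

```python
import socket
import threading


def run_sink(server_sock, result):
    """Accept one connection and discard everything received,
    roughly like `nc -l PORT > /dev/null`."""
    conn, _ = server_sock.accept()
    total = 0
    with conn:
        while True:
            chunk = conn.recv(65536)
            if not chunk:  # peer closed the connection
                break
            total += len(chunk)
    result["bytes"] = total


def send_zeros(host, port, total_bytes, chunk_size=65536):
    """Send total_bytes zero bytes to host:port, roughly like
    `cat /dev/zero | nc HOST PORT` but with a fixed length."""
    payload = bytes(chunk_size)  # a buffer of zero bytes
    sent = 0
    with socket.create_connection((host, port)) as sock:
        while sent < total_bytes:
            n = min(chunk_size, total_bytes - sent)
            sock.sendall(payload[:n])
            sent += n
    return sent


def loopback_demo(total_bytes=1_000_000):
    """Run sink and sender on 127.0.0.1 and return (sent, received)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    result = {}
    sink = threading.Thread(target=run_sink, args=(server, result))
    sink.start()
    sent = send_zeros("127.0.0.1", port, total_bytes)
    sink.join()
    server.close()
    return sent, result["bytes"]


if __name__ == "__main__":
    print(loopback_demo())
```

Note that traffic generated this way starts or ends on the machine running the script, so like the nc test it does not cover the forwarding path between LAN and WAN.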

For some reason the BusyBox nc won't let me connect to the LAN side. It keeps reporting that there is no route, even though I can ping and ssh to machines on the LAN side. But that's a different bug report.

comment:2 Changed 9 years ago by jhansen@…

This happens here as well. If I set up a netconsole on a LAN device, and have it send the netconsole packets through the WGT634U to a device on the WAN side (i.e. it's just forwarding packets), it will appear to lock up, then eventually crash with a similar dump. I wonder if it is eventually running out of memory and not handling the ENOMEMs properly, because it does take some time before it completely crashes.
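One way the out-of-memory guess above could be checked is to capture /proc/meminfo from the serial console at intervals while the router is forwarding, and compare MemFree between captures. The following is a rough sketch of the comparison step in Python, assuming the captures are processed on a separate host. The function names are hypothetical and the sample numbers are invented for illustration; they are not measurements from a WGT634U.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: number}.
    Most fields are reported in kB."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def memfree_change_kb(snapshots):
    """Return last MemFree minus first MemFree across the snapshots.
    A negative result means free memory shrank."""
    values = [parse_meminfo(s)["MemFree"] for s in snapshots]
    return values[-1] - values[0]


# Invented sample captures, for illustration only.
before = "MemTotal:        30000 kB\nMemFree:         12000 kB\n"
after = "MemTotal:        30000 kB\nMemFree:          1500 kB\n"

if __name__ == "__main__":
    print(memfree_change_kb([before, after]))
```

A steadily falling MemFree under forwarding load would be consistent with the guess, but would not by itself prove that allocation failures cause the panic.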

Changed 9 years ago by anonymous

panics with kallsyms that look like gwirth79's (r12771)

comment:3 Changed 9 years ago by florian

You guys have been installing so many modules that it becomes tricky to determine which one triggers the bug.

comment:4 Changed 9 years ago by lordjoe@…

Fair enough - I will try to reproduce it with a minimal set of modules. (I am the anonymous who posted the logs with kallsyms enabled).

comment:5 Changed 9 years ago by florian

  • Resolution set to fixed
  • Status changed from new to closed

Assuming this is now fixed with 2.6.28.

comment:6 Changed 9 years ago by gwirth79@…

I tested with SVN 15309 and ran iperf for 36 hours with no failures. Throughput was 24 Mbit/s, which is also an improvement. Looks like it is finally fixed.

comment:7 Changed 8 years ago by tkoecker@…

I have the same problem with a revision 18765 kernel (2.6.30.10) on my Asus WL-700gE (I have a Debian userspace, but the kernel is from OpenWrt).

This also happened with older revisions from a few weeks ago. The problems started when I moved my router to a different network, where I now use a different configuration: two VLANs (one internal, one external), with the internal VLAN merged with the WLAN through a bridge device, and masquerading from the internal network to the external one.
Previously there was no separation between the internal and external networks (no VLANs) and no bridge device; the WLAN was an internal network that used masquerading to reach the external network.

When there is a lot of traffic, I get kernel panics quite similar to those described earlier in this ticket. When I start a large file transfer between the (wired) internal network and the external network, they can occur anywhere from a few seconds to several hours into the transfer. They also sometimes happen at boot time, when the network interfaces are configured.

I'm attaching a few of the stack traces I got through the serial console.

Changed 8 years ago by tkoecker@…

Asus WL-700gE Kernel Panic

Changed 8 years ago by tkoecker@…

Asus WL-700gE kernel panic 2

Changed 8 years ago by tkoecker@…

Asus WL-700gE kernel panic 3

comment:8 Changed 8 years ago by tkoecker@…

Could someone with the rights to reopen this bug please do so, so that the report does not get lost? Or should I create a new report for the problem?

comment:9 Changed 4 years ago by jow

  • Milestone changed from Attitude Adjustment 12.09 to Barrier Breaker 14.07

Milestone Attitude Adjustment 12.09 deleted
