Modify

Opened 3 years ago

Last modified 18 months ago

#18616 reopened defect

NETDEV WATCHDOG: eth0 (ag71xx): transmit queue 0 timed out

Reported by: openwrt@… Owned by: developers
Priority: normal Milestone: Chaos Calmer 15.05
Component: kernel Version: Trunk
Keywords: Cc: nbd@…

Description

Hi,

Since kernel 3.14 is used in ar71xx I get this error with a NanoBeam M5 16 (hardware-compatible with NanoStation Loco M5 XW) when the device boots with the Ethernet cable connected:

[   44.050000] ------------[ cut here ]------------
[   44.050000] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:264 dev_watchdog+0x1e8/0x26c()
[   44.060000] NETDEV WATCHDOG: eth0 (ag71xx): transmit queue 0 timed out
[   44.070000] Modules linked in: ath9k ath9k_common pppoe ppp_async iptable_nat ath9k_hw ath pppox ppp_generic nf_nat_ipv4 nf_conntrack_ipv6 nf_conntrack_ipv4 mac80211 ipt_MASQUERADE cfg80211 xt_time xt_tcpudp xt_state xt_nat xt_multiport xt_mark xt_mac xt_limit xt_id xt_conntrack xt_comment xt_TCPMSS xt_REDIRECT xt_LOG xt_CT slhc nf_nat_irc nf_nat_ftp nf_nat nf_defrag_ipv6 nf_defrag_ipv4 nf_conntrack_rtcache nf_conntrack_irc nf_conntrack_ftp nf_conntrack iptable_raw iptable_mangle iptable_filter ipt_REJECT ip_tables crc_ccitt compat ip6t_REJECT ip6table_raw ip6table_mangle ip6table_filter ip6_tables x_tables ipv6 arc4 crypto_blkcipher ohci_platform ohci_hcd ehci_platform ehci_hcd gpio_button_hotplug usbcore nls_base usb_common
[   44.130000] CPU: 0 PID: 0 Comm: swapper Not tainted 3.14.26 #1
[   44.140000] Stack : 00000006 00000000 00000000 00000000 00000000 00000000 803ac8ce 00000032
[   44.140000]    80341408 00000005 802f70f4 8034182f 00000000 803a3b5c 80341408 00000005
[   44.140000]    803ac244 00000001 00000004 8029a1d0 00000003 801ff138 00000108 00000005
[   44.140000]    802fa2b8 80331c74 00000000 00000000 00000000 00000000 00000000 00000000
[   44.140000]    00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[   44.140000]    ...
[   44.170000] Call Trace:
[   44.180000] [<80241100>] show_stack+0x48/0x70
[   44.180000] [<802aa790>] warn_slowpath_common+0x84/0xb4
[   44.190000] [<802aa7ec>] warn_slowpath_fmt+0x2c/0x38
[   44.190000] [<8011102c>] dev_watchdog+0x1e8/0x26c
[   44.200000] [<800ee5f8>] call_timer_fn.isra.38+0x24/0x84
[   44.200000] [<8022a250>] run_timer_softirq+0x17c/0x1bc
[   44.210000] [<8008f5c8>] __do_softirq+0xf8/0x228
[   44.210000] [<801864f0>] irq_exit+0x54/0x70
[   44.210000] [<80060830>] ret_from_irq+0x0/0x4
[   44.220000] [<80210584>] r4k_wait_irqoff+0x18/0x1c
[   44.220000] [<80100584>] cpu_startup_entry+0xa4/0x104
[   44.230000] [<8034e94c>] start_kernel+0x3c8/0x3e0
[   44.230000] 
[   44.240000] ---[ end trace 2b19aa29f924c27c ]---

After that, there is no link on the Ethernet port anymore. However, if I boot the device without the Ethernet cable connected and plug it later, the Ethernet port works perfectly and the error does not appear at all.

Attachments (2)

dmesg.txt (12.9 KB) - added by openwrt@… 3 years ago.
dmesg loco xw r43757.txt (10.2 KB) - added by openwrt@… 3 years ago.
Kernel boot log nanostation loco XW r43757

Download all attachments as: .zip

Change History (17)

comment:1 Changed 3 years ago by nbd

please post a full kernel boot log

Changed 3 years ago by openwrt@…

comment:2 Changed 3 years ago by anonymous

Please see attachment. Lines 185 to 191 are then indefinitely repeated:

185 [ 46.060000] eth0: tx timeout
186 [ 46.060000] eth0: link down
187 [ 46.060000] br-lan: port 1(eth0) entered disabled state
188 [ 46.830000] eth0: link up (100Mbps/Full duplex)
189 [ 46.830000] br-lan: port 1(eth0) entered forwarding state
190 [ 46.840000] br-lan: port 1(eth0) entered forwarding state
191 [ 48.840000] br-lan: port 1(eth0) entered forwarding state

comment:3 Changed 3 years ago by nbd

please try r43757 - make sure you clean your kernel tree after updating.

comment:4 Changed 3 years ago by openwrt@…

Hi ndb,

I've just tried with r43757. The behaviour is the same (no link on Ethernet port) but now I don't see the NETDEV message in the kernel boot log (see attached file).

Thank you very much,

Roger

Changed 3 years ago by openwrt@…

Kernel boot log nanostation loco XW r43757

comment:5 Changed 3 years ago by anonymous

The Ubiquiti NanoBeam M5 300 (NBE-M5-300) is also affected.

comment:6 Changed 3 years ago by nbd

please try r43776

comment:7 Changed 3 years ago by openwrt@…

Hi ndb,

r43776 works great :)

Thank you very much!

comment:8 Changed 3 years ago by nbd

  • Resolution set to fixed
  • Status changed from new to closed

comment:9 Changed 2 years ago by uysiquebien

  • Resolution fixed deleted
  • Status changed from closed to reopened

This happens randomly on my TL-WR741ND which runs r47335. I can either reboot or log in from a wireless host and do "ifconfig eth0 down; ifconfig eth0 up" to be able to use my ethernet ports again.

The following lines are repeated forever:

Fri Nov  6 21:51:55 2015 kern.info kernel: [ 5117.370000] br-lan: port 1(eth0) entered forwarding state
Fri Nov  6 21:51:59 2015 kern.info kernel: [ 5121.020000] eth0: tx timeout
Fri Nov  6 21:51:59 2015 kern.info kernel: [ 5121.020000] eth0: link down
Fri Nov  6 21:51:59 2015 kern.info kernel: [ 5121.020000] br-lan: port 1(eth0) entered disabled state
Fri Nov  6 21:51:59 2015 daemon.notice netifd: Network device 'eth0' link is down
Fri Nov  6 21:51:59 2015 kern.info kernel: [ 5121.360000] eth0: link up (1000Mbps/Full duplex)
Fri Nov  6 21:51:59 2015 kern.info kernel: [ 5121.360000] br-lan: port 1(eth0) entered forwarding state
Fri Nov  6 21:51:59 2015 kern.info kernel: [ 5121.370000] br-lan: port 1(eth0) entered forwarding state
Fri Nov  6 21:51:59 2015 daemon.notice netifd: Network device 'eth0' link is up
Fri Nov  6 21:52:01 2015 kern.info kernel: [ 5123.370000] br-lan: port 1(eth0) entered forwarding state
Fri Nov  6 21:52:05 2015 kern.info kernel: [ 5127.020000] eth0: tx timeout
Fri Nov  6 21:52:05 2015 kern.info kernel: [ 5127.020000] eth0: link down
Fri Nov  6 21:52:05 2015 kern.info kernel: [ 5127.020000] br-lan: port 1(eth0) entered disabled state
Fri Nov  6 21:52:05 2015 daemon.notice netifd: Network device 'eth0' link is down
Fri Nov  6 21:52:05 2015 kern.info kernel: [ 5127.360000] eth0: link up (1000Mbps/Full duplex)
Fri Nov  6 21:52:05 2015 kern.info kernel: [ 5127.360000] br-lan: port 1(eth0) entered forwarding state
Fri Nov  6 21:52:05 2015 kern.info kernel: [ 5127.370000] br-lan: port 1(eth0) entered forwarding state
Fri Nov  6 21:52:05 2015 daemon.notice netifd: Network device 'eth0' link is up
Fri Nov  6 21:52:07 2015 kern.info kernel: [ 5129.370000] br-lan: port 1(eth0) entered forwarding state
Fri Nov  6 21:52:07 2015 kern.info kernel: [ 5129.580000] eth0: link down
Fri Nov  6 21:52:07 2015 kern.info kernel: [ 5129.580000] br-lan: port 1(eth0) entered disabled state
Fri Nov  6 21:52:07 2015 daemon.notice netifd: Network device 'eth0' link is down
Fri Nov  6 21:52:10 2015 kern.info kernel: [ 5132.500000] eth0: link up (1000Mbps/Full duplex)
Fri Nov  6 21:52:10 2015 kern.info kernel: [ 5132.500000] br-lan: port 1(eth0) entered forwarding state
Fri Nov  6 21:52:10 2015 kern.info kernel: [ 5132.510000] br-lan: port 1(eth0) entered forwarding state
Fri Nov  6 21:52:10 2015 daemon.notice netifd: Network device 'eth0' link is up

And the same warning:

Fri Nov  6 21:51:53 2015 kern.warn kernel: [ 5115.020000] ------------[ cut here ]------------
Fri Nov  6 21:51:53 2015 kern.warn kernel: [ 5115.020000] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:311 dev_watchdog+0x1d8/0x25c()
Fri Nov  6 21:51:53 2015 kern.info kernel: [ 5115.030000] NETDEV WATCHDOG: eth0 (ag71xx): transmit queue 0 timed out
Fri Nov  6 21:51:53 2015 kern.warn kernel: [ 5115.030000] Modules linked in: imq ath9k ath9k_common pppoe ppp_async iptable_nat ath9k_hw ath pptp pppox ppp_mppe ppp_generic nf_nat_ipv4 nf_conntrack_ipv4 mac80211 ipt_REJECT ipt_MASQUERADE ebtable_nat ebtable_filter ebtable_broute cfg80211 xt_time xt_tcpudp xt_tcpmss xt_string xt_statistic xt_state xt_recent xt_quota xt_pkttype xt_physdev xt_owner xt_nat xt_multiport xt_mark xt_mac xt_limit xt_length xt_iprange xt_id xt_hl xt_helper xt_ecn xt_dscp xt_conntraFri Nov  6 21:51:53 2015 kern.warn kernel: [ 5115.160000] CPU: 0 PID: 0 Comm: swapper Not tainted 3.18.23 #1
Fri Nov  6 21:51:53 2015 kern.warn kernel: [ 5115.170000] Stack : 00000000 00000000 00000000 00000000 803b4292 00000032 00000000 8025056c
Fri Nov  6 21:51:53 2015 kern.warn kernel: [ 5115.170000] 	  00000001 000000de 80308100 8035e6e3 00000000 803b34d0 8035ea18 000000de
Fri Nov  6 21:51:53 2015 kern.warn kernel: [ 5115.170000] 	  803b3918 00000001 00000004 800a35cc 00000003 800803bc 00000137 000000de
Fri Nov  6 21:51:53 2015 kern.warn kernel: [ 5115.170000] 	  8030b6f8 80359c6c 00000000 00000000 00000000 00000000 00000000 00000000
Fri Nov  6 21:51:53 2015 kern.warn kernel: [ 5115.170000] 	  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Fri Nov  6 21:51:53 2015 kern.warn kernel: [ 5115.170000] 	  ...
Fri Nov  6 21:51:53 2015 kern.warn kernel: [ 5115.200000] Call Trace:
Fri Nov  6 21:51:53 2015 kern.warn kernel: [ 5115.200000] [<80070fe0>] show_stack+0x50/0x84
Fri Nov  6 21:51:53 2015 kern.warn kernel: [ 5115.210000] [<800804c4>] warn_slowpath_common+0x84/0xb4
Fri Nov  6 21:51:53 2015 kern.warn kernel: [ 5115.210000] [<80080520>] warn_slowpath_fmt+0x2c/0x38
Fri Nov  6 21:51:53 2015 kern.warn kernel: [ 5115.220000] [<8025056c>] dev_watchdog+0x1d8/0x25c
Fri Nov  6 21:51:53 2015 kern.warn kernel: [ 5115.220000] [<800ab068>] call_timer_fn.isra.38+0x24/0x84
Fri Nov  6 21:51:53 2015 kern.warn kernel: [ 5115.230000] [<800ab880>] run_timer_softirq+0x1bc/0x1f8
Fri Nov  6 21:51:53 2015 kern.warn kernel: [ 5115.230000] [<800829c4>] __do_softirq+0xf8/0x230
Fri Nov  6 21:51:53 2015 kern.warn kernel: [ 5115.240000] [<80082d30>] irq_exit+0x54/0x70
Fri Nov  6 21:51:53 2015 kern.warn kernel: [ 5115.240000] [<80060830>] ret_from_irq+0x0/0x4
Fri Nov  6 21:51:53 2015 kern.warn kernel: [ 5115.250000] [<80060a80>] __r4k_wait+0x20/0x40
Fri Nov  6 21:51:53 2015 kern.warn kernel: [ 5115.250000] [<8009f564>] cpu_startup_entry+0xf0/0x13c
Fri Nov  6 21:51:53 2015 kern.warn kernel: [ 5115.260000] [<80377b5c>] start_kernel+0x42c/0x444
Fri Nov  6 21:51:53 2015 kern.warn kernel: [ 5115.260000] 
Fri Nov  6 21:51:53 2015 kern.warn kernel: [ 5115.260000] ---[ end trace 3b5eef35d2354a9a ]---

comment:10 Changed 2 years ago by anonymous

Could this be the same as #18922?

comment:11 Changed 2 years ago by psyke83

The original bug was probably different, but the issue experienced by uysiquebien should definitely be fixed, yes. There was a flaw in the driver: when the tx queue became full, it stopped the transmit queue, but neglected to subsequently wake the queue, causing tx timeouts until the interface was brought down and back up.

The fix should affect all SOCs using the ar71xx drivers, so a lot of open bug reports mentioning "tx timeout" or "transmit queue 0 timed out" should be solved by this (some of them 4+ years old).

comment:12 Changed 2 years ago by nbd

  • Resolution set to fixed
  • Status changed from reopened to closed

fixed in r47892, r47895

comment:13 Changed 18 months ago by nw@…

  • Resolution fixed deleted
  • Status changed from closed to reopened

I have build an image from the latest git tree. I am on:
Linux OpenWrt 4.4.14 #2 Tue Aug 9 06:01:58 UTC 2016 mips GNU/Linux
and face the same behaviour as described above:

[ 39.057539] ------------[ cut here ]------------
[ 39.060746] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:306 dev_watchdog+0x1dc/0x2
[ 39.069154] NETDEV WATCHDOG: eth0 (ag71xx): transmit queue 0 timed out
[ 39.075639] Modules linked in: ath9k ath9k_common pppoe ppp_async iptable_nat ath9k_hw a
[ 39.148741] CPU: 0 PID: 0 Comm: swapper Not tainted 4.4.14 #2
[ 39.154460] Stack : 803ba184 00000000 00000001 80410000 804010d8 80400d63 8039b8b4 00000
[ 39.154460] 80463790 803ff3fc 00000200 00100000 00000100 800a678c 803a0ef8 80400000
[ 39.154460] 00000003 803ff3fc 8039f304 803fbc2c 00000100 800a4758 00000006 00000000
[ 39.154460] 00000000 801f2b00 00000000 00000000 00000000 00000000 00000000 00000000
[ 39.154460] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 39.154460] ...
[ 39.189874] Call Trace:
[ 39.192324] [<8007197c>] show_stack+0x50/0x84
[ 39.196664] [<80081580>] warn_slowpath_common+0xa0/0xd0
[ 39.201880] [<800815dc>] warn_slowpath_fmt+0x2c/0x38
[ 39.206814] [<80280418>] dev_watchdog+0x1dc/0x260
[ 39.211531] [<800af8a0>] call_timer_fn.isra.5+0x24/0x80
[ 39.216710] [<800afb1c>] run_timer_softirq+0x1ac/0x1f4
[ 39.221850] [<800839dc>] do_softirq+0x164/0x290
[ 39.226538] [<8006a5d0>] plat_irq_dispatch+0xbc/0xf4
[ 39.231470]
[ 39.232933] ---[ end trace 1fc442cc2d86308f ]---

comment:14 Changed 18 months ago by anonymous

As mentioned in the first message of this thread, this also is true here:
If I boot the device without the Ethernet cable connected and plug it later, the Ethernet port works perfectly and the error does not appear at all. My device is a TP-Link TL-WR710N v2.1

comment:15 Changed 18 months ago by ufo@…

with lede r913 (also kernel 4.4.14) on a tplink 841 i also had that problem. reboot wasnt solving that problem, but on the next day the device was working properly

[ 43.060948] ------------[ cut here ]------------
[ 43.065773] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:306 dev_watchdog+0x1dc/0x260()
[ 43.074503] NETDEV WATCHDOG: eth0 (ag71xx): transmit queue 0 timed out
[ 43.081245] Modules linked in: iptable_nat ip6table_nat ath9k nf_nat_ipv6 nf_nat_ipv4 nf_conntrack_ipv6 nf_conntrack_ipv4 ipt_REJECT ipt_MASQUERADE ath9k_common xt_time xt_tcpudp xt_state xt_nat xt_multiport xt_mark xt_mac xt_limit xt_id xt_conntrack xt_comment xt_TCPMSS xt_REDIRECT xt_LOG xt_CT nf_reject_ipv4 nf_nat_redirect nf_nat_masquerade_ipv4 nf_log_ipv4 nf_defrag_ipv6 nf_defrag_ipv4 nf_conntrack_rtcache mac80211 iptable_raw iptable_mangle iptable_filter ip_tables ath9k_hw ath batman_adv libcrc32c cfg80211 compat ip6t_NPT ip6t_MASQUERADE nf_nat_masquerade_ipv6 nf_nat nf_conntrack ip6t_REJECT nf_reject_ipv6 nf_log_ipv6 nf_log_common ip6table_raw ip6table_mangle ip6table_filter ip6_tables x_tables gpio_button_hotplug crc16 crc32c_generic crypto_hash
[ 43.150705] CPU: 0 PID: 0 Comm: swapper Not tainted 4.4.14 #3
[ 43.156644] Stack : 803e06c4 00000000 00000001 80430000 8042b2d8 8042af63 803c1db4 00000000
[ 43.156644] 80493790 804295dc 00000200 00100000 00000100 800a6834 803c73f0 80420000
[ 43.156644] 00000003 804295dc 803c57fc 80425c2c 00000100 800a4800 00000006 00000000
[ 43.156644] 00000000 801f5300 00000000 00000000 00000000 00000000 00000000 00000000
[ 43.156644] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 43.156644] ...
[ 43.193676] Call Trace:
[ 43.196219] [<80071a24>] show_stack+0x50/0x84
[ 43.200735] [<80081628>] warn_slowpath_common+0xa0/0xd0
[ 43.206148] [<80081684>] warn_slowpath_fmt+0x2c/0x38
[ 43.211288] [<8028b884>] dev_watchdog+0x1dc/0x260
[ 43.216172] [<800af948>] call_timer_fn.isra.5+0x24/0x80
[ 43.221585] [<800afbc4>] run_timer_softirq+0x1ac/0x1f4
[ 43.226903] [<80083a84>] do_softirq+0x164/0x290
[ 43.231792] [<8006a678>] plat_irq_dispatch+0xbc/0xf4
[ 43.236917]
[ 43.238460] ---[ end trace f8acfde64a765533 ]---
[ 43.243236] eth0: tx timeout

https://plan.leipzig.freifunk.net/issues/311

Add Comment

Modify Ticket

Action
as reopened .
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.