Opened 5 years ago

Last modified 21 months ago

#12121 reopened defect

Watchdog on TP-Link TL-WR1043ND can cause freeze instead of reset after some uptime

Reported by: Ivan Mironov <mironov.ivan@…> Owned by: developers
Priority: response-needed Milestone:
Component: kernel Version: 10.03.1
Keywords: watchdog, tl-wr1043nd, ar71xx Cc:

Description

I tried to test the hardware watchdog timer on a TL-WR1043ND and found that it does not always work properly. For testing I just send SIGSTOP to the userspace watchdog daemon. Shortly after boot it works as expected: after a few seconds the system simply reboots. But after 10 hours or so of uptime, triggering the watchdog leads to a complete system freeze. All LEDs except "power" turn off and nothing more happens. There is also no output on the serial port.

For installation I used the stable precompiled image.

I don't know exactly which version of the board I have. The sticker on the RJ45 jack block says ver:1.0, but I had problems with a disabled WAN port, as if it were version 1.8 (according to the wiki).
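
For anyone who wants to repeat the test, here is a minimal sketch of the SIGSTOP step. It assumes a Linux /proc filesystem. The function names freeze_pids and proc_state are made up for illustration and are not part of OpenWrt.

```shell
#!/bin/sh
# Sketch only: freeze_pids and proc_state are hypothetical helper names.

# Send SIGSTOP to each given PID. A stopped process keeps its open files
# but no longer runs, so a watchdog daemon stops feeding the timer.
freeze_pids() {
    for pid in "$@"; do
        kill -STOP "$pid" || return 1
    done
}

# Print the kernel's state letter for a PID ("T" means stopped).
proc_state() {
    # The state is the first field after the ")" that closes the command name.
    sed 's/^.*) //' "/proc/$1/stat" | cut -d' ' -f1
}
```

On the router this could be called as freeze_pids $(pidof watchdog), assuming the daemon's process is named watchdog and pidof is available. proc_state should then print T for that PID until the hardware timer fires.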

Attachments (1)

successfull-boot-log.txt (11.1 KB) - added by Ivan Mironov <mironov.ivan@…> 5 years ago.

Change History (15)

Changed 5 years ago by Ivan Mironov <mironov.ivan@…>

comment:1 Changed 5 years ago by jow

  • Priority changed from normal to response-needed

Please try 12.09beta and see if the issue persists in current versions.

comment:2 Changed 5 years ago by blogic

  • Resolution set to worksforme
  • Status changed from new to closed

root@OpenWrt:/# uptime

12:00:33 up 12:00, load average: 0.00, 0.01, 0.03

root@OpenWrt:/# killall -9 watchdog
[43240.470000] ath79-wdt: device closed unexpectedly, watchdog timer will not stop!
root@OpenWrt:/#
root@OpenWrt:/#

U-Boot 1.1.4 (Nov 17 2009 - 11:56:26)

AP83 (ar9100) U-boot 0.0.11
DRAM:
sri
32 MB
id read 0x100000ff
flash size 8MB, sector count = 128
Flash: 8 MB
Using default environment

comment:3 Changed 5 years ago by Jérôme Poulin <jeromepoulin@…>

I just encountered this problem after an out-of-memory issue. I will diagnose the problem shortly and probably reopen the ticket. The watchdog triggers, then all LEDs turn off except Power. AA 12.09 beta2/final.

comment:4 Changed 5 years ago by lz

I have the same problem with all OpenWrt versions (12.09 final and latest trunk).
Sometimes the reboot is OK, but usually it causes a complete freeze.

comment:5 Changed 4 years ago by anonymous

Hi,

I'm kind of having this issue, but have trouble deterministically reproducing it.

I ssh and issue the command:

# pgrep watchdog | xargs kill -SIGSTOP

Then, sure enough, after some seconds the router reacts:

  • I am kicked out of ssh and can't reconnect (connection refused)
  • I cannot connect to port 80 (LuCI) anymore
  • the USB LED stops blinking its "OK" Morse code

BUT: the box does not reboot, it just hangs in this state. Or it reboots; I'm not sure why it sometimes goes one way and sometimes the other.

I have tried to look at the process list before crash via:

# while true; do ps; done

Sometimes I saw the following process before being kicked out:

{rcS} /bin/sh /etc/init.d/rcS K shutdown

Shouldn't that be reboot instead of shutdown?

What I have installed:

firmware: openwrt-ar71xx-generic-tl-wr1043nd-v1-squashfs-factory.bin
my changes:

removed dnsmasq,
installed the package for morse-led,
let the USB LED morse "OK",
put firewall rules in /etc/firewall.user,
put wifi in monitor mode

Hardware:
WR-TL1043ND(DE) v1.2
no usb-device plugged in, eth-cable on WAN-port.

# uname -a
Linux OpenWRT 3.3.8 #1 Sat Mar 23 16:49:30 UTC 2013 mips GNU/Linux

# cat /etc/openwrt_release
DISTRIB_ID="OpenWrt"
DISTRIB_RELEASE="12.09"
DISTRIB_REVISION="r36088"
DISTRIB_CODENAME="attitude_adjustment"
DISTRIB_TARGET="ar71xx/generic"
DISTRIB_DESCRIPTION="OpenWrt Attitude Adjustment 12.09"

# opkg list
aircrack-ng - 1.1-3
base-files - 117-r36088
busybox - 1.19.4-6
dropbear - 2011.54-2
firewall - 2-55.1
hotplug2 - 1.0-beta-4
iptables - 1.4.10-4
iw - 3.6-1
jshn - 2013-01-29-0bc317aa4d9af44806c28ca286d79a8b5a92b2b8
kernel - 3.3.8-1-d6597ebf6203328d3519ea3c3371a493
kmod-ath - 3.3.8+2012-09-07-3
kmod-ath9k - 3.3.8+2012-09-07-3
kmod-ath9k-common - 3.3.8+2012-09-07-3
kmod-cfg80211 - 3.3.8+2012-09-07-3
kmod-crypto-aes - 3.3.8-1
kmod-crypto-arc4 - 3.3.8-1
kmod-crypto-core - 3.3.8-1
kmod-gpio-button-hotplug - 3.3.8-1
kmod-ipt-conntrack - 3.3.8-1
kmod-ipt-core - 3.3.8-1
kmod-ipt-nat - 3.3.8-1
kmod-ipt-nathelper - 3.3.8-1
kmod-leds-gpio - 3.3.8-1
kmod-ledtrig-default-on - 3.3.8-1
kmod-ledtrig-morse - 3.3.8-1
kmod-ledtrig-netdev - 3.3.8-1
kmod-ledtrig-timer - 3.3.8-1
kmod-ledtrig-usbdev - 3.3.8-1
kmod-lib-crc-ccitt - 3.3.8-1
kmod-mac80211 - 3.3.8+2012-09-07-3
kmod-nls-base - 3.3.8-1
kmod-ppp - 3.3.8-1
kmod-pppoe - 3.3.8-1
kmod-pppox - 3.3.8-1
kmod-usb-core - 3.3.8-1
kmod-usb-ohci - 3.3.8-1
kmod-usb2 - 3.3.8-1
kmod-wdt-ath79 - 3.3.8-1
libblobmsg-json - 2013-01-29-0bc317aa4d9af44806c28ca286d79a8b5a92b2b8
libc - 0.9.33.2-1
libgcc - 4.6-linaro-1
libip4tc - 1.4.10-4
libiwinfo - 36
libiwinfo-lua - 36
libjson - 0.9-2
liblua - 5.1.4-8
libncurses - 5.7-5
libnl-tiny - 0.1-3
libopenssl - 1.0.1e-1
libpcap - 1.1.1-2
libpopt - 1.7-5
libpthread - 0.9.33.2-1
libubox - 2013-01-29-0bc317aa4d9af44806c28ca286d79a8b5a92b2b8
libubus - 2013-01-13-bf566871bd6a633e4504c60c6fc55b2a97305a50
libubus-lua - 2013-01-13-bf566871bd6a633e4504c60c6fc55b2a97305a50
libuci - 2013-01-04.1-1
libuci-lua - 2013-01-04.1-1
libxtables - 1.4.10-4
lua - 5.1.4-8
luci - 0.11.1-1
luci-app-firewall - 0.11.1-1
luci-i18n-english - 0.11.1-1
luci-lib-core - 0.11.1-1
luci-lib-ipkg - 0.11.1-1
luci-lib-nixio - 0.11.1-1
luci-lib-sys - 0.11.1-1
luci-lib-web - 0.11.1-1
luci-mod-admin-core - 0.11.1-1
luci-mod-admin-full - 0.11.1-1
luci-proto-core - 0.11.1-1
luci-proto-ppp - 0.11.1-1
luci-sgi-cgi - 0.11.1-1
luci-theme-base - 0.11.1-1
luci-theme-openwrt - 0.11.1-1
mtd - 18.1
netifd - 2013-01-29.2-4bb99d4eb462776336928392010b372236ac3c93
openssh-client - 6.1p1-1
opkg - 618-3
ppp - 2.4.5-8
ppp-mod-pppoe - 2.4.5-8
rsync - 3.0.9-1
swconfig - 10
terminfo - 5.7-5
uboot-envtools - 2012.04.01-1
ubus - 2013-01-13-bf566871bd6a633e4504c60c6fc55b2a97305a50
ubusd - 2013-01-13-bf566871bd6a633e4504c60c6fc55b2a97305a50
uci - 2013-01-04.1-1
uhttpd - 2012-10-30-e57bf6d8bfa465a50eea2c30269acdfe751a46fd
wireless-tools - 29-5
wpad-mini - 20120910-1
zlib - 1.2.7-1

comment:6 Changed 4 years ago by max@…

  • Resolution worksforme deleted
  • Status changed from closed to reopened

I caught the error again, this time without foolishly discarding the console so I can copy&paste logread and ps.

I'm not sure what to make of the logread lines...

# logread -f

[...too long ago...]
Jan 2 22:34:00 OpenWRT daemon.info init: starting pid 27729, tty : '/etc/init.d/rcS K shutdown'
Jan 2 22:34:00 OpenWRT authpriv.info dropbear[1272]: Premature exit: Terminated by signal
Jan 2 22:34:01 OpenWRT daemon.notice netifd: Interface 'lan' is now down
Jan 2 22:34:01 OpenWRT kern.info kernel: [ 1541.320000] br-lan: port 1(eth0.1) entered disabled state
Jan 2 22:34:01 OpenWRT kern.info kernel: [ 1541.340000] device eth0.1 left promiscuous mode
Jan 2 22:34:01 OpenWRT kern.info kernel: [ 1541.340000] device eth0 left promiscuous mode
Jan 2 22:34:01 OpenWRT kern.info kernel: [ 1541.340000] br-lan: port 1(eth0.1) entered disabled state
Jan 2 22:34:01 OpenWRT daemon.notice netifd: Interface 'loopback' is now down
Jan 2 22:34:01 OpenWRT daemon.notice netifd: Interface 'wifi0' is now down

# while true; do ps; done

[...]
27729 root 1536 S {rcS} /bin/sh /etc/init.d/rcS K shutdown
27731 root 1540 S {rcS} /bin/sh /etc/init.d/rcS K shutdown
27732 root 1496 S logger -s -p 6 -t sysinit
27738 root 1496 R N ps
27743 root 1532 S {hotplug-call} /bin/sh /sbin/hotplug-call iface
27745 root 852 S /sbin/hotplug2 --override --persistent --set-rules-file /etc/hotplug2.rules --set-coldplug-cmd /sbin/udevtrigger --max-children 1

comment:7 Changed 4 years ago by bittorf@…

I can still see the complete freeze (only the power LED is on) on trunk r39203, and I am attached via a serial cable. Here it most likely happens during high load, but also (though seldom) under low load. On the console I can see nothing; it just stops working. Here is a log from the serial console; every 60 seconds HOSTNAME and /proc/uptime + /proc/loadavg are dumped.

http://intercity-vpn.de/files/openwrt/scrollback_buffer_tplink-1043nd_only_power_led_on-reduced.txt

I will do the next console dump with output every 5 seconds and report.
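
A periodic dump like the one described above could look roughly like this. This is only a sketch, not the script actually used for the log: the function names status_line and status_loop are hypothetical, and the /proc paths assume a Linux system.

```shell
#!/bin/sh
# Sketch only: status_line and status_loop are hypothetical names.

# Print one line with hostname, uptime in seconds and the three load averages.
# Optional arguments override the files to read (defaults are the /proc files).
status_line() {
    uptime_file="${1:-/proc/uptime}"
    loadavg_file="${2:-/proc/loadavg}"
    host="$(cat /proc/sys/kernel/hostname 2>/dev/null || echo unknown)"
    up="$(cut -d' ' -f1 "$uptime_file")"
    load="$(cut -d' ' -f1-3 "$loadavg_file")"
    echo "$host uptime=$up load=$load"
}

# Print a status line every N seconds (default 5) until interrupted.
status_loop() {
    interval="${1:-5}"
    while true; do
        status_line
        sleep "$interval"
    done
}
```

It could be started with status_loop 5 > /dev/console &, assuming /dev/console is the serial console on the device. The last line printed before a freeze then shows the uptime and load at that moment.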

comment:8 follow-up: Changed 3 years ago by wycole

The same problem has been occurring in TP-Link MR3020.

BusyBox v1.19.4 (2013-03-14 11:28:31 UTC) multi-call binary.
Linux OPENWRT 3.3.8 #1 Sat Mar 23 16:49:30 UTC 2013 mips GNU/Linux

Sep 21 18:00:01 OPENWRT daemon.info init: starting pid 1315, tty : '/etc/init.d/rcS K shutdown'
Sep 21 18:00:01 OPENWRT authpriv.info dropbear[1069]: Premature exit: Terminated by signal

Please advise if there is another workaround. It created the illusion of an unstable network.

comment:9 in reply to: ↑ 8 Changed 3 years ago by bittorf@…

Replying to wycole:

The same problem has been occurring in TP-Link MR3020.

BusyBox v1.19.4 (2013-03-14 11:28:31 UTC) multi-call binary.
Linux OPENWRT 3.3.8 #1 Sat Mar 23 16:49:30 UTC 2013 mips GNU/Linux

Please work with the latest trunk and report again. We have had a lot of fixes since 2013-03-14.

comment:10 Changed 3 years ago by wycole

Noted with thanks.

comment:11 follow-up: Changed 3 years ago by t.hermans@…

Has anyone still encountered this issue? We are currently having the same issue with the following OpenWRT version/revision:

  • BusyBox v1.19.4 (2014-08-25 16:36:30 CEST)
  • Bleeding Edge, r1254

We are using the Carambola2 module with an AR9331 SoC.
In our case a watchdog trigger reboots the Carambola after which serial communication stops and only the power led is turned on. The device freezes and no bootloader debug messages are shown on reboot.

comment:12 follow-up: Changed 2 years ago by thomas@…

Hello everyone,
hello Ivan,

Is there a root cause or solution available yet for this kind of problem?

We're using OpenWrt 10.03 in combination with an AR9331, which communicates over SPI with a daughterboard based on an Atmel XMEGA256. We found that there are no freezes if we simply trigger a watchdog without SPI communication. If the AR9331 communicates with the daughterboard, there are freezes from time to time.

I would appreciate any help or answer.

Thank you,
Thomas

comment:13 in reply to: ↑ 12 Changed 2 years ago by bittorf@…

We're using OpenWrt 10.03 in combination with an AR9331, which communicates over SPI with a daughterboard based on an Atmel XMEGA256. We found that there are no freezes if we simply trigger a watchdog without SPI communication. If the AR9331 communicates with the daughterboard, there are freezes from time to time.

Can you please check whether you also see the issue with recent trunk?

comment:14 in reply to: ↑ 11 Changed 21 months ago by anonymous

Replying to t.hermans@…:

Has anyone still encountered this issue?

Yes. TL-WR1043ND V1 OpenWrt Chaos Calmer 15.05.1 / LuCI 15.05-149-g0d8bbd2 Release (git-15.363.78009-956be55)
