Modify

Opened 3 years ago

Last modified 3 years ago

#19083 new defect

uhttpd / Luci : New stderr logging reveals errors at refresh of overview page

Reported by: hnyman Owned by:
Priority: normal Milestone: Chaos Calmer 15.05
Component: luci Version: Trunk
Keywords: uhttpd luci Cc: hannu.nyman@…

Description

Recent changes (r44542 - r44548) have modified the logging available to uhttpd and Luci now generates steadily errors to the systems log just by being on the Overview page.

It looks like the automatically refreshed overview page generates errors at every refresh every 5 seconds.

Sat Feb 28 15:48:07 2015 daemon.err uhttpd[2172]: Collected errors:
Sat Feb 28 15:48:07 2015 daemon.err uhttpd[2172]:  * opkg_conf_load: Could not lock /var/lock/opkg.lock: Resource temporarily unavailable.
Sat Feb 28 15:48:07 2015 daemon.err uhttpd[2172]: Collected errors:
Sat Feb 28 15:48:07 2015 daemon.err uhttpd[2172]:  * opkg_conf_load: Could not lock /var/lock/opkg.lock: Resource temporarily unavailable.
Sat Feb 28 15:48:08 2015 daemon.err uhttpd[2172]: iptables: No chain/target/match by that name.

ar71xx WDNR3700 / rev44570 / clean build

Below is a longer extract showing manual logger lines that indicate the opening of the page, and then errors every 5 seconds caused by the automatic refresh.

Sat Feb 28 15:47:49 2015 user.notice root: Try Luci overview page
Sat Feb 28 15:47:52 2015 daemon.err uhttpd[2172]: Collected errors:
Sat Feb 28 15:47:52 2015 daemon.err uhttpd[2172]:  * opkg_conf_load: Could not lock /var/lock/opkg.lock: Resource temporarily unavailable.
Sat Feb 28 15:47:57 2015 daemon.err uhttpd[2172]: Collected errors:
Sat Feb 28 15:47:57 2015 daemon.err uhttpd[2172]:  * opkg_conf_load: Could not lock /var/lock/opkg.lock: Resource temporarily unavailable.
Sat Feb 28 15:47:57 2015 daemon.err uhttpd[2172]: Collected errors:
Sat Feb 28 15:47:57 2015 daemon.err uhttpd[2172]:  * opkg_conf_load: Could not lock /var/lock/opkg.lock: Resource temporarily unavailable.
Sat Feb 28 15:47:58 2015 daemon.err uhttpd[2172]: iptables: No chain/target/match by that name.
Sat Feb 28 15:48:02 2015 user.notice root: Luci overview page completed
Sat Feb 28 15:48:02 2015 daemon.err uhttpd[2172]: Collected errors:
Sat Feb 28 15:48:02 2015 daemon.err uhttpd[2172]:  * opkg_conf_load: Could not lock /var/lock/opkg.lock: Resource temporarily unavailable.
Sat Feb 28 15:48:02 2015 daemon.err uhttpd[2172]: Collected errors:
Sat Feb 28 15:48:02 2015 daemon.err uhttpd[2172]:  * opkg_conf_load: Could not lock /var/lock/opkg.lock: Resource temporarily unavailable.
Sat Feb 28 15:48:03 2015 daemon.err uhttpd[2172]: iptables: No chain/target/match by that name.
Sat Feb 28 15:48:07 2015 daemon.err uhttpd[2172]: Collected errors:
Sat Feb 28 15:48:07 2015 daemon.err uhttpd[2172]:  * opkg_conf_load: Could not lock /var/lock/opkg.lock: Resource temporarily unavailable.
Sat Feb 28 15:48:07 2015 daemon.err uhttpd[2172]: Collected errors:
Sat Feb 28 15:48:07 2015 daemon.err uhttpd[2172]:  * opkg_conf_load: Could not lock /var/lock/opkg.lock: Resource temporarily unavailable.
Sat Feb 28 15:48:08 2015 daemon.err uhttpd[2172]: iptables: No chain/target/match by that name.
Sat Feb 28 15:48:12 2015 daemon.err uhttpd[2172]: Collected errors:
Sat Feb 28 15:48:12 2015 daemon.err uhttpd[2172]:  * opkg_conf_load: Could not lock /var/lock/opkg.lock: Resource temporarily unavailable.
Sat Feb 28 15:48:12 2015 daemon.err uhttpd[2172]: Collected errors:
Sat Feb 28 15:48:12 2015 daemon.err uhttpd[2172]:  * opkg_conf_load: Could not lock /var/lock/opkg.lock: Resource temporarily unavailable.

Attachments (0)

Change History (18)

comment:1 Changed 3 years ago by hnyman

I did some debugging, and this errors are not coming from the core system. Instead they are caused by two luci applets. But the change in logging is probably now revealing the errors :-(

This is caused by luci-app-upnp:

daemon.err uhttpd[2172]: iptables: No chain/target/match by that name.

This is caused by luci-app-ddns:

daemon.err uhttpd[2172]: Collected errors:
daemon.err uhttpd[2172]: * opkg_conf_load: Could not lock /var/lock/opkg.lock: Resource temporarily unavailable.

I will probably write issue ticket at github.

comment:2 Changed 3 years ago by hnyman

Jow has already fixed the upnp related error.

But I noticed one more erroneous plugin:
collectd / Luci statistics causes an error for each statistics page:

Sun Mar  1 11:10:42 2015 daemon.err uhttpd[2293]: /usr/lib/lua/luci/dispatcher.lua:433: Failed to execute call dispatcher target for entry '/admin/statistics/graph/conntrack'.
Sun Mar  1 11:10:42 2015 daemon.err uhttpd[2293]: The called action terminated with an exception:
Sun Mar  1 11:10:42 2015 daemon.err uhttpd[2293]: .../luci/controller/luci_statistics/luci_statistics.lua:153: attempt to use a closed file
Sun Mar  1 11:10:42 2015 daemon.err uhttpd[2293]: stack traceback:
Sun Mar  1 11:10:42 2015 daemon.err uhttpd[2293]: 	[C]: in function 'assert'
Sun Mar  1 11:10:42 2015 daemon.err uhttpd[2293]: 	/usr/lib/lua/luci/dispatcher.lua:433: in function 'dispatch'
Sun Mar  1 11:10:42 2015 daemon.err uhttpd[2293]: 	/usr/lib/lua/luci/dispatcher.lua:168: in function </usr/lib/lua/luci/dispatcher.lua:167>
Sun Mar  1 11:13:11 2015 daemon.err uhttpd[2293]: /usr/lib/lua/luci/dispatcher.lua:433: Failed to execute call dispatcher target for entry '/admin/statistics/graph/cpu'.
Sun Mar  1 11:13:11 2015 daemon.err uhttpd[2293]: The called action terminated with an exception:
Sun Mar  1 11:13:11 2015 daemon.err uhttpd[2293]: .../luci/controller/luci_statistics/luci_statistics.lua:153: attempt to use a closed file
Sun Mar  1 11:13:11 2015 daemon.err uhttpd[2293]: stack traceback:
Sun Mar  1 11:13:11 2015 daemon.err uhttpd[2293]: 	[C]: in function 'assert'
Sun Mar  1 11:13:11 2015 daemon.err uhttpd[2293]: 	/usr/lib/lua/luci/dispatcher.lua:433: in function 'dispatch'
Sun Mar  1 11:13:11 2015 daemon.err uhttpd[2293]: 	/usr/lib/lua/luci/dispatcher.lua:168: in function </usr/lib/lua/luci/dispatcher.lua:167>

comment:3 Changed 3 years ago by hnyman

The reason for the error from line 153 is probably an unnecessary close for the png file:
https://github.com/openwrt/luci/blob/master/applications/luci-app-statistics/luasrc/controller/luci_statistics/luci_statistics.lua#L153

	-- deliver image
	if vars.img then
		local l12 = require "luci.ltn12"
		local png = io.open(graph.opts.imgpath .. "/" .. vars.img:gsub("%.+", "."), "r")
		if png then
			luci.http.prepare_content("image/png")
			l12.pump.all(l12.source.file(png), luci.http.write)
			png:close()

I am no lua expert, but I think that the function source.file already closes the file after the last block has been read, so an explicit close in uci_statistics.lua is probably unnecessary.
https://github.com/openwrt/luci/blob/master/modules/luci-base/luasrc/ltn12.lua#L119

function source.file(handle, io_err)
    if handle then
        return function()
            local chunk = handle:read(BLOCKSIZE)
            if chunk and chunk:len() == 0 then chunk = nil end
            if not chunk then handle:close() end
            return chunk
        end
    else return source.error(io_err or "unable to open file") end
end

comment:4 Changed 3 years ago by daniel.petre

the problem still persist in r44622 while watching the status page of LuCI with the log being filled with messages from uhttpd.

comment:5 Changed 3 years ago by hnyman

I have filed a pull request a week ago in Github, but the devs have not yet picked it up.

https://github.com/openwrt/luci/pull/340

Jow is the active Luci developer, but he has been largely absent the past week.

My own build shows no error messages any more.

comment:6 Changed 3 years ago by daniel.petre

I meant the issue with luci-app-ddns still persists..

comment:7 Changed 3 years ago by hnyman

Yeah, fix for ddns has been similarly authored a week ago, but not yet checked in:
https://github.com/openwrt/luci/pull/337

And chris5560 has also written a similar fix for privoxy, similarly not yet checked in.

comment:8 Changed 3 years ago by anonymous

I'm not sure if this is connected, but in r44633 the Overview page has several fields that do not update ("Collecting data..."):

Local Time -
Uptime -
Load Average -
Memory
Total Available -
Free -
Buffered -
IPv4 WAN Status
IPv6 WAN Status
Active Connections -
DHCP Leases
ADSL Status
Wireless
Associated Stations

comment:9 Changed 3 years ago by hnyman

Those fields update normally in my r44633.

Typical reason, why they do not update, is a dirty browser cache. (some old scripts is there, and Luci has slightly changed.) Try cleaning cache, or try another browser.

comment:10 follow-up: Changed 3 years ago by anonymous

Thanks for your reply, hnyman.

Yes, I'd made my due diligent research and tried that already (cleared cache, tried another browser). It's not that. Here's an excerpt from the system log

Tue Mar 10 13:24:55 2015 daemon.err uhttpd[1954]: Collected errors:
Tue Mar 10 13:24:55 2015 daemon.err uhttpd[1954]: * opkg_conf_load: Could not lock /var/lock/opkg.lock: Resource temporarily unavailable.
Tue Mar 10 13:25:25 2015 daemon.err uhttpd[1954]: sh: write error: Broken pipe

And I have unusually high system load (over 4.10)

comment:11 in reply to: ↑ 10 Changed 3 years ago by hnyman

Replying to anonymous:

Tue Mar 10 13:24:55 2015 daemon.err uhttpd[1954]: * opkg_conf_load: Could not lock /var/lock/opkg.lock: Resource temporarily unavailable.
Tue Mar 10 13:25:25 2015 daemon.err uhttpd[1954]: sh: write error: Broken pipe

And I have unusually high system load (over 4.10)

That "opkg_conf_load: " error might be from luci-app-ddns (or privoxy). Try disabling either of them, and see if the error disappears.

"sh: write error: Broken pipe" looks new in this context.
You might have discovered yet one more app/module/whatever that does not like the new uhttpd logging (or something else).

And you should continue debugging and try to identify (by stopping it), which module causes the "Broken pipe" error and what causes the system load (or CPU consumption).

comment:12 follow-up: Changed 3 years ago by anonymous

Thanks for your suggestion.

At the moment I'm recompiling without luci-app-ddns (yes it was there, it might be the culprit), but it would be better/quicker to debug if I could simply stop some modules from the command line. Is it possible?

Related question: will the flash memory be damaged by flashing new firmware so often?

comment:13 in reply to: ↑ 12 Changed 3 years ago by hnyman

Replying to anonymous:

it would be better/quicker to debug if I could simply stop some modules from the command line. Is it possible?

Sure it is possible:

/etc/init.d/ddns stop

Usually "restart" will start the app again.

/etc/init.d/ddns restart

Check the contents of the /etc/init.d directory to find out, which applications have start/stop scripts there.

root@OpenWrt:~# ls /etc/init.d/
boot             dnsmasq          firewall         luci_statistics  qos              sysfixtime       uhttpd
collectd         done             fstab            miniupnpd        rpcd             sysntpd          umount
cron             dropbear         led              network          sqm              system           usbmode
ddns             etherwake        log              odhcpd           sysctl           telnet           vsftpd

You can also do the start/stop action in Luci in the System/Startup page


Related question: will the flash memory be damaged by flashing new firmware so often?

I have flashed my router about 200 times so far.

comment:14 Changed 3 years ago by anonymous

Again, thank you for your patient replies! No worry with flashing and reflashing, then.

So I recompiled without luci-app-ddns, then stopped the ddns process as you described, rebooted several times, recompiled with luci-app-ddns and stopped again the ddns process etc. Now I am at r44649.

The overview page is still missing that information (the uPnP redirects do show, though). I have no idea who the culprit might be. I don't use privoxy. There still are lots of broken pipe errors in the log, plus many warnings like this:

Tue Mar 10 18:39:05 2015 daemon.warn odhcpd[1110]: DHCPV6 SOLICIT IA_NA from 000100011367e59bc80aa9d1d9c6 on br-lan: ok fd6a:384f:d726:0:224:a5ff:0:95c/128 fdbd:feed:adb1:0:224:a5ff:0:95c/128

which is strange because my devices don't use IPv6 as far as I know.

If you suggest any useful tests I can do, I will be happy to perform them.

comment:15 Changed 3 years ago by anonymous

I forgot to mention: soon after a router reboot, some of the info in the overview page is visible (System, Hostname, Model, Firmware Version, Kernel Version, Local Time, Uptime, Load Average, Memory Total Available, Free, Buffered, Network IPv4 WAN Status (not connected), IPv6 WAN Status (not connected), Active Connections, DHCP Leases (there are no active leases), DHCPv6 Leases (there are no active leases), ADSL Status (DOWN)).

However, even with auto refresh on, the page doesn't refresh, and if I refresh manually after the WAN (ADSL) has connected, the info disappears ("collecting data" or blanks in System Time/Uptime).

comment:16 Changed 3 years ago by juanriccio@…

Same problem here (r44633)

comment:17 Changed 3 years ago by hnyman

The original problem with upnp, ddns (and privoxy) and luci_statistics have been fixed a few hours ago. 44651 and later builds should include those fixes.

If your issue continues with 44651+ builds, you need to debug more to find out which module causes it. (especially the broken pipe is interesting.) The error might be something related to your specific configuration, so others might not be able to reproduce it easily.

You might need to open a separate ticket so that issue do not get mixed.

comment:18 Changed 3 years ago by anonymous

Thank you. You're very kind and patient.

After one night of flashing, testing, recompiling, rebooting, I still haven't understood where the problem is. For my specific setup, r44651 doesn't solve it. I will open a new ticket, and meanwhile go back to stable (for me) r43143 for daily work.

Add Comment

Modify Ticket

Action
as new .
Author


E-mail address and user name can be saved in the Preferences.

 
Note: See TracTickets for help on using tickets.