Opened 2 years ago

Closed 2 years ago

#21330 closed defect (fixed)

ar71xx r47811 has memory leak

Reported by: anonymous
Owned by: developers
Priority: normal
Milestone:
Component: packages
Version: Trunk
Keywords:
Cc:

Description

# cat /proc/meminfo
MemTotal:          61332 kB
MemFree:           30956 kB
MemAvailable:      37056 kB
Buffers:            1828 kB
Cached:             9056 kB
SwapCached:            0 kB
Active:             6972 kB
Inactive:           5404 kB
Active(anon):       1552 kB
Inactive(anon):     2932 kB
Active(file):       5420 kB
Inactive(file):     2472 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:                 0 kB
Writeback:             0 kB
AnonPages:          1500 kB
Mapped:             2216 kB
Shmem:              2992 kB
Slab:               6768 kB
SReclaimable:       1340 kB
SUnreclaim:         5428 kB
KernelStack:         312 kB
PageTables:          264 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:       30664 kB
Committed_AS:       7372 kB
VmallocTotal:    1048372 kB
VmallocUsed:        1640 kB
VmallocChunk:    1024244 kB

# ps
  PID USER       VSZ STAT COMMAND
    1 root      1492 S    /sbin/procd
    2 root         0 SW   [kthreadd]
    3 root         0 SW   [ksoftirqd/0]
    4 root         0 SW   [kworker/0:0]
    5 root         0 SW<  [kworker/0:0H]
    6 root         0 SW   [kworker/u2:0]
    7 root         0 SW<  [khelper]
   63 root         0 SW<  [writeback]
   65 root         0 SW<  [crypto]
   66 root         0 SW<  [bioset]
   68 root         0 SW<  [kblockd]
   98 root         0 SW   [kswapd0]
   99 root         0 SW   [kworker/0:1]
  148 root         0 SW   [fsnotify_mark]
  170 root         0 SW   [spi0]
  274 root         0 SW<  [ipv6_addrconf]
  280 root         0 SW<  [deferwq]
  283 root         0 SW<  [kworker/0:1H]
  342 root         0 SWN  [jffs2_gcd_mtd3]
  396 root      1152 S    /sbin/ubusd
  397 root       856 S    /sbin/askfirst /bin/ash --login
  607 root         0 SW<  [cfg80211]
  704 root      1140 S    /sbin/logd -S 16
  784 root      1164 S    /usr/sbin/crond -f -c /etc/crontabs -l 9
  877 root      1164 S    /usr/sbin/ntpd -n -l -S /usr/sbin/ntpd-hotplug -p 0.asia.pool.ntp.org
 6359 root         0 SW   [kworker/u2:1]
 6837 root      1656 S    /sbin/netifd
 7162 root      1172 S    /usr/sbin/pppd nodetach ipparam wan ifname pppoe-wan lcp-echo-interval 5 lcp-echo-failure 6 lcp-echo-adaptive +ipv6 set AUTOIPV6=1 nodefaultroute u
 7260 root      1596 S    /usr/sbin/hostapd -P /var/run/wifi-phy0.pid -B /var/run/hostapd-phy0.conf
 7334 root       880 S    odhcp6c -s /lib/netifd/dhcpv6.script -P0 -t120 pppoe-wan
 7404 nobody    1020 S    /usr/sbin/dnsmasq -C /var/etc/dnsmasq.conf -k -x /var/run/dnsmasq/dnsmasq.pid
 7677 root      1008 S    /usr/sbin/dropbear -F -P /var/run/dropbear.1.pid -s -g -p 22 -K 300
 7696 root      1180 S    /usr/sbin/miniupnpd -f /var/etc/miniupnpd.conf
 7709 root      1252 S    /usr/sbin/odhcpd
 8228 root      1072 S    /usr/sbin/dropbear -F -P /var/run/dropbear.1.pid -s -g -p 22 -K 300
 8229 root      1164 S    -ash
 8279 root         0 Z    [sh]
 8280 root         0 Z    [sh]
 8292 root      1160 R    ps

MemFree is about 40000 kB after boot, but after a few hours it drops to about 30000 kB.
The earlier version doesn't have this problem. Device: TL-WR741ND-v4.

Change History (19)

comment:1 Changed 2 years ago by anonymous

r47793 has no memory leak.

comment:2 follow-up: Changed 2 years ago by anon2

Sounds vague. There have been no real changes in core OpenWrt between r47793 and r47811 that could have that kind of effect.
https://dev.openwrt.org/changeset?new=47811%40trunk&old=47793%40trunk
https://dev.openwrt.org/log/trunk

Does the free memory continue to drop after reaching that ~30M level?

If you check with top or htop, is the memory consumed by a certain process growing?

comment:3 Changed 2 years ago by anonymous

My mistake.
r47793 also has the memory leak.

MemTotal:          61332 kB
MemFree:           15716 kB
MemAvailable:      21684 kB
Buffers:            1828 kB
Cached:             6128 kB
SwapCached:            0 kB
Active:             6040 kB
Inactive:           3316 kB
Active(anon):       1456 kB
Inactive(anon):       48 kB
Active(file):       4584 kB
Inactive(file):     3268 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:                 0 kB
Writeback:             0 kB
AnonPages:          1408 kB
Mapped:             2192 kB
Shmem:               104 kB
Slab:               7280 kB
SReclaimable:       1156 kB
SUnreclaim:         6124 kB
KernelStack:         312 kB
PageTables:          264 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:       30664 kB
Committed_AS:       4412 kB
VmallocTotal:    1048372 kB
VmallocUsed:        1640 kB
VmallocChunk:    1040300 kB
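
To see which counters moved, the two /proc/meminfo dumps in this ticket can be compared field by field. The following is a minimal Python sketch, meant to run on a host PC against saved output rather than on the router. `parse_meminfo` and `diff_meminfo` are hypothetical helper names, and the sample strings contain only a few fields copied from the description and from comment:3. Note that the two dumps come from different builds and different boots, so the comparison is only indicative.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: value in kB}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip blank lines and anything without a colon
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def diff_meminfo(before, after):
    """Return {field: after - before} for fields in both snapshots that changed."""
    return {
        name: after[name] - before[name]
        for name in before
        if name in after and after[name] != before[name]
    }


# A few fields from the description (r47811) and from comment:3 (r47793).
first = parse_meminfo("""\
MemFree:           30956 kB
Cached:             9056 kB
Shmem:              2992 kB
Slab:               6768 kB
SUnreclaim:         5428 kB
""")
second = parse_meminfo("""\
MemFree:           15716 kB
Cached:             6128 kB
Shmem:               104 kB
Slab:               7280 kB
SUnreclaim:         6124 kB
""")

# Print the biggest drops first.
for name, delta in sorted(diff_meminfo(first, second).items(), key=lambda kv: kv[1]):
    print(f"{name:12s} {delta:+d} kB")
```

For these fields, MemFree is 15240 kB lower in the second dump while Slab is only 512 kB higher, so the slab counters do not account for the missing memory.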

comment:4 in reply to: ↑ 2 Changed 2 years ago by anonymous

Replying to anon2:

> Sounds vague. There have been no real changes in core OpenWrt between r47793 and r47811 that could have that kind of effect.
> https://dev.openwrt.org/changeset?new=47811%40trunk&old=47793%40trunk
> https://dev.openwrt.org/log/trunk
>
> Does the free memory continue to drop after reaching that ~30M level?
>
> If you check with top or htop, is the memory consumed by a certain process growing?

Yes. I'm using r47793 now, but it has the same problem.

# top -bn1
Mem: 45704K used, 15628K free, 104K shrd, 1828K buff, 6132K cached
CPU:   0% usr   9% sys   0% nic  90% idle   0% io   0% irq   0% sirq
Load average: 0.08 0.05 0.05 1/39 5402
  PID  PPID USER     STAT   VSZ %VSZ %CPU COMMAND
 1046     2 root     SW       0   0%   5% [kworker/u2:2]
    3     2 root     SW       0   0%   5% [ksoftirqd/0]
  739     1 root     S     1656   3%   0% /sbin/netifd
 1047     1 root     S     1596   3%   0% /usr/sbin/hostapd -P /var/run/wifi-phy0.pid -B /var/run/hostapd-phy0.conf
    1     0 root     S     1492   2%   0% /sbin/procd
  705     1 root     S     1300   2%   0% /sbin/logread -f -r 192.168.xx.xx 514 -p /var/run/logread.2.pid -u
  762     1 root     S     1252   2%   0% /usr/sbin/odhcpd
 1035   739 root     S     1172   2%   0% /usr/sbin/pppd nodetach ipparam wan ifname pppoe-wan lcp-echo-interval 5 lcp-echo-failure 6 lcp-echo-adaptive +ipv6 set AUTOIPV6=1 nodefaultroute usepeerdns maxfail 1 user xxxxxx password xxxxx ip-up-script /lib/netifd/ppp-up ipv6-up-script /lib/netifd/ppp-up ip-down-script /lib/netifd/ppp-down ipv6-down-script /lib/netifd/ppp-down mtu 1492 mru 1492 plugin rp-pppoe.so nic-eth1 persist
  784     1 root     S     1164   2%   0% /usr/sbin/crond -f -c /etc/crontabs -l 9
  877     1 root     S     1164   2%   0% /usr/sbin/ntpd -n -l -S /usr/sbin/ntpd-hotplug -p 0.asia.pool.ntp.org
 5323  5322 root     S     1160   2%   0% -ash
 5402  5323 root     R     1160   2%   0% top -bn1
  841     1 root     S     1160   2%   0% xxxxxxxxxx
  396     1 root     S     1148   2%   0% /sbin/ubusd
  704     1 root     S     1140   2%   0% /sbin/logd -S 16
 1419     1 root     S     1120   2%   0% /usr/sbin/miniupnpd -f /var/etc/miniupnpd.conf
 5322   805 root     S     1072   2%   0% /usr/sbin/dropbear -F -P /var/run/dropbear.1.pid -s -g -p 22 -K 300
 1549     1 nobody   S     1016   2%   0% /usr/sbin/dnsmasq -C /var/etc/dnsmasq.conf -k -x /var/run/dnsmasq/dnsmasq.pid
  805     1 root     S     1008   2%   0% /usr/sbin/dropbear -F -P /var/run/dropbear.1.pid -s -g -p 22 -K 300
 1404   739 root     S      880   1%   0% odhcp6c -s /lib/netifd/dhcpv6.script -P0 -t120 pppoe-wan
  397     1 root     S      856   1%   0% /sbin/askfirst /bin/ash --login
   99     2 root     SW       0   0%   0% [kworker/0:1]
    6     2 root     SW       0   0%   0% [kworker/u2:0]
  170     2 root     SW       0   0%   0% [spi0]
  342     2 root     SWN      0   0%   0% [jffs2_gcd_mtd3]
  607     2 root     SW<      0   0%   0% [cfg80211]
  280     2 root     SW<      0   0%   0% [deferwq]
  274     2 root     SW<      0   0%   0% [ipv6_addrconf]
  283     2 root     SW<      0   0%   0% [kworker/0:1H]
   68     2 root     SW<      0   0%   0% [kblockd]
   66     2 root     SW<      0   0%   0% [bioset]
   98     2 root     SW       0   0%   0% [kswapd0]
  148     2 root     SW       0   0%   0% [fsnotify_mark]
    7     2 root     SW<      0   0%   0% [khelper]
   63     2 root     SW<      0   0%   0% [writeback]
    4     2 root     SW       0   0%   0% [kworker/0:0]
    2     0 root     SW       0   0%   0% [kthreadd]
    5     2 root     SW<      0   0%   0% [kworker/0:0H]
   65     2 root     SW<      0   0%   0% [crypto]
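
To answer the "is a certain process growing?" question from saved listings, the BusyBox `top -bn1` table can be parsed and compared between captures. This is a rough Python sketch for use on a host PC. `parse_top` and `userspace_vsz` are hypothetical helper names, the column layout is assumed to match the listing above (PID, PPID, USER, STAT, VSZ, %VSZ, %CPU, COMMAND), and the sample holds only a few rows copied from it.

```python
def parse_top(text):
    """Parse BusyBox `top -bn1` output into a list of (pid, vsz_kb, command)."""
    rows = []
    in_table = False
    for line in text.splitlines():
        fields = line.split(None, 7)
        if not in_table:
            # The process table starts right after the column header row.
            in_table = fields[:2] == ["PID", "PPID"]
            continue
        if len(fields) == 8 and fields[0].isdigit() and fields[4].isdigit():
            rows.append((int(fields[0]), int(fields[4]), fields[7]))
    return rows


def userspace_vsz(rows):
    """Sum VSZ over processes with a non-zero VSZ (kernel threads report 0)."""
    return sum(vsz for _pid, vsz, _cmd in rows if vsz > 0)


sample = """\
Mem: 45704K used, 15628K free, 104K shrd, 1828K buff, 6132K cached
  PID  PPID USER     STAT   VSZ %VSZ %CPU COMMAND
 1046     2 root     SW       0   0%   5% [kworker/u2:2]
  739     1 root     S     1656   3%   0% /sbin/netifd
    1     0 root     S     1492   2%   0% /sbin/procd
  396     1 root     S     1148   2%   0% /sbin/ubusd
"""

rows = parse_top(sample)
print(max(rows, key=lambda row: row[1]))  # process with the largest VSZ
print(userspace_vsz(rows), "kB total VSZ in this sample")
```

VSZ counts shared mappings once per process, so the sum is only an upper bound on what userspace actually holds. Comparing the per-process values between two captures is more telling than the total.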

comment:5 Changed 2 years ago by anonymous

r47665 has no such problem. Sorry, I have no other versions saved for testing.

comment:6 follow-up: Changed 2 years ago by George-TL

Do you have a working 'automatic recovery by tftp from uboot by holding button during device boot'?
According to the revisions, you are using trunk images. Right?

Please confirm that:
r47793 - fails
r47665 - works

I can build some images for you.
After about six tries we should have narrowed down the commit that triggers this issue.
https://dev.openwrt.org/log/?action=stop_on_copy&mode=stop_on_copy&rev=47793&stop_rev=47665&limit=200&mail_addr=&mail_addr_confirm=

Did you build OpenWrt yourself? In that case I need the output of "./scripts/diffconfig.sh".

comment:7 in reply to: ↑ 6 Changed 2 years ago by xinglp@…

Replying to George-TL:

> Do you have a working 'automatic recovery by tftp from uboot by holding button during device boot'?
> According to the revisions, you are using trunk images. Right?

Yes. I have also made my flash IC pluggable, and I have an SPI programmer for it.

> Please confirm that:
> r47793 - fails
> r47665 - works

Confirmed.

> I can build some images for you.
> After about six tries we should have narrowed down the commit that triggers this issue.
> https://dev.openwrt.org/log/?action=stop_on_copy&mode=stop_on_copy&rev=47793&stop_rev=47665&limit=200&mail_addr=&mail_addr_confirm=

I can do it this weekend.

> Did you build OpenWrt yourself? In that case I need the output of "./scripts/diffconfig.sh".

CONFIG_TARGET_ar71xx=y
CONFIG_TARGET_ar71xx_generic=y
CONFIG_TARGET_ar71xx_generic_TLWR741=y
CONFIG_DEVEL=y
CONFIG_BUSYBOX_CUSTOM=y
CONFIG_BUSYBOX_CONFIG_ARPING=y
# CONFIG_BUSYBOX_CONFIG_BUNZIP2 is not set
# CONFIG_BUSYBOX_CONFIG_CHROOT is not set
CONFIG_BUSYBOX_CONFIG_ETHER_WAKE=y
CONFIG_BUSYBOX_CONFIG_FEATURE_CHECK_TAINTED_MODULE=y
CONFIG_BUSYBOX_CONFIG_FEATURE_CROND_DIR="/var/spool/cron"
CONFIG_BUSYBOX_CONFIG_FEATURE_DD_THIRD_STATUS_LINE=y
# CONFIG_BUSYBOX_CONFIG_FEATURE_DEVPTS is not set
CONFIG_BUSYBOX_CONFIG_FEATURE_EDITING_SAVEHISTORY=y
CONFIG_BUSYBOX_CONFIG_FEATURE_LSMOD_PRETTY_2_6_OUTPUT=y
# CONFIG_BUSYBOX_CONFIG_FEATURE_MOUNT_CIFS is not set
# CONFIG_BUSYBOX_CONFIG_FEATURE_PASSWD_WEAK_CHECK is not set
CONFIG_BUSYBOX_CONFIG_FEATURE_REVERSE_SEARCH=y
# CONFIG_BUSYBOX_CONFIG_FEATURE_SH_NOFORK is not set
CONFIG_BUSYBOX_CONFIG_FEATURE_STAT_FORMAT=y
CONFIG_BUSYBOX_CONFIG_FEATURE_TAR_AUTODETECT=y
CONFIG_BUSYBOX_CONFIG_FEATURE_TRACEROUTE_USE_ICMP=y
CONFIG_BUSYBOX_CONFIG_FEATURE_WGET_TIMEOUT=y
# CONFIG_BUSYBOX_CONFIG_FREE is not set
# CONFIG_BUSYBOX_CONFIG_HWCLOCK is not set
# CONFIG_BUSYBOX_CONFIG_INCLUDE_SUSv2 is not set
CONFIG_BUSYBOX_CONFIG_INSMOD=y
CONFIG_BUSYBOX_CONFIG_LSMOD=y
# CONFIG_BUSYBOX_CONFIG_MKSWAP is not set
CONFIG_BUSYBOX_CONFIG_NC_110_COMPAT=y
CONFIG_BUSYBOX_CONFIG_NC_EXTRA=y
CONFIG_BUSYBOX_CONFIG_NC_SERVER=y
CONFIG_BUSYBOX_CONFIG_OD=y
CONFIG_BUSYBOX_CONFIG_PSCAN=y
CONFIG_BUSYBOX_CONFIG_PSTREE=y
CONFIG_BUSYBOX_CONFIG_RMMOD=y
CONFIG_BUSYBOX_CONFIG_STAT=y
# CONFIG_BUSYBOX_CONFIG_STRINGS is not set
CONFIG_BUSYBOX_CONFIG_VCONFIG=y
CONFIG_BUSYBOX_CONFIG_WHOIS=y
CONFIG_CLEAN_IPKG=y
CONFIG_DOWNLOAD_FOLDER="################"
# CONFIG_KERNEL_DEBUG_INFO is not set
# CONFIG_KERNEL_DEBUG_KERNEL is not set
# CONFIG_KERNEL_ELF_CORE is not set
# CONFIG_KERNEL_KALLSYMS is not set
# CONFIG_KERNEL_MAGIC_SYSRQ is not set
# CONFIG_KERNEL_PRINTK_TIME is not set
# CONFIG_KERNEL_SWAP is not set
CONFIG_PACKAGE_6in4=y
CONFIG_PACKAGE_6to4=y
# CONFIG_PACKAGE_ATH_DFS is not set
# CONFIG_PACKAGE_MAC80211_MESH is not set
# CONFIG_PACKAGE_dnsmasq is not set
CONFIG_PACKAGE_dnsmasq-full=y
CONFIG_PACKAGE_dnsmasq_full_ipset=y
# CONFIG_PACKAGE_kmod-gpio-button-hotplug is not set
CONFIG_PACKAGE_kmod-ipt-ipset=y
CONFIG_PACKAGE_kmod-iptunnel=y
CONFIG_PACKAGE_kmod-iptunnel4=y
CONFIG_PACKAGE_kmod-nfnetlink=y
CONFIG_PACKAGE_kmod-sit=y
CONFIG_PACKAGE_libnfnetlink=y
CONFIG_PACKAGE_miniupnpd=y
# CONFIG_PACKAGE_opkg is not set
CONFIG_PACKAGE_relayd=y
CONFIG_STRIP_KERNEL_EXPORTS=y
CONFIG_USE_MKLIBS=y
# CONFIG_PACKAGE_dnsmasq_full_auth is not set
# CONFIG_PACKAGE_dnsmasq_full_dhcpv6 is not set
# CONFIG_PACKAGE_dnsmasq_full_dnssec is not set
# CONFIG_PACKAGE_libgmp is not set
# CONFIG_PACKAGE_libnettle is not set

comment:8 Changed 2 years ago by George-TL

Try this one (r47716): http://www.multiupfile.com/f/063a109f
4bd7ece7d86f5ca84ce27720244f7e7ebeef1946 ticket-21330_r47716_openwrt-ar71xx-generic-tl-wr741nd-v4-squashfs.7z

comment:9 follow-up: Changed 2 years ago by George-TL

If r47716 fails then try:

r47693 - http://www.multiupfile.com/f/d0f2d2b6
1881c370c81e9704df9dce57093be412e834485e ticket-21330_r47693_openwrt-ar71xx-generic-tl-wr741nd-v4-squashfs.7z

else try:

r47751 - http://www.multiupfile.com/f/ca312016
7637ed62b4f8399eda81ff4733811b223e2514a0 ticket-21330_r47751_openwrt-ar71xx-generic-tl-wr741nd-v4-squashfs.7z

and share the results.

comment:10 in reply to: ↑ 9 Changed 2 years ago by anonymous

Replying to George-TL:
Can I just do a binary search to find the changeset that introduced the memory leak? I'm already doing this.
r47775 bad
r47744 testing
r47713 good
r47694 good

comment:11 Changed 2 years ago by anonymous

Now I'm pretty sure that r47771 is what causes the memory leak.
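
The binary search used in this thread can be sketched as follows. This is an illustrative Python sketch only: `first_bad_revision` and `simulated_test` are hypothetical names, the real test is "flash the image and watch MemFree for a few hours", and the sketch assumes every revision number in the range is buildable, which is not guaranteed for trunk. The endpoints r47665 (good) and r47793 (bad) are the ones confirmed in comment:7.

```python
def first_bad_revision(good, bad, is_bad):
    """Binary search for the lowest revision for which is_bad() is True.

    Assumes `good` tests clean, `bad` tests leaky, and the leak persists
    in every revision after the one that introduced it.
    Returns (first bad revision, number of revisions tested).
    """
    steps = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        steps += 1
        if is_bad(mid):
            bad = mid   # leak already present: culprit is at or before mid
        else:
            good = mid  # still clean: culprit is after mid
    return bad, steps


# Stand-in for the real hardware test (hypothetical predicate).
# r47771 is the culprit reported in comment:11.
def simulated_test(revision):
    return revision >= 47771


culprit, tries = first_bad_revision(47665, 47793, simulated_test)
print(f"first bad revision: r{culprit} after {tries} tests")
```

With 128 revisions between the known-good and known-bad builds, halving the range each time needs seven tests in this simulation, close to the estimate given in comment:6.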

comment:12 follow-up: Changed 2 years ago by papaj0e

I can confirm that the very same issue is observed on a DIR-825 running trunk versions above r47770.
It cannot run for more than two days without a restart, because the memory gets exhausted.

comment:13 in reply to: ↑ 12 Changed 2 years ago by anonymous

Replying to papaj0e:

> I can confirm that the very same issue is observed on a DIR-825 running trunk versions above r47770.
> It cannot run for more than two days without a restart, because the memory gets exhausted.

Have you tried the latest version? It seems to be fixed.

comment:14 follow-up: Changed 2 years ago by papaj0e

The latest version I tried was r47953 and the issue was still present - afterwards I reverted to r47770.

In r47770 there is also a modest memory leak in the ubusd process - after about 4 days of uptime the used memory jumped from 19 MB to 24 MB, but that is still nowhere near the memory leak in newer builds.

Which commit do you think fixed the memory leak?

comment:15 in reply to: ↑ 14 Changed 2 years ago by anonymous

Replying to papaj0e:

> The latest version I tried was r47953 and the issue was still present - afterwards I reverted to r47770.
>
> In r47770 there is also a modest memory leak in the ubusd process - after about 4 days of uptime the used memory jumped from 19 MB to 24 MB, but that is still nowhere near the memory leak in newer builds.
>
> Which commit do you think fixed the memory leak?

r47958

comment:16 Changed 2 years ago by papaj0e

Okay, thanks for the suggestion - I will try the latest trunk and report back my findings later.

comment:17 Changed 2 years ago by papaj0e

After two days of running the latest trunk (r48005), it seems that the huge memory leak was fixed by commit r47958.
Thank you for bringing this to my attention!

There is still some modest memory leak observed in /sbin/ubusd though - it started with 1.7% memory usage and now is at 3.2% and keeps growing slowly. Anyway this doesn't seem to affect the operation of the router for the time being - we will see how it will behave in the next few days or so.

comment:18 Changed 2 years ago by papaj0e

After almost 5 days uptime everything seems stable here.

The memory leak in /sbin/ubusd is observed when LuCI status page is open - now the memory usage of this process (shown in htop) is at 5.4% and it stays there when LuCI auto-refresh is turned off.
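
One way to track a single process such as ubusd without htop is to save its /proc/<pid>/status text at intervals and compare the VmRSS values. The following is a small Python sketch for a host PC. `parse_vmrss` and `growth_kb` are hypothetical helper names, and the sample text and numbers are made up for illustration, not measurements from this ticket.

```python
import re


def parse_vmrss(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None.

    Kernel threads have no VmRSS line, so None is a normal result for them.
    """
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def growth_kb(samples):
    """Given RSS samples in kB, oldest first, return last minus first."""
    return samples[-1] - samples[0] if samples else 0


# Illustrative status excerpts (hypothetical values).
earlier = "Name:\tubusd\nVmRSS:\t     700 kB\n"
later = "Name:\tubusd\nVmRSS:\t    1300 kB\n"

samples = [parse_vmrss(earlier), parse_vmrss(later)]
print(f"RSS grew by {growth_kb(samples)} kB between captures")
```

Taking one capture with the LuCI status page open and one with auto-refresh turned off would show whether the growth stops, as described above.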

comment:19 Changed 2 years ago by nbd

  • Resolution set to fixed
  • Status changed from new to closed

ubusd memory leak fixed in r48505
