Opened 7 years ago

Last modified 3 years ago

#9304 reopened defect

High constant upload traffic causes out-of-memory condition and crash

Reported by: johnc60@…
Owned by: developers
Priority: high
Milestone: Chaos Calmer 15.05
Component: base system
Version: Backfire 10.03.1 RC4
Keywords: upstream upload out-of-memory oom crashes
Cc:

Description

When performing uploads from LAN to WAN (posting a Facebook video, for example) the OpenWrt router will rapidly run out of memory. The OOM procedure takes over and starts to kill processes to recover memory. Eventually the router will stop working, the watchdog timer will expire and the router will reboot itself (fortunately).

I have a 6M/768K DSL connection and cannot imagine that the router is not able to handle a constant 768Kb/s of upstream traffic. Downstream traffic (WAN to LAN) is no problem. I can download all day long at 6Mb/s.

There must be a memory leak somewhere in the upstream path. If the upstream traffic stops, the free memory does not return to its original value. So, one long upload or many short uploads will eventually consume all of the free memory and crash the router. Not sure why there is a difference in behavior between upload versus download traffic.

The hardware is an Airlink AR430W (Atheros). The problem exists in all versions of OpenWrt - Kamikaze, Backfire and Attitude Adjustment (trunk). Kamikaze seems to lose memory a little more slowly.

Test scenario: Load the router with Backfire RC5. Perform a large (200-400MB) HTTP or FTP upload from a LAN-connected computer to a host on the Internet, reached through the WAN. At 768Kb/s upstream speed, the router will crash within ten minutes.

-John

Attachments (0)

Change History (56)

comment:1 Changed 7 years ago by anonymous

Additional Information: Memory information before uploading 186MB file.

root@xtinkerbell:~# free
              total         used         free       shared      buffers
  Mem:        13136        11752         1384            0         1220
 Swap:            0            0            0
Total:        13136        11752         1384
root@xtinkerbell:~# cat /proc/meminfo
MemTotal: 13136 kB
MemFree: 1376 kB
Buffers: 1220 kB
Cached: 4268 kB
SwapCached: 0 kB
Active: 2700 kB
Inactive: 3456 kB
Active(anon): 780 kB
Inactive(anon): 0 kB
Active(file): 1920 kB
Inactive(file): 3456 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 0 kB
Writeback: 0 kB
AnonPages: 688 kB
Mapped: 980 kB
Slab: 3760 kB
SReclaimable: 692 kB
SUnreclaim: 3068 kB
PageTables: 120 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 6568 kB
Committed_AS: 2120 kB
VmallocTotal: 1048404 kB
VmallocUsed: 616 kB
VmallocChunk: 1037460 kB
root@xtinkerbell:~# cat /proc/slabinfo
slabinfo - version: 2.1
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
nf_conntrack_expect 0 0 144 27 1 : tunables 120 60 0 : slabdata 0 0 0
nf_conntrack 119 119 224 17 1 : tunables 120 60 0 : slabdata 7 7 0
bridge_fdb_cache 7 59 64 59 1 : tunables 120 60 0 : slabdata 1 1 0
flow_cache 0 0 80 48 1 : tunables 120 60 0 : slabdata 0 0 0
jffs2_inode_cache 25 145 24 145 1 : tunables 120 60 0 : slabdata 1 1 0
jffs2_node_frag 310 435 24 145 1 : tunables 120 60 0 : slabdata 3 3 0
jffs2_refblock 39 48 248 16 1 : tunables 120 60 0 : slabdata 3 3 0
jffs2_tmp_dnode 0 0 32 113 1 : tunables 120 60 0 : slabdata 0 0 0
jffs2_raw_inode 0 0 68 56 1 : tunables 120 60 0 : slabdata 0 0 0
jffs2_raw_dirent 0 0 40 92 1 : tunables 120 60 0 : slabdata 0 0 0
jffs2_full_dnode 319 406 16 203 1 : tunables 120 60 0 : slabdata 2 2 0
jffs2_i 23 24 312 12 1 : tunables 54 27 0 : slabdata 2 2 0
squashfs_inode_cache 273 276 320 12 1 : tunables 54 27 0 : slabdata 23 23 0
configfs_dir_cache 0 0 52 72 1 : tunables 120 60 0 : slabdata 0 0 0
kioctx 0 0 160 24 1 : tunables 120 60 0 : slabdata 0 0 0
kiocb 0 0 160 24 1 : tunables 120 60 0 : slabdata 0 0 0
fasync_cache 0 0 16 203 1 : tunables 120 60 0 : slabdata 0 0 0
shmem_inode_cache 69 70 368 10 1 : tunables 54 27 0 : slabdata 7 7 0
nsproxy 0 0 24 145 1 : tunables 120 60 0 : slabdata 0 0 0
posix_timers_cache 0 0 112 35 1 : tunables 120 60 0 : slabdata 0 0 0
uid_cache 1 59 64 59 1 : tunables 120 60 0 : slabdata 1 1 0
UNIX 6 10 384 10 1 : tunables 54 27 0 : slabdata 1 1 0
ip_mrt_cache 0 0 96 40 1 : tunables 120 60 0 : slabdata 0 0 0
UDP-Lite 0 0 480 8 1 : tunables 54 27 0 : slabdata 0 0 0
tcp_bind_bucket 2 113 32 113 1 : tunables 120 60 0 : slabdata 1 1 0
inet_peer_cache 10 59 64 59 1 : tunables 120 60 0 : slabdata 1 1 0
secpath_cache 0 0 32 113 1 : tunables 120 60 0 : slabdata 0 0 0
xfrm_dst_cache 0 0 288 13 1 : tunables 54 27 0 : slabdata 0 0 0
ip_fib_alias 0 0 16 203 1 : tunables 120 60 0 : slabdata 0 0 0
ip_fib_hash 13 101 36 101 1 : tunables 120 60 0 : slabdata 1 1 0
ip_dst_cache 73 75 256 15 1 : tunables 120 60 0 : slabdata 5 5 0
arp_cache 2 30 128 30 1 : tunables 120 60 0 : slabdata 1 1 0
RAW 2 8 480 8 1 : tunables 54 27 0 : slabdata 1 1 0
UDP 5 8 480 8 1 : tunables 54 27 0 : slabdata 1 1 0
tw_sock_TCP 0 0 96 40 1 : tunables 120 60 0 : slabdata 0 0 0
request_sock_TCP 0 0 96 40 1 : tunables 120 60 0 : slabdata 0 0 0
TCP 2 3 1056 3 1 : tunables 24 12 0 : slabdata 1 1 0
eventpoll_pwq 0 0 36 101 1 : tunables 120 60 0 : slabdata 0 0 0
eventpoll_epi 0 0 96 40 1 : tunables 120 60 0 : slabdata 0 0 0
blkdev_queue 1 3 1256 3 1 : tunables 24 12 0 : slabdata 1 1 0
blkdev_requests 4 17 224 17 1 : tunables 120 60 0 : slabdata 1 1 0
blkdev_ioc 0 0 44 84 1 : tunables 120 60 0 : slabdata 0 0 0
bio-0 2 30 128 30 1 : tunables 120 60 0 : slabdata 1 1 0
biovec-256 2 2 3072 1 1 : tunables 24 12 0 : slabdata 2 2 0
biovec-128 0 0 1536 2 1 : tunables 24 12 0 : slabdata 0 0 0
biovec-64 0 0 768 5 1 : tunables 54 27 0 : slabdata 0 0 0
biovec-16 0 0 192 20 1 : tunables 120 60 0 : slabdata 0 0 0
sock_inode_cache 25 36 320 12 1 : tunables 54 27 0 : slabdata 3 3 0
skbuff_cb_store_cache 0 0 64 59 1 : tunables 120 60 0 : slabdata 0 0 0
skbuff_fclone_cache 0 0 384 10 1 : tunables 54 27 0 : slabdata 0 0 0
skbuff_head_cache 200 200 192 20 1 : tunables 120 60 0 : slabdata 10 10 0
file_lock_cache 0 0 96 40 1 : tunables 120 60 0 : slabdata 0 0 0
proc_inode_cache 127 143 296 13 1 : tunables 54 27 0 : slabdata 11 11 0
sigqueue 0 0 144 27 1 : tunables 120 60 0 : slabdata 0 0 0
radix_tree_node 83 91 288 13 1 : tunables 54 27 0 : slabdata 7 7 0
bdev_cache 3 9 416 9 1 : tunables 54 27 0 : slabdata 1 1 0
sysfs_dir_cache 1589 1596 44 84 1 : tunables 120 60 0 : slabdata 19 19 0
mnt_cache 19 30 128 30 1 : tunables 120 60 0 : slabdata 1 1 0
filp 145 210 128 30 1 : tunables 120 60 0 : slabdata 7 7 0
inode_cache 590 602 272 14 1 : tunables 54 27 0 : slabdata 43 43 0
dentry 1856 1860 128 30 1 : tunables 120 60 0 : slabdata 62 62 0
names_cache 1 1 4096 1 1 : tunables 24 12 0 : slabdata 1 1 0
buffer_head 1220 1239 64 59 1 : tunables 120 60 0 : slabdata 21 21 0
vm_area_struct 222 460 84 46 1 : tunables 120 60 0 : slabdata 10 10 0
mm_struct 30 30 384 10 1 : tunables 54 27 0 : slabdata 3 3 0
fs_cache 16 113 32 113 1 : tunables 120 60 0 : slabdata 1 1 0
files_cache 17 40 192 20 1 : tunables 120 60 0 : slabdata 2 2 0
signal_cache 23 32 480 8 1 : tunables 54 27 0 : slabdata 4 4 0
sighand_cache 23 23 3104 1 1 : tunables 24 12 0 : slabdata 23 23 0
task_struct 25 30 1152 3 1 : tunables 24 12 0 : slabdata 10 10 0
cred_jar 46 120 96 40 1 : tunables 120 60 0 : slabdata 3 3 0
anon_vma 124 339 8 339 1 : tunables 120 60 0 : slabdata 1 1 0
pid 34 59 64 59 1 : tunables 120 60 0 : slabdata 1 1 0
idr_layer_cache 82 104 148 26 1 : tunables 120 60 0 : slabdata 4 4 0
size-131072 0 0 131072 1 32 : tunables 8 4 0 : slabdata 0 0 0
size-65536 5 5 65536 1 16 : tunables 8 4 0 : slabdata 5 5 0
size-32768 3 3 32768 1 8 : tunables 8 4 0 : slabdata 3 3 0
size-16384 3 3 16384 1 4 : tunables 8 4 0 : slabdata 3 3 0
size-8192 0 0 8192 1 2 : tunables 8 4 0 : slabdata 0 0 0
size-4096 286 286 4096 1 1 : tunables 24 12 0 : slabdata 286 286 0
size-2048 15 16 2048 2 1 : tunables 24 12 0 : slabdata 8 8 0
size-1024 39 44 1024 4 1 : tunables 54 27 0 : slabdata 11 11 0
size-512 240 248 512 8 1 : tunables 54 27 0 : slabdata 31 31 0
size-256 55 60 256 15 1 : tunables 120 60 0 : slabdata 4 4 0
size-192 102 105 256 15 1 : tunables 120 60 0 : slabdata 7 7 0
size-128 221 240 128 30 1 : tunables 120 60 0 : slabdata 8 8 0
size-96 260 270 128 30 1 : tunables 120 60 0 : slabdata 9 9 0
size-64 1002 1020 128 30 1 : tunables 120 60 0 : slabdata 34 34 0
size-32 3288 3300 128 30 1 : tunables 120 60 0 : slabdata 110 110 0
kmem_cache 89 120 96 40 1 : tunables 120 60 0 : slabdata 3 3 0
root@xtinkerbell:~# free
              total         used         free       shared      buffers
  Mem:        13136        11764         1372            0         1220
 Swap:            0            0            0
Total:        13136        11764         1372
root@xtinkerbell:~#
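
A note on reading the baseline above: the 1384 kB in the "free" column understates what the kernel can hand back under pressure, because buffers, page cache and reclaimable slab can be dropped. A minimal Python sketch, meant to run on a workstation against pasted text, that totals those fields from the /proc/meminfo dump in this comment. The function name parse_meminfo and the choice of fields are illustrative assumptions, and the total is only an upper bound (tmpfs pages counted under Cached cannot be dropped).

```python
# Selected fields copied from the /proc/meminfo dump taken before the upload.
MEMINFO_BEFORE = """\
MemTotal: 13136 kB
MemFree: 1376 kB
Buffers: 1220 kB
Cached: 4268 kB
Slab: 3760 kB
SReclaimable: 692 kB
SUnreclaim: 3068 kB
"""


def parse_meminfo(text):
    """Return a dict mapping each field name to its value in kB."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields


info = parse_meminfo(MEMINFO_BEFORE)

# Rough upper bound on memory the kernel could free without killing anything.
easily_freed_kb = (
    info["MemFree"] + info["Buffers"] + info["Cached"] + info["SReclaimable"]
)

print("free + buffers + cached + reclaimable slab:", easily_freed_kb, "kB")
print("unreclaimable slab:", info["SUnreclaim"], "kB")
```

The number to watch during an upload is SUnreclaim, since that memory is not returned by dropping caches.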

comment:2 Changed 7 years ago by johnc60@…

Additional Information: Memory information during the upload of a 186MB file and eventual router crash.

root@xtinkerbell:~# free
              total         used         free       shared      buffers
  Mem:        13136        12256          880            0         1108
 Swap:            0            0            0
Total:        13136        12256          880
root@xtinkerbell:~# free
              total         used         free       shared      buffers
  Mem:        13136        12320          816            0          980
 Swap:            0            0            0
Total:        13136        12320          816
root@xtinkerbell:~# free
              total         used         free       shared      buffers
  Mem:        13136        12340          796            0          980
 Swap:            0            0            0
Total:        13136        12340          796
root@xtinkerbell:~# free
              total         used         free       shared      buffers
  Mem:        13136        12396          740            0          980
 Swap:            0            0            0
Total:        13136        12396          740
root@xtinkerbell:~# free
              total         used         free       shared      buffers
  Mem:        13136        12180          956            0          904
 Swap:            0            0            0
Total:        13136        12180          956
root@xtinkerbell:~# free
              total         used         free       shared      buffers
  Mem:        13136        12176          960            0          904
 Swap:            0            0            0
Total:        13136        12176          960
root@xtinkerbell:~# free
              total         used         free       shared      buffers
  Mem:        13136        12184          952            0          904
 Swap:            0            0            0
Total:        13136        12184          952
root@xtinkerbell:~# free
              total         used         free       shared      buffers
  Mem:        13136        12276          860            0          904
 Swap:            0            0            0
Total:        13136        12276          860
root@xtinkerbell:~# free
              total         used         free       shared      buffers
  Mem:        13136        12164          972            0          844
 Swap:            0            0            0
Total:        13136        12164          972
root@xtinkerbell:~# free
              total         used         free       shared      buffers
  Mem:        13136        12444          692            0          692
 Swap:            0            0            0
Total:        13136        12444          692
root@xtinkerbell:~# free
              total         used         free       shared      buffers
  Mem:        13136        12480          656            0          212
 Swap:            0            0            0
Total:        13136        12480          656
root@xtinkerbell:~# free
              total         used         free       shared      buffers
  Mem:        13136        12332          804            0          196
 Swap:            0            0            0
Total:        13136        12332          804
root@xtinkerbell:~# free
              total         used         free       shared      buffers
  Mem:        13136        12356          780            0          196
 Swap:            0            0            0
Total:        13136        12356          780
root@xtinkerbell:~# free
              total         used         free       shared      buffers
  Mem:        13136        12468          668            0          152
 Swap:            0            0            0
Total:        13136        12468          668
root@xtinkerbell:~# cat /proc/meminfo
MemTotal: 13136 kB
MemFree: 800 kB
Buffers: 156 kB
Cached: 940 kB
SwapCached: 0 kB
Active: 612 kB
Inactive: 1136 kB
Active(anon): 304 kB
Inactive(anon): 496 kB
Active(file): 308 kB
Inactive(file): 640 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 0 kB
Writeback: 0 kB
AnonPages: 688 kB
Mapped: 624 kB
Slab: 8728 kB
SReclaimable: 340 kB
SUnreclaim: 8388 kB
PageTables: 120 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 6568 kB
Committed_AS: 2120 kB
VmallocTotal: 1048404 kB
VmallocUsed: 616 kB
VmallocChunk: 1037460 kB
root@xtinkerbell:~# cat /proc/slabinfo
slabinfo - version: 2.1
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
nf_conntrack_expect 0 0 144 27 1 : tunables 120 60 0 : slabdata 0 0 0
nf_conntrack 133 289 224 17 1 : tunables 120 60 0 : slabdata 17 17 0
bridge_fdb_cache 7 59 64 59 1 : tunables 120 60 0 : slabdata 1 1 0
flow_cache 0 0 80 48 1 : tunables 120 60 0 : slabdata 0 0 0
jffs2_inode_cache 25 145 24 145 1 : tunables 120 60 0 : slabdata 1 1 0
jffs2_node_frag 81 435 24 145 1 : tunables 120 60 0 : slabdata 3 3 0
jffs2_refblock 39 48 248 16 1 : tunables 120 60 0 : slabdata 3 3 0
jffs2_tmp_dnode 0 0 32 113 1 : tunables 120 60 0 : slabdata 0 0 0
jffs2_raw_inode 0 0 68 56 1 : tunables 120 60 0 : slabdata 0 0 0
jffs2_raw_dirent 0 0 40 92 1 : tunables 120 60 0 : slabdata 0 0 0
jffs2_full_dnode 87 406 16 203 1 : tunables 120 60 0 : slabdata 2 2 0
jffs2_i 14 24 312 12 1 : tunables 54 27 0 : slabdata 2 2 0
squashfs_inode_cache 98 192 320 12 1 : tunables 54 27 0 : slabdata 16 16 0
configfs_dir_cache 0 0 52 72 1 : tunables 120 60 0 : slabdata 0 0 0
kioctx 0 0 160 24 1 : tunables 120 60 0 : slabdata 0 0 0
kiocb 0 0 160 24 1 : tunables 120 60 0 : slabdata 0 0 0
fasync_cache 0 0 16 203 1 : tunables 120 60 0 : slabdata 0 0 0
shmem_inode_cache 69 70 368 10 1 : tunables 54 27 0 : slabdata 7 7 0
nsproxy 0 0 24 145 1 : tunables 120 60 0 : slabdata 0 0 0
posix_timers_cache 0 0 112 35 1 : tunables 120 60 0 : slabdata 0 0 0
uid_cache 1 59 64 59 1 : tunables 120 60 0 : slabdata 1 1 0
UNIX 6 10 384 10 1 : tunables 54 27 0 : slabdata 1 1 0
ip_mrt_cache 0 0 96 40 1 : tunables 120 60 0 : slabdata 0 0 0
UDP-Lite 0 0 480 8 1 : tunables 54 27 0 : slabdata 0 0 0
tcp_bind_bucket 2 113 32 113 1 : tunables 120 60 0 : slabdata 1 1 0
inet_peer_cache 5 59 64 59 1 : tunables 120 60 0 : slabdata 1 1 0
secpath_cache 0 0 32 113 1 : tunables 120 60 0 : slabdata 0 0 0
xfrm_dst_cache 0 0 288 13 1 : tunables 54 27 0 : slabdata 0 0 0
ip_fib_alias 0 0 16 203 1 : tunables 120 60 0 : slabdata 0 0 0
ip_fib_hash 13 101 36 101 1 : tunables 120 60 0 : slabdata 1 1 0
ip_dst_cache 112 150 256 15 1 : tunables 120 60 0 : slabdata 10 10 0
arp_cache 2 30 128 30 1 : tunables 120 60 0 : slabdata 1 1 0
RAW 2 8 480 8 1 : tunables 54 27 0 : slabdata 1 1 0
UDP 8 8 480 8 1 : tunables 54 27 0 : slabdata 1 1 0
tw_sock_TCP 0 0 96 40 1 : tunables 120 60 0 : slabdata 0 0 0
request_sock_TCP 0 0 96 40 1 : tunables 120 60 0 : slabdata 0 0 0
TCP 2 3 1056 3 1 : tunables 24 12 0 : slabdata 1 1 0
eventpoll_pwq 0 0 36 101 1 : tunables 120 60 0 : slabdata 0 0 0
eventpoll_epi 0 0 96 40 1 : tunables 120 60 0 : slabdata 0 0 0
blkdev_queue 1 3 1256 3 1 : tunables 24 12 0 : slabdata 1 1 0
blkdev_requests 10 17 224 17 1 : tunables 120 60 0 : slabdata 1 1 0
blkdev_ioc 0 0 44 84 1 : tunables 120 60 0 : slabdata 0 0 0
bio-0 36 60 128 30 1 : tunables 120 60 0 : slabdata 2 2 0
biovec-256 2 2 3072 1 1 : tunables 24 12 0 : slabdata 2 2 0
biovec-128 0 0 1536 2 1 : tunables 24 12 0 : slabdata 0 0 0
biovec-64 0 0 768 5 1 : tunables 54 27 0 : slabdata 0 0 0
biovec-16 0 0 192 20 1 : tunables 120 60 0 : slabdata 0 0 0
sock_inode_cache 23 24 320 12 1 : tunables 54 27 0 : slabdata 2 2 0
skbuff_cb_store_cache 0 0 64 59 1 : tunables 120 60 0 : slabdata 0 0 0
skbuff_fclone_cache 0 0 384 10 1 : tunables 54 27 0 : slabdata 0 0 0
skbuff_head_cache 1460 1460 192 20 1 : tunables 120 60 0 : slabdata 73 73 0
file_lock_cache 0 0 96 40 1 : tunables 120 60 0 : slabdata 0 0 0
proc_inode_cache 78 78 296 13 1 : tunables 54 27 0 : slabdata 6 6 0
sigqueue 0 0 144 27 1 : tunables 120 60 0 : slabdata 0 0 0
radix_tree_node 38 65 288 13 1 : tunables 54 27 0 : slabdata 5 5 0
bdev_cache 3 9 416 9 1 : tunables 54 27 0 : slabdata 1 1 0
sysfs_dir_cache 1589 1596 44 84 1 : tunables 120 60 0 : slabdata 19 19 0
mnt_cache 19 30 128 30 1 : tunables 120 60 0 : slabdata 1 1 0
filp 104 210 128 30 1 : tunables 120 60 0 : slabdata 7 7 0
inode_cache 136 238 272 14 1 : tunables 54 27 0 : slabdata 17 17 0
dentry 331 840 128 30 1 : tunables 120 60 0 : slabdata 28 28 0
names_cache 1 1 4096 1 1 : tunables 24 12 0 : slabdata 1 1 0
buffer_head 157 472 64 59 1 : tunables 120 60 0 : slabdata 8 8 0
vm_area_struct 254 460 84 46 1 : tunables 120 60 0 : slabdata 10 10 0
mm_struct 14 30 384 10 1 : tunables 54 27 0 : slabdata 3 3 0
fs_cache 9 113 32 113 1 : tunables 120 60 0 : slabdata 1 1 0
files_cache 10 40 192 20 1 : tunables 120 60 0 : slabdata 2 2 0
signal_cache 27 32 480 8 1 : tunables 54 27 0 : slabdata 4 4 0
sighand_cache 23 23 3104 1 1 : tunables 24 12 0 : slabdata 23 23 0
task_struct 23 30 1152 3 1 : tunables 24 12 0 : slabdata 10 10 0
cred_jar 75 120 96 40 1 : tunables 120 60 0 : slabdata 3 3 0
anon_vma 89 339 8 339 1 : tunables 120 60 0 : slabdata 1 1 0
pid 27 59 64 59 1 : tunables 120 60 0 : slabdata 1 1 0
idr_layer_cache 82 104 148 26 1 : tunables 120 60 0 : slabdata 4 4 0
size-131072 0 0 131072 1 32 : tunables 8 4 0 : slabdata 0 0 0
size-65536 5 5 65536 1 16 : tunables 8 4 0 : slabdata 5 5 0
size-32768 3 3 32768 1 8 : tunables 8 4 0 : slabdata 3 3 0
size-16384 3 3 16384 1 4 : tunables 8 4 0 : slabdata 3 3 0
size-8192 0 0 8192 1 2 : tunables 8 4 0 : slabdata 0 0 0
size-4096 1481 1496 4096 1 1 : tunables 24 12 0 : slabdata 1481 1496 0
size-2048 15 16 2048 2 1 : tunables 24 12 0 : slabdata 8 8 0
size-1024 51 56 1024 4 1 : tunables 54 27 0 : slabdata 14 14 0
size-512 254 256 512 8 1 : tunables 54 27 0 : slabdata 32 32 0
size-256 55 60 256 15 1 : tunables 120 60 0 : slabdata 4 4 0
size-192 102 105 256 15 1 : tunables 120 60 0 : slabdata 7 7 0
size-128 222 240 128 30 1 : tunables 120 60 0 : slabdata 8 8 0
size-96 336 450 128 30 1 : tunables 120 60 0 : slabdata 15 15 0
size-64 1041 1050 128 30 1 : tunables 120 60 0 : slabdata 35 35 0
size-32 3778 3810 128 30 1 : tunables 120 60 0 : slabdata 127 127 0
kmem_cache 89 120 96 40 1 : tunables 120 60 0 : slabdata 3 3 0
root@xtinkerbell:~# ps

PID USER VSZ STAT COMMAND

1 root 1364 S init
2 root 0 SW< [kthreadd]
3 root 0 SW< [ksoftirqd/0]
4 root 0 SW< [events/0]
5 root 0 SW< [khelper]
8 root 0 SW< [async/mgr]

26 root 0 SW< [kblockd/0]
53 root 0 SW [pdflush]
54 root 0 SW [pdflush]
55 root 0 SW< [kswapd0]
56 root 0 SW< [aio/0]
57 root 0 SW< [crypto/0]

110 root 0 SW< [mtdblockd]
280 root 0 SWN [jffs2_gcd_mtd3]
295 root 1364 S /bin/ash --login
305 root 1416 S syslogd -C64
307 root 1352 S klogd
321 root 780 S /sbin/hotplug2 --override --persistent --set-worker /
871 root 1460 S hostapd -P /var/run/wifi-ath1.pid -B /var/run/hostapd

1131 root 1144 S /usr/sbin/dropbear -p 222 -b /etc/banner -P /var/run/
1157 nobody 924 S /usr/sbin/dnsmasq -K -D -N -q -R -y -Z -E -s chongfam
1165 root 1360 S watchdog -t 5 /dev/watchdog

1222 root 1356 R ps
root@xtinkerbell:~# free
              total         used         free       shared      buffers
  Mem:        13136        12348          788            0          192
 Swap:            0            0            0
Total:        13136        12348          788
root@xtinkerbell:~# free
              total         used         free       shared      buffers
  Mem:        13136        12360          776            0          192
 Swap:            0            0            0
Total:        13136        12360          776
root@xtinkerbell:~# free
              total         used         free       shared      buffers
  Mem:        13136        12384          752            0          192
 Swap:            0            0            0
Total:        13136        12384          752
root@xtinkerbell:~# free

<router hung>

watchdog expired, rebooting system
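
To see where the memory went, the two /proc/slabinfo dumps (comment 1 before the upload, this comment during it) can be compared cache by cache. The Python sketch below does that for a handful of rows copied from the dumps. It is only an illustration: the helper name slab_kb is made up, the 4 kB page size is an assumption inferred from the size-4096 and size-8192 rows, and only a subset of caches is included.

```python
# Rows copied verbatim from the "before" (comment 1) and "during" (comment 2) dumps.
BEFORE = """\
nf_conntrack 119 119 224 17 1 : tunables 120 60 0 : slabdata 7 7 0
ip_dst_cache 73 75 256 15 1 : tunables 120 60 0 : slabdata 5 5 0
skbuff_head_cache 200 200 192 20 1 : tunables 120 60 0 : slabdata 10 10 0
size-4096 286 286 4096 1 1 : tunables 24 12 0 : slabdata 286 286 0
size-32 3288 3300 128 30 1 : tunables 120 60 0 : slabdata 110 110 0
"""
DURING = """\
nf_conntrack 133 289 224 17 1 : tunables 120 60 0 : slabdata 17 17 0
ip_dst_cache 112 150 256 15 1 : tunables 120 60 0 : slabdata 10 10 0
skbuff_head_cache 1460 1460 192 20 1 : tunables 120 60 0 : slabdata 73 73 0
size-4096 1481 1496 4096 1 1 : tunables 24 12 0 : slabdata 1481 1496 0
size-32 3778 3810 128 30 1 : tunables 120 60 0 : slabdata 127 127 0
"""

PAGE_KB = 4  # assumption: 4 kB pages (size-8192 needs 2 pages per slab in the dumps)


def slab_kb(text):
    """Map cache name -> kB held by the cache (num_slabs * pagesperslab * page size)."""
    usage = {}
    for line in text.splitlines():
        if not line or line.startswith(("#", "slabinfo")):
            continue
        fields = line.split()
        pages_per_slab = int(fields[5])
        num_slabs = int(fields[-2])
        usage[fields[0]] = num_slabs * pages_per_slab * PAGE_KB
    return usage


before = slab_kb(BEFORE)
during = slab_kb(DURING)
growth = {name: during[name] - before.get(name, 0) for name in during}

# Slab line in /proc/meminfo: 3760 kB before, 8728 kB during the upload.
SLAB_GROWTH_KB = 8728 - 3760

for name, kb in sorted(growth.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:20s} {kb:+6d} kB")
print(f"{'Slab (meminfo)':20s} {SLAB_GROWTH_KB:+6d} kB")
```

On these rows, size-4096 grows by 4840 kB, which is nearly all of the 4968 kB increase in Slab, and skbuff_head_cache grows alongside it. That pattern would fit packet buffers piling up somewhere in the transmit path, though the dumps alone do not show which driver or queue is holding them.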

comment:3 Changed 7 years ago by jow

On what device does this happen? Did you install additional packages such as layer7 modules?

comment:4 Changed 7 years ago by johnc60@…

This is on an Airlink AR430W, an Atheros 2318 based router. It's similar to the D-Link DIR-300. 4MB Flash and 16MB RAM.

I installed the Atheros Backfire 10.03.1-RC5 package straight from the OpenWrt download site. No additional packages installed. Configuration is pretty basic. Switch ports 0-3 and WiFi are on the LAN. Switch port 4 is on the WAN and is connected to a DSL modem with a 6M/768K circuit.

I ran Kamikaze on this box for a long time (the build number was 15xxx). It would "randomly" crash when I uploaded anything substantial. I recently upgraded to Backfire hoping that the problem was fixed, but instead the problem was worse. On the serial console I can see that the free memory and buffers disappear rapidly when I upload.

I also built images from the latest Attitude Adjustment trunk with the same results. Download is no problem. Only uploads consume memory like crazy.

-John

comment:5 Changed 7 years ago by johnc60@…

Forgot to add that the network configuration is your basic residential router. Private IP addressing on br-lan, one public IP address on br-wan. iptables doing NAT with PAT (masquerade). WiFi is WPA2+AES.

Nothing fancy.

-John

comment:6 Changed 7 years ago by nbd

Are you using qos?

comment:7 Changed 7 years ago by johnc60@…

No. I am not using QoS. Here's the router config.

-John

root@xtinkerbell:/etc# cat banner
  _______                     ________        __
 |       |.-----.-----.-----.|  |  |  |.----.|  |_
 |   -   ||  _  |  -__|     ||  |  |  ||   _||   _|
 |_______||   __|_____|__|__||________||__|  |____|
          |__| W I R E L E S S   F R E E D O M
 Backfire (10.03.1-RC5, r26686) --------------------------
  * 1/3 shot Kahlua    In a shot glass, layer Kahlua
  * 1/3 shot Bailey's  on the bottom, then Bailey's,
  * 1/3 shot Vodka     then Vodka.
 ---------------------------------------------------




root@xtinkerbell:/etc/config# cat system

config system

option hostname xtinkerbell
option timezone "PST8PDT,M3.2.0,M11.1.0"
option log_size 64

config button

option button reset
option action released
option handler "logger reboot"
option min 0
option max 4

config button

option button reset
option action released
option handler "logger factory default"
option min 5
option max 30




root@xtinkerbell:/etc/config# cat network
config switch eth0

option reset 1
option enable_vlan 1

config switch_vlan eth0_1

option device eth0
option vlan 1
option ports '0 1 5t'

config switch_vlan eth0_2

option device eth0
option vlan 2
option ports '2 3 4 5t'

config interface loopback

option ifname lo
option proto static
option ipaddr 127.0.0.1
option netmask 255.0.0.0

config interface lan

option ifname eth0.1
option type bridge
option mtu 1500
option proto static
option ipaddr 192.168.8.231
option netmask 255.255.255.0

config interface wan

option ifname eth0.2
option type bridge
option mtu 1500
option proto static
option ipaddr 12.34.56.78
option netmask 255.255.255.0
option gateway 12.34.56.1
option dns "208.67.222.222 208.67.220.220"




root@xtinkerbell:/etc/config# cat wireless
config wifi-device wifi0

option type atheros
option channel 11
option agmode 11bg
option diversity 0
option rxantenna 2
option txantenna 2
option softled 1
option disabled 0

config wifi-iface

option device wifi0
option mode ap
option ssid 'xTinkerbell01'
option encryption wep+auto
option key 00000000000000000000000000
option network lan
option ar 1
option bgscan 0
option burst 1
option comp 0
option ff 1
option turbo 0
option xr 1

config wifi-iface

option device wifi0
option mode ap
option ssid 'xTinkerbell02'
option encryption psk2
option key 00000000000000000000000000000000
option network lan
option ar 1
option bgscan 0
option burst 1
option comp 0
option ff 1
option turbo 0
option xr 1




root@xtinkerbell:/etc/config# cat firewall
config defaults

option syn_flood 1
option input ACCEPT
option output ACCEPT
option forward REJECT

config zone

option name lan
option input ACCEPT
option output ACCEPT
option forward REJECT

config zone

option name wan
option input REJECT
option output ACCEPT
option forward REJECT
option masq 1

config forwarding

option src lan
option dest wan
option mtu_fix 1

config rule

option src wan
option proto icmp
option target ACCEPT

config rule

option src wan
option proto tcp
option dest_ip 12.34.56.78
option dest_port 222
option target ACCEPT


config rule

option src wan
option proto tcp
option dest_ip 12.34.56.78
option dest_port 1723
option target ACCEPT


config rule

option src wan
option proto gre
option dest_ip 12.34.56.78
option target ACCEPT


config redirect

option src wan
option src_dport 223
option dest lan
option dest_ip 192.168.8.239
option dest_port 22
option proto tcp

config redirect

option src wan
option src_dport 3382
option dest lan
option dest_ip 192.168.8.102
option dest_port 3389
option proto tcp

config redirect

option src wan
option src_dport 3381
option dest lan
option dest_ip 192.168.8.11
option dest_port 3389
option proto tcp

config redirect

option src wan
option src_dport 3388
option dest lan
option dest_ip 192.168.8.183
option dest_port 3389
option proto tcp

config include

option path /etc/firewall.user




root@xtinkerbell:/etc# cat firewall.user
iptables -t mangle -A POSTROUTING -p tcp --tcp-flags SYN,RST SYN -o tun0 -j TCPMSS --set-mss 1260

root@xtinkerbell:/etc/config# cat dhcp
config dnsmasq

option logqueries 1
option domainneeded 1
option boguspriv 0
option filterwin2k 0
option localise_queries 1
option domain 'yyy.local'
option local '/yyy.local/ -S /zzz.com/10.16.65.110@tun0 -S /zzz.com/208.67.222.222 -S 208.67.222.222 -S 208.67.220.220'
option expandhosts 1
option nonegcache 1
option authoritative 1
option readethers 1
option noresolv 1
option resolvfile '/tmp/resolv.conf.auto'
option leasefile '/tmp/dhcp.leases'
option rebind_protection 1
option rebind_localhost 0
list rebind_domain '/zzz.com/'

config dhcp lan

option interface lan
option ignore 0
option start 180
option limit 20
option leasetime 24h
option dhcp_option 'lan,3,192.168.8.231'

config dhcp wan

option interface wan
option ignore 1

root@xtinkerbell:/etc# ifconfig
ath0 Link encap:Ethernet HWaddr 00:1D:6A:DC:F8:07

UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:245 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 B) TX bytes:36655 (35.7 KiB)

ath1 Link encap:Ethernet HWaddr 0A:1D:6A:DC:F8:07

UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:482 errors:0 dropped:482 overruns:0 frame:0
TX packets:466 errors:0 dropped:8 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:60535 (59.1 KiB) TX bytes:170414 (166.4 KiB)

br-lan Link encap:Ethernet HWaddr 00:1D:6A:DC:F8:07

inet addr:192.168.8.231 Bcast:192.168.8.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:482 errors:0 dropped:0 overruns:0 frame:0
TX packets:303 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:60535 (59.1 KiB) TX bytes:137675 (134.4 KiB)

br-wan Link encap:Ethernet HWaddr 00:1D:6A:DC:F8:08

inet addr:12.34.56.78 Bcast:12.34.56.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:1868 errors:0 dropped:0 overruns:0 frame:0
TX packets:799 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:225978 (220.6 KiB) TX bytes:101841 (99.4 KiB)

eth0 Link encap:Ethernet HWaddr 00:1D:6A:DC:F8:08

UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:1872 errors:0 dropped:0 overruns:0 frame:0
TX packets:1045 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:260872 (254.7 KiB) TX bytes:143715 (140.3 KiB)
Interrupt:4 Base address:0x1000

eth0.1 Link encap:Ethernet HWaddr 00:1D:6A:DC:F8:08

UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:245 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:0 (0.0 B) TX bytes:37635 (36.7 KiB)

eth0.2 Link encap:Ethernet HWaddr 00:1D:6A:DC:F8:08

UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:1869 errors:0 dropped:0 overruns:0 frame:0
TX packets:799 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:233543 (228.0 KiB) TX bytes:105037 (102.5 KiB)

lo Link encap:Local Loopback

inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:2 errors:0 dropped:0 overruns:0 frame:0
TX packets:2 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:176 (176.0 B) TX bytes:176 (176.0 B)

wifi0 Link encap:UNSPEC HWaddr 00-1D-6A-DC-F8-07-00-00-00-00-00-00-00-00-00-00

UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:455190 errors:0 dropped:0 overruns:0 frame:175191
TX packets:12673 errors:3505 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:195
RX bytes:81823593 (78.0 MiB) TX bytes:2032726 (1.9 MiB)
Interrupt:3 Memory:b0000000-b000ffff

comment:8 Changed 7 years ago by JohnC60 <johnc60@…>

This memory issue may be related to some sort of backpressure caused by the speed difference between the LAN and WAN. In this original setup:

FTPsvr1----DSL(6M/768K)----DSLModem----xTinkerbell----TestPC

File uploads from TestPC to FTPsvr1 will always exhaust the free memory of router xTinkerbell in about ten minutes. As a test, I changed the physical setup and added a switch and a local FTP server as follows. The original DSL path to the Internet is still there, through the new switch.

Switch1P1----xTinkerbell----TestPC
Switch1P2----DSLModem----DSL(6M/768K)----FTPsvr1
Switch1P3----Router2----FTPsvr2

In this setup, file uploads from TestPC to FTPsvr1 still consistently cause the router xTinkerbell to run out of memory. However, the same file transfer from TestPC to FTPsvr2, which has a full 100M path, does not affect xTinkerbell's memory consumption at all. This may also be the reason why downloads from FTPsvr1 to TestPC are not a problem, as that traffic goes from the slower WAN to the LAN.

All connections are 100Mb/s full duplex Ethernet, except for the DSL which is 6M/768K. Since xTinkerbell sees a 100M connection in either case, I don't see how the slower DSL connection in the middle can affect the router when uploading. But clearly, the testing shows that it does.

Any thoughts or additional data to collect?

-John

comment:9 Changed 7 years ago by JohnC60 <johnc60@…>

Reformatting the topology...


Switch1P1----xTinkerbell----TestPC
Switch1P2----DSLModem----DSL(6M/768K)----FTPsvr1
Switch1P3----Router2----FTPsvr2

comment:10 Changed 7 years ago by vaden@…

What does your ftp client on TestPC say about the data rate being achieved in the case which exhausts memory on the router xTinkerBell?

comment:11 Changed 7 years ago by JohnC60 <johnc60@…>

The FTP client shows it achieving the full upstream rate at about 70-75KB/s. Very consistent transfer, that is until the router crashes. :)

-John
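
For scale, the figures quoted so far can be put side by side: the 768 Kb/s line rate, the 70-75 KB/s the FTP client reports, and the 186MB test file. The arithmetic below is a rough Python sketch that assumes decimal units (1 KB = 1000 bytes, 1 MB = 1000 KB). It says nothing about where the difference between line rate and achieved rate goes; link and protocol overhead is a plausible guess but is not stated in the ticket.

```python
LINE_RATE_KBIT = 768        # DSL upstream rate from the ticket, kilobits per second
OBSERVED_KBYTE = (70, 75)   # rate shown by the FTP client, kilobytes per second
FILE_MB = 186               # size of the test file used in comments 1 and 2

# Line rate expressed in kilobytes per second, before any overhead.
line_rate_kbyte = LINE_RATE_KBIT / 8

# Fraction of the raw line rate that the FTP payload achieves.
efficiency = [rate / line_rate_kbyte for rate in OBSERVED_KBYTE]

# Minutes a complete upload would take at each observed rate.
minutes_needed = [FILE_MB * 1000 / rate / 60 for rate in OBSERVED_KBYTE]

print(f"line rate: {line_rate_kbyte:.0f} KB/s")
for rate, eff, mins in zip(OBSERVED_KBYTE, efficiency, minutes_needed):
    print(f"{rate} KB/s is {eff:.0%} of line rate; full upload takes {mins:.0f} min")
```

A full upload needs roughly 41 to 44 minutes, so a router that runs out of memory in about ten minutes never gets more than a quarter of the way through the file.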

comment:12 Changed 7 years ago by vaden@…

Can you cause the problem with a different ftp client and/or a different client OS?

Can you cause the problem with an http upload?

comment:13 Changed 7 years ago by JohnC60 <johnc60@…>

Yes. I originally ran into this issue with HTTP: uploading videos to Facebook, for example, over HTTP/HTTPS. The video would upload part way and get stuck. That's when I found that the router had crashed and rebooted. I also had the same problem with BitTorrent when uploads got busy. Never a problem with downloads of any kind.

To simplify my troubleshooting, I am using FTP for testing. I can duplicate the problem with several FTP clients on both Windows and Linux.

comment:14 Changed 7 years ago by vaden@…

Do you have a 2nd Airlink AR430W on which you can duplicate the problem?

comment:15 Changed 7 years ago by JohnC60 <johnc60@…>

Yes. I can duplicate this issue on two other AR430W routers.

comment:16 Changed 7 years ago by vaden@…

If I haven't worn out my welcome, what happens if you run

'iperf -c FTPsvr1 -u -i 1 -t 1000 -b 1M' on TestPC?

Does it crash the Airlink AR430W?

If no, same question for -b 25M and for -b 75M.

Are the data rates reported by iperf sane?

For any -b which crashes the AR430W, does it crash the AR430W at the same point each time iperf is run?

Does the AR430W crash in an all FastE environment (read: kill the wireless)?

comment:17 Changed 7 years ago by JohnC60 <johnc60@…>

Here are the iperf results from TestPC to FTPsvr1.

Below "-b 650K", there is no problem, although I did not run any of those tests for more than 30 minutes.

At 670K, we start seeing a small memory loss after about 10 minutes of steady traffic. The loss is in the single digits every few minutes. I didn't have the patience to wait and see when it would finally run out.

At 680K, the loss begins consistently after about 60-70 seconds of transfer, and then loses memory at a moderate rate (double digits) until memory is gone in about 8 minutes.

At 700K, the loss begins after about 30 seconds and like the 680K scenario, loses memory at a faster rate and is out of memory in about 5 minutes.

At "-b 1M", the memory loss begins immediately and the memory is gone in about 20-30 seconds.

In all cases, once memory is depleted, the "oom-killer" procedure kicks in and starts terminating a lot of processes eventually hanging the router. The watchdog timer expires and the router reboots.

From TestPC to FTPsvr2, which has a FE connection along the entire path, iperf is good up to about 10M with no memory or performance loss. Beyond 10M, there is still no memory loss but there are performance issues because the AR430W running OpenWrt is only capable of about 11Mb/s of routing throughput because of 100% SoftIRQ. Maybe we can attack this after solving the memory issue. :) The stock firmware has much better performance than OpenWrt (either 20M or 40M thorughput, I forget which).

Back to the memory loss, I still have a problem understanding how the router (xTinkerbell) has the ability to know about or react to the 768K limit of the DSL. As far as it is concerned, it is FE connected to the switch, and both the DSLModem and Router2 are also FE connected. So how is xTinkerbell behaving differently between sending traffic to the DSLModem versus Router2? Especially with UDP traffic?

Wireless makes no difference. I get the same results regardless of whether TestPC is FE connected or WiFi connected to xTinkerbell.

I had also disabled the firewall and dnsmasq (removed all *ipt* and dnsmasq packages) so that xTinkerbell is reduced to simply routing between the LAN and WAN without NAT and stateful connection tracking.

comment:18 follow-up: Changed 7 years ago by JohnC60 <johnc60@…>

Any other ideas on what to look for?

comment:19 Changed 7 years ago by vaden@…

The spec sheet at <http://www.airlink101.com/products/ar430w.php> doesn't offer much information.

So, please post the output of 'dmesg' immediately after booting the router.

Perhaps the chip set used or the market share of the router doesn't warrant the interest of developers or it is so constrained with regard to memory that, again, it is not interesting.

comment:20 follow-up: Changed 7 years ago by JohnC60 <johnc60@…>

The Airlink AR430W is similar to the D-Link DIR-300 and is probably similar to dozens of other 4-port wireless routers. It is your basic Atheros reference design.

The major hardware components are:

Atheros AR2318 WiSoC
IC Plus IP175C 5-port switch
4 MB flash memory
16 MB RAM

There's plenty of free memory. This router is extremely stable on OpenWrt, running for months without rebooting. It only gets into trouble when I upload a large amount of data at once from LAN to WAN, which seems to trigger a memory leak and crashes the router. If this upload issue can be identified and fixed, this would be a problem-free router.

Here is some info from the router (dmesg, free, df, ps):

root@xtinkerbell:/# dmesg
Linux version 2.6.30.10 (openwrt@OpenWRTBuild) (gcc version 4.3.3 (GCC) ) #2 Fri Apr 15 14:10:17 MST 2011
console [early0] enabled
CPU revision is: 00019064 (MIPS 4KEc)
Determined physical RAM map:
 memory: 01000000 @ 00000000 (usable)
Initrd not found or empty - disabling initrd
Zone PFN ranges:
  Normal   0x00000000 -> 0x00001000
Movable zone start PFN for each node
early_node_map[1] active PFN ranges
    0: 0x00000000 -> 0x00001000
On node 0 totalpages: 4096
free_area_init_node: node 0, pgdat 802c9190, node_mem_map 80326000
  Normal zone: 32 pages used for memmap
  Normal zone: 0 pages reserved
  Normal zone: 4064 pages, LIFO batch:0
Built 1 zonelists in Zone order, mobility grouping off. Total pages: 4064
Kernel command line: console=ttyS0,9600 rootfstype=squashfs,jffs2
Primary instruction cache 16kB, VIPT, 4-way, linesize 16 bytes.
Primary data cache 16kB, 4-way, VIPT, no aliases, linesize 16 bytes
NR_IRQS:128
PID hash table entries: 64 (order: 6, 256 bytes)
console handover: boot [early0] -> real [ttyS0]
Dentry cache hash table entries: 2048 (order: 1, 8192 bytes)
Inode-cache hash table entries: 1024 (order: 0, 4096 bytes)
Memory: 13000k/16384k available (2180k kernel code, 3384k reserved, 416k data, 136k init, 0k highmem)
Calibrating delay loop... 183.50 BogoMIPS (lpj=917504)
Mount-cache hash table entries: 512
net_namespace: 1008 bytes
NET: Registered protocol family 16
bio: create slab <bio-0> at 0
Switched to high resolution mode on CPU 0
NET: Registered protocol family 2
IP route cache hash table entries: 1024 (order: 0, 4096 bytes)
TCP established hash table entries: 512 (order: 0, 4096 bytes)
TCP bind hash table entries: 512 (order: -1, 2048 bytes)
TCP: Hash tables configured (established 512 bind 512)
TCP reno registered
NET: Registered protocol family 1
Radio config found at offset 0xf8(0x1f8)
squashfs: version 4.0 (2009/01/31) Phillip Lougher
Registering mini_fo version $Id$
JFFS2 version 2.2. (NAND) (SUMMARY) © 2001-2006 Red Hat, Inc.
msgmni has been set to 25
alg: No test for stdrng (krng)
io scheduler noop registered
io scheduler deadline registered (default)
gpiodev: gpio device registered with major 254
gpiodev: gpio platform device registered with access mask FFFFFFFF
Serial: 8250/16550 driver, 1 ports, IRQ sharing disabled
serial8250: ttyS0 at MMIO 0xb1100003 (irq = 37) is a 16550A
eth0: Atheros AR231x: 00:1d:6a:dc:f8:08, irq 4
IP17xx: Found IP175C at 0:00
ar231x_eth_mii: probed
eth0: attached PHY driver [IC+ IP17xx] (mii_bus:phy_addr=0:00)
cmdlinepart partition parsing not available
Searching for RedBoot partition table in spiflash at offset 0x3d0000
Searching for RedBoot partition table in spiflash at offset 0x3e0000
6 RedBoot partitions found on MTD device spiflash
Creating 6 MTD partitions on "spiflash":
0x000000000000-0x000000030000 : "RedBoot"
0x000000030000-0x000000110000 : "vmlinux.bin.l7"
0x000000110000-0x0000003e0000 : "rootfs"
mtd: partition "rootfs" set to be root filesystem
mtd: partition "rootfs_data" created automatically, ofs=260000, len=180000
0x000000260000-0x0000003e0000 : "rootfs_data"
0x0000003e0000-0x0000003ef000 : "FIS directory"
0x0000003ef000-0x0000003f0000 : "RedBoot config"
0x0000003f0000-0x000000400000 : "boardconfig"
TCP westwood registered
NET: Registered protocol family 17
802.1Q VLAN Support v1.8 Ben Greear <greearb@…>
All bugs added by David S. Miller <davem@…>
VFS: Mounted root (squashfs filesystem) readonly on device 31:2.
Freeing unused kernel memory: 136k freed
Please be patient, while OpenWrt loads ...
mini_fo: using base directory: /
mini_fo: using storage directory: /overlay
ath_hal: module license 'Proprietary' taints kernel.
Disabling lock debugging due to kernel taint
ath_hal: 2009-05-08 (AR5212, AR5312, RF5111, RF5112, RF2316, RF2317, REGOPS_FUNC, TX_DESC_SWAP, XR)
device eth0.1 entered promiscuous mode
device eth0 entered promiscuous mode
br-lan: port 1(eth0.1) entering forwarding state
device eth0.2 entered promiscuous mode
br-wan: port 1(eth0.2) entering forwarding state
ath_ahb: trunk
wlan: trunk
wlan: mac acl policy registered
ath_rate_minstrel: Minstrel automatic rate control algorithm 1.2 (trunk)
ath_rate_minstrel: look around rate set to 10%
ath_rate_minstrel: EWMA rolloff level set to 75%
ath_rate_minstrel: max segment size in the mrr set to 6000 us
Atheros HAL provided by OpenWrt, DD-WRT and MakSat Technologies
wifi0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
wifi0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
wifi0: turboG rates: 6Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
wifi0: H/W encryption support: WEP AES AES_CCM TKIP
ath_ahb: wifi0: Atheros 2317 WiSoC REV2: mem=0xb0000000, irq=3
IRQ 3/wifi0: IRQF_DISABLED is not guaranteed on shared IRQs
device ath0 entered promiscuous mode
br-lan: port 2(ath0) entering forwarding state
device ath1 entered promiscuous mode
br-lan: port 3(ath1) entering forwarding state
br-lan: port 3(ath1) entering disabled state
br-lan: port 3(ath1) entering forwarding state

root@xtinkerbell:/# free
              total         used         free       shared      buffers
  Mem:        13136        10312         2824            0         1132
 Swap:            0            0            0
Total:        13136        10312         2824

root@xtinkerbell:/# df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/root 1408 1408 0 100% /rom
tmpfs 6568 36 6532 1% /tmp
tmpfs 512 0 512 0% /dev
/dev/mtdblock3 1536 296 1240 19% /overlay
mini_fo:/overlay 1408 1408 0 100% /

root@xtinkerbell:/# ps
  PID USER       VSZ STAT COMMAND
    1 root      1364 S    init
    2 root         0 SW<  [kthreadd]
    3 root         0 SW<  [ksoftirqd/0]
    4 root         0 SW<  [events/0]
    5 root         0 SW<  [khelper]
    8 root         0 SW<  [async/mgr]
   26 root         0 SW<  [kblockd/0]
   53 root         0 SW   [pdflush]
   54 root         0 SW   [pdflush]
   55 root         0 SW<  [kswapd0]
   56 root         0 SW<  [aio/0]
   57 root         0 SW<  [crypto/0]
  110 root         0 SW<  [mtdblockd]
  280 root         0 SWN  [jffs2_gcd_mtd3]
  297 root      1360 S    /bin/ash --login
  305 root      1416 S    syslogd -C64
  307 root      1352 S    klogd
  321 root       776 S    /sbin/hotplug2 --override --persistent --set-worker /
  796 root      1460 S    hostapd -P /var/run/wifi-ath1.pid -B /var/run/hostapd
  811 root      1144 S    /usr/sbin/dropbear -p 222 -b /etc/banner -P /var/run/
  837 nobody     912 S    /usr/sbin/dnsmasq -K -D -N -q -R -y -Z -E -s chongfam
  845 root      1360 S    watchdog -t 5 /dev/watchdog
  872 root      1356 R    ps

root@xtinkerbell:/# opkg list_installed
base-files - 43.15-r26686
busybox - 1.15.3-2
dnsmasq - 2.55-6
dropbear - 0.52-4
hotplug2 - 1.0-beta-2
kernel - 2.6.30.10-1
kmod-madwifi - 2.6.30.10+r3314-4
libc - 0.9.30.1-43.15
libgcc - 4.3.3+cs-43.15
libnl-tiny - 0.1-1
libuci - 12012009.6-3
mtd - 13
opkg - 576-1
swconfig - 7
uci - 12012009.6-3
udevtrigger - 106-1
wireless-tools - 29-4
wpad-mini - 20110402-1

root@xtinkerbell:/# uname -a
Linux xtinkerbell 2.6.30.10 #2 Fri Apr 15 14:10:17 MST 2011 mips GNU/Linux

comment:22 in reply to: ↑ 20 Changed 7 years ago by vaden@…

Replying to JohnC60 <johnc60@…>:

The Airlink AR430W is similar to the D-Link DIR-300 and is probably similar to dozens of other 4-port wireless routers. It is your basic Atheros reference design.

The major hardware components are:

Atheros AR2318 WiSoC
IC Plus IP175C 5-port switch
4 MB flash memory
16 MB RAM

There's plenty of free memory. This router is extremely stable on OpenWrt, running for months without rebooting. It only gets into trouble when I upload a large amount of data at once from LAN to WAN, which seems to trigger a memory leak and crashes the router. If this upload issue can be identified and fixed, this would be a problem-free router.

Can you make the problem occur on the DIR-300 or another router which uses the same set of chips as the AirLink AR430W?

comment:23 Changed 7 years ago by JohnC60 <johnc60@…>

I have several AR430W's, which all have this problem. I don't have any DIR-300 routers.

-John

comment:24 in reply to: ↑ description Changed 7 years ago by anonymous

Replying to johnc60@…:

Test scenario: Load router with Backfire RC5.

What is the URL of the Backfire RC5 which you loaded onto the AirLink AR430W?

comment:25 Changed 7 years ago by JohnC60 <johnc60@…>

I get the RC5 OpenWrt here:

http://downloads.openwrt.org/snapshots/backfire/10.03.1-RC5-testing/atheros/openwrt-atheros-vmlinux.lzma
http://downloads.openwrt.org/snapshots/backfire/10.03.1-RC5-testing/atheros/openwrt-atheros-root.squashfs

In the process of debugging, I had removed all of the unnecessary packages, such as firewall and *ipt*, eliminating the firewall and NAT as an issue.

-John

comment:26 in reply to: ↑ 18 Changed 7 years ago by vaden@…

Replying to JohnC60 <johnc60@…>:

Any other ideas on what to look for?

Others may not agree, but I think iperf logs _from both ends_ would be interesting for a case for which it fails and a smaller case for which it does not fail.

Please use UDP mode for iperf for this particular test and then lather, rinse and repeat for TCP/IP mode.

comment:27 Changed 7 years ago by vaden@…

These thoughts occur:

1) is the signalling occurring which eventually slows the client to a sustainable rate?

or

2) is there a memory deallocation problem?

or

3) yet another condition ...

comment:28 follow-up: Changed 7 years ago by JohnC60 <johnc60@…>

What signaling are you referring to? If you are talking about TCP then yes. The client's output is limited to about 70-75 KB/s which is consistent with a 768Kb/s upstream DSL. At this rate, however, the router starts to lose free memory and, over a period of time, will crash once the memory is exhausted.
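
As a quick sanity check on those units, the line rate can be converted from bits to bytes and compared with the observed client rate. This is only an arithmetic sketch; the function name is made up, and it says nothing about where the remaining gap goes (protocol and link overhead of unknown composition).

```python
def kbit_to_kbyte_per_s(kbit_per_s):
    """Convert kilobits per second to kilobytes per second (8 bits per byte)."""
    return kbit_per_s / 8


line_rate = kbit_to_kbyte_per_s(768)  # raw ceiling of a 768Kb/s uplink, in KB/s

for observed in (70, 75):
    share = observed / line_rate
    print(f"{observed} KB/s is {share:.0%} of the {line_rate:.0f} KB/s raw line rate")
```

So a client settling at 70-75 KB/s is running at roughly three quarters of the raw 96 KB/s, which is plausible for a link that TCP has fully saturated.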

There is definitely a memory deallocation or memory leak problem since the free memory does not increase after the upload has been stopped. If it were a normal memory consumption issue, then stopping the transfer should cause the free memory to return to more or less the starting value. It is also unreasonable to see a continuous memory drain to zero.
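
The "does it come back afterwards" test can be made explicit by sampling /proc/meminfo before, during and after a transfer. The sketch below is meant to be run on a PC against captured output rather than on the router itself. The function names and the 0.5 cut-off are illustrative assumptions, not anything OpenWrt provides.

```python
import re


def mem_free_kb(meminfo_text):
    """Extract the MemFree value in kB from the text of /proc/meminfo."""
    match = re.search(r"^MemFree:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemFree line not found")
    return int(match.group(1))


def recovered_fraction(before_kb, lowest_kb, after_kb):
    """Fraction of the memory lost during the transfer that returned afterwards."""
    lost = before_kb - lowest_kb
    if lost <= 0:
        return 1.0  # nothing was lost, so there is nothing to recover
    return (after_kb - lowest_kb) / lost


def looks_like_leak(before_kb, lowest_kb, after_kb, threshold=0.5):
    """True if less than `threshold` of the lost memory came back (arbitrary cut-off)."""
    return recovered_fraction(before_kb, lowest_kb, after_kb) < threshold


sample = "MemTotal:        13136 kB\nMemFree:          1384 kB\n"
print(mem_free_kb(sample))
```

Normal buffering would show most of the lost memory returning within a short time of the upload stopping; the behaviour described here would show almost none of it returning.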

As for iperf, I have already tested it. Look at the results posted about eight days ago. In summary, the router starts to lose memory once the iperf stream is above 650Kb/s. Beyond this, the higher the iperf data rate, the faster the router loses memory. At 1Mb/s, the router's memory is exhausted in about 20 seconds.

-John

comment:29 in reply to: ↑ 28 Changed 7 years ago by anonymous

Replying to JohnC60 <johnc60@…>:

What signaling are you referring to? If you are talking about TCP then yes. The client's output is limited to about 70-75 KB/s which is consistent with a 768Kb/s upstream DSL. At this rate, however, the router starts to lose free memory and, over a period of time, will crash once the memory is exhausted.

Are you saying the ftp|http client is successfully signalled that the maximum xfer rate of the upstream is ~ 768 Kbps (and therefore slows to <= 768 Kbps) and that even then the oom occurs?

comment:30 follow-up: Changed 7 years ago by JohnC60 <johnc60@…>

Yes, that is correct. If I use FTP or HTTP to upload data through the WAN, the upstream transfer is held by TCP to a consistent rate of about 768Kb/s. There is negligible packet loss and in all respects, the upload is going on just fine. However, on the router, you can see constant free memory/buffer loss. Within about 10-15 minutes, the free memory is down to near zero and OOM takes over.

-John

comment:31 Changed 7 years ago by JohnC60 <johnc60@…>

As mentioned before, this problem occurs in Kamikaze, Backfire, and the Attitude trunk so it's not a new problem. However, the free memory loss was not as quick in Kamikaze.

comment:32 in reply to: ↑ 30 Changed 7 years ago by anonymous

Replying to JohnC60 <johnc60@…>:

Yes, that is correct. If I use FTP or HTTP to upload data through the WAN, the upstream transfer is held by TCP to a consistent rate of about 768Kb/s. There is negligible packet loss and in all respects, the upload is going on just fine. However, on the router, you can see constant free memory/buffer loss. Within about 10-15 minutes, the free memory is down to near zero and OOM takes over.

-John

Based on the 'ps' you posted earlier, it is not that different than one from OpenWrt running on a Rocket M900, but there's no indication that I see of who owns the ever growing chunk of memory.

Since you haven't really attracted a developer's interest in the problem, you might try openwrt-users@… or openwrt-devel@… (or a cross post :) to ask how to determine what process owns the ever growing chunk of memory or just ask for what you want, namely a solution.

I say that because I don't have a feeling for the average age of open tickets and am too lazy to calculate said.

comment:33 Changed 7 years ago by JohnC60 <johnc60@…>

OK, not sure if this will help or hurt my chances of getting OpenWrt fixed... :)

I tested with DD-WRT and it has the same problem with losing free memory and crashing when the router is hit with a constant stream of upload traffic.

-John

comment:34 Changed 7 years ago by JohnC60 <johnc60@…>

I reloaded the original Airlink firmware (which is Linux based) and it has no problems with memory loss. The amount of free memory went down a little as the transfer was in progress but did not continuously deplete and crash like OpenWrt and DD-WRT.

I was able to finish the upload of my 180MB file. I can also hit the router with any traffic rate using iperf and the amount of free memory remains relatively constant.

The only big difference that I can see is that the Airlink firmware has a v2.4 kernel and OpenWrt/DD-WRT has v2.6 kernels.

-John

comment:35 Changed 6 years ago by acoul

I can confirm this issue on latest 3.2 & 3.3 kernel ports for this platform.

@JohnC60: have you been able to locate source code for the original firmware?

comment:36 Changed 6 years ago by johnc60@…

Thanks for confirming this problem. Unfortunately, I don't have the source code for the original firmware.

-John

comment:37 Changed 6 years ago by acoul

I was able to overcome the memory depletion, which resulted in a system crash under high network traffic, by freeing some memory on the router (a Ubiquiti WISP station based on Atheros AR231x). I was running quagga/bgp on this router and the available free memory was near 1.5 MB. After replacing quagga/bgp with bird/bgp, the free memory went up to 3.8 MB, and since then the router has had no issues under high network traffic.

comment:38 Changed 6 years ago by acoul

I still have issues on this router under high UDP traffic. I am now testing the following setup:

/bin/echo "96 128 192" > /proc/sys/net/ipv4/tcp_mem
/bin/echo "1" > /proc/sys/net/ipv4/tcp_rfc1337
/bin/echo "1" > /proc/sys/net/ipv4/tcp_workaround_signed_windows
/bin/echo "1" > /proc/sys/net/ipv4/tcp_low_latency
/bin/echo "2" > /proc/sys/net/ipv4/tcp_frto_response
/bin/echo "1" > /proc/sys/net/ipv4/tcp_abort_on_overflow
/bin/echo "1" > /proc/sys/net/ipv4/tcp_no_metrics_save
/bin/echo "128" > /proc/sys/net/ipv4/tcp_max_orphans
/bin/echo "128" > /proc/sys/net/ipv4/tcp_max_tw_buckets

/proc TCP values
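
For reference, the same settings can be kept in one table and previewed before anything is written. This is a sketch under stated assumptions: the values are copied from the list above, the helper names are made up, and the 4 KB page size is inferred from the dmesg output in this ticket (4096 pages for 16 MB of RAM). tcp_mem is counted in pages, so the three thresholds work out to 384, 512 and 768 kB.

```python
from pathlib import Path

PAGE_KB = 4  # assumption: 4 KB pages, inferred from the dmesg output in this ticket

# Values copied from the comment above.
SETTINGS = {
    "tcp_mem": "96 128 192",
    "tcp_rfc1337": "1",
    "tcp_workaround_signed_windows": "1",
    "tcp_low_latency": "1",
    "tcp_frto_response": "2",
    "tcp_abort_on_overflow": "1",
    "tcp_no_metrics_save": "1",
    "tcp_max_orphans": "128",
    "tcp_max_tw_buckets": "128",
}


def sysctl_path(name, root="/proc/sys/net/ipv4"):
    """Path of one IPv4 sysctl file under /proc."""
    return Path(root) / name


def apply_settings(settings, root="/proc/sys/net/ipv4", dry_run=True):
    """Print what would be written, or write it when dry_run is False (needs root)."""
    for name, value in settings.items():
        path = sysctl_path(name, root)
        if dry_run:
            print(f"{path} <- {value}")
        else:
            path.write_text(value + "\n")


def tcp_mem_kb(value, page_kb=PAGE_KB):
    """Convert the three tcp_mem thresholds from pages to kB."""
    return [int(pages) * page_kb for pages in value.split()]


apply_settings(SETTINGS)  # dry run: only prints
print(tcp_mem_kb(SETTINGS["tcp_mem"]))
```

Whether these values help with the problem in this ticket is untested here; the sketch only makes the setup easier to reproduce and to revert.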

comment:39 Changed 5 years ago by florian

  • Resolution set to wontfix
  • Status changed from new to closed

comment:40 Changed 4 years ago by jow

  • Milestone changed from Backfire 10.03.2 to Chaos Calmer (trunk)

Milestone Backfire 10.03.2 deleted

comment:41 Changed 3 years ago by panterames@…

  • Resolution wontfix deleted
  • Status changed from closed to reopened

I am having the same issues on my DIR-320, A1 revision... omg... such a dull and long-running problem...

comment:42 Changed 3 years ago by panterames@…

I am using the openwrt-brcm47xx-generic-squashfs.trx firmware for testing. Now I am going to sleep... At last I have read these posts to the end, and my brcm47xx seems to have the same issues as listed here. Please help to resolve this behavior in the next versions of OpenWrt!

comment:43 Changed 3 years ago by anonymous

Please help to resolve this behavior in the next versions of OpenWrt! I don't want the factory firmware ;/ Make something magically work ;D

comment:44 Changed 3 years ago by anonymous

The issue is Linux kernel related. It has to do with the management of network buffers. I am unable to use new kernels on devices with less than 32M, and that's sad since OpenWrt came to life to support such small devices.

My workaround to this problem since I am not a genius to locate and fix this nasty bug is to use Long Term kernel 2.6.32 on current openwrt and it works like a charm. It lacks all these nice new kernel features but who needs them on devices that don't work :-)

comment:45 Changed 3 years ago by anonymous

Oh, thank you, but where can I download a precompiled version of OpenWrt with this kernel? Is it "legacy" or "generic" in the old releases? Can you provide a link to a *.trx firmware file?

comment:46 Changed 3 years ago by anonymous

YES! THANK YOU! IT WORKS ;D I have uploaded Backfire 10.03.1 openwrt-brcm47xx-squashfs.trx and it works like a charm! ;D

comment:47 Changed 3 years ago by anonymous

Nnno.......;/ I have the same behavior even under the Backfire 10.03.1 (2.6.32) release... This still forces me to use the default DIR-320 A1 1.22 bahr firmware from ftp.dlink.ru ;/ ... It seems that openwrt-brcm47xx-squashfs.trx (10.03.1, 2.6.32) does not make any difference in this respect...

comment:49 Changed 3 years ago by anonymous

Good news for devices with less than 32MB of RAM: after a long, long time, kernel 3.18 brings life back to these little devices again. Apparently this patch http://lwn.net/Articles/615243/ did the magic.

comment:50 Changed 3 years ago by anonymous

Are you sure that this patch applies to and resolves this bug? If so, I will gladly boot my Ubuntu and try to recompile everything under 3.18... but I don't know how ready that kernel is in the current OpenWrt stable branch. Do I need to use developer snapshots to try out the new 3.18 kernel, or can it be used in the regular release branch?

comment:51 Changed 3 years ago by anonymous

The 3.18 patches are already in OpenWrt trunk for generic and some other platforms, but at this time you will need to cook yourself a custom snapshot. It may take some time until official 3.18 OpenWrt kernels hit the production cycle. Many more tests need to happen before anyone can claim anything. My issues, though, were resolved, and for my case I claim huge satisfaction with kernel 3.18. Feel free to share your own findings here too.

comment:52 Changed 3 years ago by panterames@…

Hey, I just compiled the trunk 3.14 CC branch (Chaos Calmer) with the 3.18 patches, I think, and the bug still persists for me...

I have 100Mb/s of real bandwidth, and with the factory firmware my DIR-320 A1 gives me at least 66Mb/s downstream and about 55-60Mb/s upstream. Compare that to my current OpenWrt build:

Firmware Version OpenWrt Chaos Calmer r43769 / LuCI Master (git-14.352.03206-43c395a)
Kernel Version 3.14.26

This gives me only 34.5Mb/s download, and it crashes and reboots my router when speedtest.net tries to upload something to their servers ;/

This is ridiculous; I am really disappointed with the situation... Any more advice or tips to fix the problem?

comment:53 Changed 3 years ago by anonymous

I have tested LAN to LAN and I got the full 100Mb/s with no problem, but when I go through the WAN, download is limited to 33Mb/s and upload just reboots my OpenWrt router ;/

comment:54 Changed 3 years ago by anonymous

http://lwn.net/Articles/615243/ btw it is about cpu power not memory flaws......

comment:55 Changed 3 years ago by anonymous

Ahh, OK, I'll give up on trying to use OpenWrt on the DIR-320... It is useless since the hard-coded brcm-2.4 kernel branch was dropped a long time ago, and I have no more patience to fight with this problematic Broadcom hardware. At least OpenWrt gives me insight into what kind of hardware manufacturer Broadcom is ;D...
