#8520 closed defect (fixed)
wr741nd: poor tcp throughput on wan interface for hosts with big latencies
Reported by: | fercerpav@… | Owned by: | developers |
---|---|---|---|
Priority: | response-needed | Milestone: | Barrier Breaker 14.07 |
Component: | packages | Version: | Trunk |
Keywords: | | Cc: | nbd@…, zdenek.koprivik@… |
Description
IRC user KOPRajs reports:
12:15 < KOPRajs> low download speeds
12:15 < KOPRajs> it seems to depend on what server I'm downloading from
12:17 < KOPRajs> the higher is the ping to the server the lower is the speed... reverting on original firmware or on dd-wrt there's no such problem
12:18 < KOPRajs> when downloading from debian.org I have about 10x slower speed than with original firmware or with dd-wrt
12:18 < KOPRajs> no QoS
12:18 < KOPRajs> wireshart shows there's log of "tcp last segment lost" and then duplicate acks
12:19 < KOPRajs> any iso from debian.org ati.com and many more
12:19 < KOPRajs> I'm at czech
12:20 < KOPRajs> when downloading from local .cz servers it is better, about half of the normal speed
12:22 < KOPRajs> if you connect without the router you'll get full speed or if you revert to original firmware
12:23 < KOPRajs> downloading from server connected directly to the WAN port I can get 100Mbit/s
I tried downloading an ISO from debian.org and the download speed is
indeed rather low. I'd like to test this with netem simulating high
latency, but I can't yet see how to do it without disrupting my
current setup.
Attachments (0)
Change History (17)
comment:1 Changed 7 years ago by zdenek.koprivik@…
comment:2 Changed 7 years ago by fercerpav@…
I did some tests with
tc qdisc modify dev eth0 root netem delay 10ms
Both WAN and LAN interfaces are affected and the problem looks quite severe: with 100ms latency the throughput is down to a few megabits per second.
Here's the result with 10ms latency. iperf was run from the router to a laptop directly connected to its LAN interface with a patch cord:
Client connecting to 192.168.1.100, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[ 3] local 192.168.1.1 port 49917 connected with 192.168.1.100 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-20.0 sec 88.5 MBytes 37.1 Mbits/sec
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[ 4] local 192.168.1.1 port 5001 connected with 192.168.1.100 port 42840
[ 4] 0.0-20.0 sec 112 MBytes 46.9 Mbits/sec
As you can see, easily reproducible :)
comment:3 Changed 7 years ago by nbd
Please try to use one of the LAN ports as a WAN port (using a VLAN) to see if that makes a difference.
I'd like to know whether only the actual WAN port is problematic or whether it's a problem on LAN too.
comment:4 Changed 7 years ago by zdenek.koprivik@…
The problem affects all ethernet ports (both WAN and LAN).
Router in standard mode:
- download without router -> good speed
- download on the router (wget -O /dev/null) (traffic goes only through WAN) -> bad speed
- download from the PC connected to the LAN (traffic goes through both WAN and LAN) -> even worse speed
- download from the PC connected to the Wireless (traffic goes through WAN and WLAN) -> again 'just' bad speed
Router in wireless client mode:
- download on the router (wget -O /dev/null) (traffic goes only through WLAN) -> good speed!
- download from the PC connected to the LAN (traffic goes through WLAN and LAN) -> bad speed
So the effect stacks. The more Ethernet ports there are in the path, the worse the speed.
Wireshark shows many lost TCP segments, each resulting in a reset of the TCP window. For hosts with very low latency (<10ms) the TCP window reopens quickly, so the speed is not affected much. The higher the host latency, the slower the TCP window reopens, and that lowers the average speed.
So it seems to me that the driver probably simply loses a packet from time to time, but I'm unable to prove that.
Again, neither original TP-Link firmware nor DD-WRT (both use ag7240) are affected so this must be a software problem specific to OpenWRT (ag71xx).
Thank you for any help on this.
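As a rough illustration of why occasional packet loss hurts distant servers more than nearby ones, the sketch below uses the well-known Mathis et al. approximation for loss-limited TCP throughput, rate ≈ (MSS / RTT) · (C / √p). It is not taken from this ticket's measurements: the 0.1% loss rate, the 1460-byte MSS, and the function name `mathis_throughput_bps` are assumptions chosen only to show the trend.

```python
import math


def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate, c=math.sqrt(3.0 / 2.0)):
    """Approximate upper bound on steady-state TCP throughput in bits/s.

    Mathis et al. approximation: rate ~= (MSS / RTT) * (C / sqrt(p)).
    It assumes a long-lived flow limited by random loss, not by the
    receive window or the link rate.
    """
    if rtt_s <= 0:
        raise ValueError("rtt_s must be positive")
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss_rate))


if __name__ == "__main__":
    MSS = 1460      # assumed: typical Ethernet MSS in bytes
    LOSS = 0.001    # assumed: hypothetical 0.1% loss, not measured in this ticket
    for rtt_ms in (1, 10, 50, 100):
        bps = mathis_throughput_bps(MSS, rtt_ms / 1000.0, LOSS)
        print(f"RTT {rtt_ms:>3} ms -> ~{bps / 1e6:.1f} Mbit/s")
```

Under this model the achievable rate is inversely proportional to the round-trip time, so the same small loss rate that is barely noticeable at 1 ms becomes the bottleneck at 100 ms.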
comment:5 Changed 7 years ago by nbd
Please also try to figure out if packets get dropped in rx or in tx direction.
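One possible way to approach this, sketched here as an assumption rather than the method used in this ticket, is to snapshot the kernel's per-interface counters from `/proc/net/dev` before and after a test download and compare the rx and tx drop/error deltas. The helper names and the sample numbers below are hypothetical. Losses inside the switch or the driver may not show up in these counters at all.

```python
from typing import Dict, NamedTuple


class IfaceCounters(NamedTuple):
    rx_packets: int
    rx_errs: int
    rx_drop: int
    tx_packets: int
    tx_errs: int
    tx_drop: int


def parse_proc_net_dev(text: str) -> Dict[str, IfaceCounters]:
    """Parse the contents of /proc/net/dev into per-interface counters."""
    result: Dict[str, IfaceCounters] = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # the two header lines contain no colon
        fields = rest.split()
        if len(fields) < 16:
            continue
        v = [int(f) for f in fields[:16]]
        # Receive columns: bytes packets errs drop ... (indices 0-7)
        # Transmit columns: bytes packets errs drop ... (indices 8-15)
        result[name.strip()] = IfaceCounters(v[1], v[2], v[3], v[9], v[10], v[11])
    return result


def counter_delta(before: IfaceCounters, after: IfaceCounters) -> IfaceCounters:
    """Return the per-counter difference between two snapshots."""
    return IfaceCounters(*(a - b for a, b in zip(after, before)))


HEADER = (
    "Inter-|   Receive                                                |  Transmit\n"
    " face |bytes    packets errs drop fifo frame compressed multicast"
    "|bytes    packets errs drop fifo colls carrier compressed\n"
)
# Hypothetical snapshots, for illustration only.
SAMPLE_BEFORE = HEADER + "  eth0: 1000000 1000 0 0 0 0 0 0 500000 800 0 0 0 0 0 0\n"
SAMPLE_AFTER = HEADER + "  eth0: 9000000 7000 0 3 0 0 0 0 900000 4800 0 1 0 0 0 0\n"

if __name__ == "__main__":
    before = parse_proc_net_dev(SAMPLE_BEFORE)["eth0"]
    after = parse_proc_net_dev(SAMPLE_AFTER)["eth0"]
    print(counter_delta(before, after))
```

On a real system the two snapshots would come from reading `/proc/net/dev` before and after the download instead of from the sample strings.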
comment:6 Changed 7 years ago by zdenek.koprivik@…
It seems to affect both rx and tx.
Also please note that it looks like it doesn't depend on the traffic load.
comment:7 Changed 7 years ago by nbd
Please try applying this patch onto your kernel tree and see if it helps with this issue: http://nbd.name/flowcontrol.patch
comment:8 Changed 7 years ago by zdenek.koprivik@…
Hi,
sorry, the above patch doesn't help. The average download speed is even a bit worse than before.
I've made a few IO graphs showing the problem when downloading an .iso from debian.org:
This is with the original TP-LINK firmware (DD-WRT behaves the same):
http://speedtest.mx-net.cz/tl-wr741nd_original.png
The average download speed is above 900kB/s, which is the full speed of this link (8Mbit) and matches the speed of a PC connected directly without the router. Also note that there are no "TCP previous segment lost" errors in the log.
This is with my OpenWRT build (official OpenWRT images behave the same):
http://speedtest.mx-net.cz/tl-wr741nd_mx-home_unpatched.png
The average speed is about 400kB/s. Every drop in the graph corresponds to one "TCP previous segment lost" error in the log.
This is with the above patch applied:
http://speedtest.mx-net.cz/tl-wr741nd_mx-home_patched.png
The average speed is about 350kB/s.
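To put the three figures above on the same scale as the 8Mbit link, here is a small conversion sketch. It assumes 1 kB = 1000 bytes and ignores protocol overhead, so the percentages are approximate. The function name is made up for this example.

```python
LINK_MBIT = 8.0  # nominal link speed stated above


def kbytes_per_s_to_mbit(kb_per_s: float) -> float:
    """Convert kB/s (assuming 1 kB = 1000 bytes) to Mbit/s."""
    return kb_per_s * 1000 * 8 / 1e6


if __name__ == "__main__":
    # Average speeds reported above for each firmware variant.
    for label, kb_per_s in (
        ("original firmware", 900),
        ("OpenWRT unpatched", 400),
        ("OpenWRT patched", 350),
    ):
        mbit = kbytes_per_s_to_mbit(kb_per_s)
        share = 100 * mbit / LINK_MBIT
        print(f"{label}: {mbit:.1f} Mbit/s (~{share:.0f}% of the link)")
```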
comment:9 Changed 7 years ago by anonymous
Hi,
Is there anything new on this? I've run the same tests on a UBNT Routerstation Pro (which also uses the ag71xx driver but has a different chipset) and there is no such problem there. So the problem seems to be specific to the ag7240 chipset.
So far the problem is confirmed for WR741ND and WR941ND.
If it helps I'm willing to send one WR741ND to nbd for testing. If interested, please let me know on zdenek.koprivik*post.cz.
comment:10 Changed 7 years ago by joe
Hi, I'm having these issues as well. With OpenWRT, my TP-LINK WR941ND loses a huge amount of connection speed to distant servers. DD-WRT did not have this issue, but OpenWRT is much better for me, so I would love to see this bug fixed. Is there something I can do to help get this issue solved even though I'm not a kernel developer?
Thanks!
comment:11 Changed 7 years ago by nbd
Please try latest trunk. Make sure you've run make target/linux/clean and make oldconfig after svn update if it's not a fresh build.
comment:12 Changed 7 years ago by nbd
Any news on testing latest trunk?
comment:13 Changed 7 years ago by nbd
- Priority changed from normal to response-needed
comment:14 Changed 7 years ago by zdenek.koprivik@…
I tried trunk on Sunday. I didn't want to be too optimistic, so I held off on reporting, and I plan to do more testing this weekend.
But so far so good.
Tested versions are WR741ND and WR941ND.
comment:15 Changed 7 years ago by joe
I can happily confirm that with trunk the TP-LINK WR941ND is working flawlessly again!
Thanks!
comment:16 Changed 7 years ago by loswillios
- Resolution set to fixed
- Status changed from new to closed
comment:17 Changed 4 years ago by jow
- Cc changed from nbd@openwrt.org,zdenek.koprivik@post.cz to nbd@openwrt.org, zdenek.koprivik@post.cz
- Milestone changed from Attitude Adjustment 12.09 to Barrier Breaker 14.07
Milestone Attitude Adjustment 12.09 deleted
Hi,
I'm the "IRC user KOPRajs" from the above post. I just want to be more specific on the described problem.
Observations:
Conclusions:
ag71xx developers, please investigate this severe ag71xx performance problem.
Thank you