Opened 9 years ago

Closed 9 years ago

Last modified 9 years ago

#4489 closed defect (wontfix)

connect() fails on some IPs

Reported by: cglx Owned by: developers
Priority: high Milestone: Kamikaze 8.09 RC2
Component: kernel Version:
Keywords: Cc:

Description

A socket connect() call (C/C++) fails for certain IPs. The hosts are clearly up and accepting connections.

The program works perfectly on an i386 Linux box. I also tried uClibc (0.30) for i686, and it works correctly there.

I've tried with OpenWrt:
svn (dec 15 or something), kernel 2.6 and uClibc 0.29
svn (jan 22), kernel 2.6 and uClibc 0.30

I also tried dd-wrt 0.24 with identical results.

I'm attaching the program I use to test. It has good IPs and 2 IPs that fail.

Attachments (1)

c.c (1.4 KB) - added by cglx 9 years ago.
simple connect() program

Change History (8)

Changed 9 years ago by cglx

simple connect() program

comment:1 Changed 9 years ago by oliver@…

s_addr / sin_port need to be in network byte order. Your code appears to assume that it is running on a little-endian platform and thus you have manually byteorder-reversed port / IP constants in there.

If you run on a big-endian platform those values will be wrong.

Not sure how that would explain a partial failure only on some IPs though.

Anyway, use htonl() and friends.
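
A minimal sketch of the byte-order advice above, in case it helps. The helper name make_ipv4_addr is hypothetical and is not taken from the attached c.c; it only shows htons() and inet_pton() producing network-byte-order values regardless of host endianness.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper (not from the attached c.c): fill `out` with an IPv4
 * address and port in network byte order, independent of host endianness.
 * Returns 1 on success, 0 if `ip` is not a valid dotted-quad string. */
int make_ipv4_addr(const char *ip, unsigned short port, struct sockaddr_in *out)
{
    assert(ip != NULL && out != NULL);

    memset(out, 0, sizeof *out);
    out->sin_family = AF_INET;
    out->sin_port = htons(port);          /* host -> network byte order */

    /* inet_pton() stores the address in network byte order already. */
    return inet_pton(AF_INET, ip, &out->sin_addr) == 1;
}
```

The resulting struct can be passed to connect() as (struct sockaddr *)&addr with sizeof addr, with no manually reversed constants.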

comment:2 Changed 9 years ago by cglx

The program is just a test code to demonstrate the failure. The actual program is much larger and uses htons().

The platform I'm using is a wl-500gpv2, so it uses mipsel (little endian). The "good" IPs in the test code are ok, and with them it fetches the correct pages (on the router).

comment:3 Changed 9 years ago by oliver@…

The next step, then, would be to look at a tcpdump. Maybe you have problems with ECN or one of the other TCP quirks that may be configured differently between systems.

(much like diagnosing a networking problem on any platform, really)

comment:4 Changed 9 years ago by oliver@…

Also, some useful information to have:

  • kernel version
  • platform (ok, we got that now)
  • what do you mean by "it fails"?
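
One way to answer the "it fails" question is to report the exact errno from connect(). The sketch below is an assumption about how a test program could do that, not the reporter's code: try_connect is a hypothetical helper that uses a non-blocking socket and poll() so a silently dropped SYN shows up as ETIMEDOUT after a bounded wait, while an active refusal shows up as ECONNREFUSED.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: attempt a TCP connect to ip:port and return 0 on
 * success or an errno value describing the failure. */
int try_connect(const char *ip, unsigned short port, int timeout_ms)
{
    assert(ip != NULL);

    struct sockaddr_in sa;
    memset(&sa, 0, sizeof sa);
    sa.sin_family = AF_INET;
    sa.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &sa.sin_addr) != 1)
        return EINVAL;                      /* not a dotted-quad address */

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return errno;

    /* Non-blocking mode lets us bound the wait with poll(). */
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
        int e = errno;
        close(fd);
        return e;
    }

    int err = 0;
    if (connect(fd, (struct sockaddr *)&sa, sizeof sa) < 0) {
        if (errno != EINPROGRESS) {
            err = errno;                    /* immediate failure */
        } else {
            struct pollfd pfd = { .fd = fd, .events = POLLOUT };
            int n = poll(&pfd, 1, timeout_ms);
            if (n == 0) {
                err = ETIMEDOUT;            /* no reply to our SYN in time */
            } else if (n < 0) {
                err = errno;
            } else {
                /* Handshake finished; fetch its final status. */
                socklen_t len = sizeof err;
                if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
                    err = errno;
            }
        }
    }

    close(fd);
    return err;
}
```

Printing strerror() of the return value next to each tested IP would make a report like "connect() fails" much more specific.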

comment:5 Changed 9 years ago by oliver@…

It's ECN

kamikaze router box, tcp_ecn = 1:

13:19:00.276639 IP 222.154.181.253.54111 > 148.244.43.5.80: SWE 3677582569:3677582569(0) win 5840 <mss 1460,nop,nop,sackOK,nop,wscale 1>
13:19:03.272221 IP 222.154.181.253.54111 > 148.244.43.5.80: SWE 3677582569:3677582569(0) win 5840 <mss 1460,nop,nop,sackOK,nop,wscale 1>
13:19:09.272256 IP 222.154.181.253.54111 > 148.244.43.5.80: SWE 3677582569:3677582569(0) win 5840 <mss 1460,nop,nop,sackOK,nop,wscale 1>
13:19:21.248715 IP 222.154.181.253.54111 > 148.244.43.5.80: SWE 3677582569:3677582569(0) win 5840 <mss 1460,nop,nop,sackOK,nop,wscale 1>

x86_64 Ubuntu, tcp_ecn = 0:

13:19:24.132097 IP 222.154.181.253.43086 > 148.244.43.5.80: S 1061065730:1061065730(0) win 5840 <mss 1460,sackOK,timestamp 1378426 0,nop,wscale 7>
13:19:24.370183 IP 148.244.43.5.80 > 222.154.181.253.43086: S 1387341707:1387341707(0) ack 1061065731 win 65535 <mss 1460,nop,wscale 0,nop,nop,timestamp 0 0,nop,nop,sackOK>
13:19:24.371673 IP 222.154.181.253.43086 > 148.244.43.5.80: . ack 1 win 46 <nop,nop,timestamp 1378486 0>
13:19:27.409457 IP 222.154.181.253.43086 > 148.244.43.5.80: F 1:1(0) ack 1 win 46 <nop,nop,timestamp 1379246 0>
13:19:27.647635 IP 148.244.43.5.80 > 222.154.181.253.43086: . ack 2 win 65535 <nop,nop,timestamp 164889590 1379246>
13:19:27.648468 IP 148.244.43.5.80 > 222.154.181.253.43086: F 1:1(0) ack 2 win 65535 <nop,nop,timestamp 164889590 1379246>
13:19:27.650172 IP 222.154.181.253.43086 > 148.244.43.5.80: . ack 2 win 46 <nop,nop,timestamp 1379306 164889590>

kamikaze router box, tcp_ecn = 0:

13:19:55.161693 IP 222.154.181.253.54112 > 148.244.43.5.80: S 224991279:224991279(0) win 5840 <mss 1460,nop,nop,sackOK,nop,wscale 1>
13:19:55.399732 IP 148.244.43.5.80 > 222.154.181.253.54112: S 1464810580:1464810580(0) ack 224991280 win 65535 <mss 1460,nop,wscale 0,nop,nop,sackOK>
13:19:55.401228 IP 222.154.181.253.54112 > 148.244.43.5.80: . ack 1 win 2920
13:20:00.830724 IP 222.154.181.253.54112 > 148.244.43.5.80: F 1:1(0) ack 1 win 2920
13:20:01.070331 IP 148.244.43.5.80 > 222.154.181.253.54112: . ack 2 win 65535
13:20:01.070421 IP 148.244.43.5.80 > 222.154.181.253.54112: F 1:1(0) ack 2 win 65535
13:20:01.072571 IP 222.154.181.253.54112 > 148.244.43.5.80: . ack 2 win 2920

So there is a broken firewall/router in front of that IP that doesn't handle the ECN bits correctly. Not a bug at the OpenWrt end.

comment:6 follow-up: Changed 9 years ago by nico

  • Resolution set to wontfix
  • Status changed from new to closed

As Oliver suggests, disable ECN by changing the following line in '/etc/sysctl.conf'

from:

net.ipv4.tcp_ecn=1

to:

net.ipv4.tcp_ecn=0
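
To check which value is actually in effect, the setting can be read back from procfs, where net.ipv4.tcp_ecn is exposed as /proc/sys/net/ipv4/tcp_ecn. A small sketch, assuming a hypothetical helper named read_tcp_ecn that takes the path as a parameter:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read an integer sysctl value from `path`
 * (e.g. "/proc/sys/net/ipv4/tcp_ecn"). Returns the value, or -1 if the
 * file cannot be opened or does not start with an integer. */
int read_tcp_ecn(const char *path)
{
    assert(path != NULL);

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    int value = -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;

    fclose(f);
    return value;
}
```

Calling read_tcp_ecn("/proc/sys/net/ipv4/tcp_ecn") before the failing connect() would show whether the router still requests ECN on outgoing connections (1) or has it disabled (0).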

comment:7 in reply to: ↑ 6 Changed 9 years ago by anonymous

I can confirm it works with tcp_ecn = 0

Thanks for your time.
