Opened 4 years ago

Last modified 2 years ago

#14858 reopened defect

Router is not working properly when IPv6 tunnel is configured

Reported by: p.titera@… Owned by: developers
Priority: normal Milestone: Barrier Breaker 14.07
Component: packages Version: Trunk
Keywords: Cc:

Description

Somewhere after r39304 my TL-WR1043N/ND started to behave strangely once I configure an IPv6 tunnel using the 6in4 protocol.

  • The firewall fails to fully start, with the error message 'Failed to connect to ubus'.
  • I see no transfer statistics for any interface in the system.
  • I see multiple blocked instances of the udhcpc client running on the WAN interface (see the process list below from my testing environment).
  • I'm not able to get any information about interfaces using ifstatus. It cannot find any interface in the system.

The strange thing is that everything else seems to be working. On the LAN I get both IPv4 and IPv6 addresses, but because the firewall is not running I'm not able to connect to any site on the internet using IPv4.

Today I was able to test the new build on an older TP-Link TL-WR941 and I see nearly the same behavior with one exception: on boot the firewall is able to start and I have a connection, but I'm not able to restart it (same error message as above).

When I disable the IPv6 network everything seems to be working properly. Everything also works properly when I use my testing router as a DHCPv6 client of my main router (i.e. it gets all addresses, and the connection behind it works).

Below is the process list taken from the testing router running r39392 with the IPv6 tunnel configured.

    1 root      1312 S    /sbin/procd
    2 root         0 SW   [kthreadd]
    3 root         0 SW   [ksoftirqd/0]
    4 root         0 SW   [kworker/0:0]
    5 root         0 SW<  [kworker/0:0H]
    7 root         0 SW<  [khelper]
    8 root         0 SW   [kworker/u2:1]
   58 root         0 SW<  [writeback]
   60 root         0 SW<  [bioset]
   62 root         0 SW<  [kblockd]
   92 root         0 SW   [kswapd0]
  137 root         0 SW   [fsnotify_mark]
  151 root         0 SW<  [ath79-spi]
  160 root         0 SW   [kworker/u2:2]
  278 root         0 SW   [kworker/0:2]
  288 root         0 SW<  [deferwq]
  354 root         0 SWN  [jffs2_gcd_mtd3]
  404 root       880 S    /sbin/ubusd
  405 root       772 S    /sbin/askfirst ttyS0 /bin/ash --login
  569 root         0 SW<  [cfg80211]
  728 root      1288 S    /sbin/logd
  729 root      1288 S    /sbin/logread -f -r 192.168.3.240 514 -p /var/run/logread.2.pid -u
  879 root      1352 S    /usr/sbin/telnetd -F -l /bin/login.sh
  901 root      1532 S    /usr/sbin/uhttpd -f -h /www -r firewall -x /cgi-bin -u /ubus -t 60 -T 30 -k 20 -A 1 -n 3 -N 100 -R -p 0.0.0.0 80
  963 root      1368 S    udhcpc -p /var/run/udhcpc-wan.pid -s /lib/netifd/dhcp.script -f -t 0 -i wan -C
 1096 root      1360 S    /usr/sbin/ntpd -n -p 0.openwrt.pool.ntp.org 1.openwrt.pool.ntp.org 2.openwrt.pool.ntp.org 3.openwrt.pool.ntp.org
 1102 root      1104 S    /usr/sbin/ntpclient -i 600 -s -l -D -p 123 -h 0.openwrt.pool.ntp.org
 1249 root      1368 S    udhcpc -p /var/run/udhcpc-wan.pid -s /lib/netifd/dhcp.script -f -t 0 -i wan -C
 1499 root      1368 S    udhcpc -p /var/run/udhcpc-wan.pid -s /lib/netifd/dhcp.script -f -t 0 -i wan -C
 1785 root      1368 S    udhcpc -p /var/run/udhcpc-wan.pid -s /lib/netifd/dhcp.script -f -t 0 -i wan -C
 2036 root      1368 S    udhcpc -p /var/run/udhcpc-wan.pid -s /lib/netifd/dhcp.script -f -t 0 -i wan -C
 2248 root      1100 S    /usr/sbin/dropbear -F -P /var/run/dropbear.1.pid -p 22
 2272 root      1168 S    /usr/sbin/dropbear -F -P /var/run/dropbear.1.pid -p 22
 2340 root      1368 S    udhcpc -p /var/run/udhcpc-wan.pid -s /lib/netifd/dhcp.script -f -t 0 -i wan -C
 2481 root       812 S    /usr/sbin/6relayd -l/tmp/hosts/6relayd /usr/sbin/6relayd-update -m1 -Rserver -Dserver . br-lan
 2573 nobody     908 S    /usr/sbin/dnsmasq -C /var/etc/dnsmasq.conf -k
 2579 root      1552 S    /usr/sbin/hostapd -P /var/run/wifi-phy0.pid -B /var/run/hostapd-phy0.conf
 2596 root      1364 S    -ash
 2606 root      1356 R    ps

I can provide my config and/or image if necessary. I can do some simple tests too, but as this is my main internet connection I cannot test for a long time.

Petr Titera

Attachments (0)

Change History (11)

comment:1 Changed 4 years ago by p.titera@…

Just one correction. Version is obviously trunk of Barrier Breaker.

Petr Titera

comment:2 follow-up: Changed 4 years ago by hnyman <hannu.nyman@…>

You might have too old .config with old options and included packages: 6relayd has been replaced by odhcpd last week by r39309, but 6relayd is still visible in your process list. If you have built r39392 after properly using "make menuconfig" or "make defconfig" to refresh the dependencies, odhcpd should be built in and visible in the process list instead of 6relayd. (and you should also correspondingly remove 6relayd from .config)

I don't see /sbin/netifd in your process list. Its absence probably explains why netifd scripts do not complete(?) and might also be the reason for multiple udhcpc processes.

Just for reference, I am using ar71xx/wndr3700 trunk build 39397, and 6in4 tunnel works quite ok.
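
The two observations above (no netifd, many udhcpc instances) can be checked quickly from a saved process list. The following is only a minimal sketch: count_proc is a hypothetical helper name, and it assumes the BusyBox ps column layout shown in this ticket (PID, USER, VSZ, STAT, COMMAND).

```shell
# Count lines of `ps` output (read from stdin) whose command, i.e. the
# fifth column in the BusyBox layout used in this ticket, has the given
# program name as its last path component.
count_proc() {
    awk -v name="$1" '
        {
            n = split($5, parts, "/")   # "/sbin/netifd" -> "", "sbin", "netifd"
            if (parts[n] == name) c++
        }
        END { print c + 0 }'            # print 0 when nothing matched
}

# Possible use on the router:
#   ps | count_proc netifd    # 0 would mean netifd is not running
#   ps | count_proc udhcpc    # more than 1 suggests stale DHCP clients
```

The helper only counts matching lines; it does not explain why netifd is missing, so the system log still needs to be checked.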

comment:3 in reply to: ↑ 2 Changed 4 years ago by p.titera@…

Replying to hnyman <hannu.nyman@…>:

> You might have too old .config with old options and included packages: 6relayd has been replaced by odhcpd last week by r39309, but 6relayd is still visible in your process list. If you have built r39392 after properly using "make menuconfig" or "make defconfig" to refresh the dependencies, odhcpd should be built in and visible in the process list instead of 6relayd. (and you should also correspondingly remove 6relayd from .config)

I did try to use odhcpd for a while but it did not seem to be working right (or maybe I could not configure it properly). I always ended up with this error message:

daemon.warn odhcpd[645]: A default route is present but there is no public prefix on br-lan thus we don't announce a default route!

and had to force it to announce the default route, which does not seem right. I had other problems with relaying requests from behind the next router to the main DHCPv6 server when I used odhcpd, so I had to switch back to 6relayd.

> I don't see /sbin/netifd in your process list. Its absence probably explains why netifd scripts do not complete(?) and might also be the reason for multiple udhcpc processes.

The question is why it is not present. Correct me if I'm wrong, but it had to be started (otherwise the interfaces would not be up at all), so it must have crashed before I took that list. It is definitely present when everything works on both (main and testing) routers.

> Just for reference, I am using ar71xx/wndr3700 trunk build 39397, and 6in4 tunnel works quite ok.

Strange, but given that both of my routers behave differently in the details (on the TL-WR1043 I have no connection at all because the firewall is not started; on the TL-WR941 the firewall is running but cannot be modified at runtime), I would not be surprised if this configuration works on your router without any problems.

comment:4 Changed 4 years ago by hnyman <hannu.nyman@…>

Oh, you have intentionally selected non-standard ipv6 support packages. 6relayd is not a long-term option as its development has been stopped. See https://github.com/sbyx

A netifd crash may be the concrete reason for the trouble, but the crash seems strange. Do you have anything in the logs? Could some misconfiguration cause it? It should not. You might show your /etc/config/network here.

Regarding the odhcpd error message, I believe that it should have worked ok if you had let it create the default config in /etc/config/dhcp. (It uses uci-defaults to modify that file at the first boot after flash. The script showing the settings can be seen as /rom/etc/uci-defaults/odhcpd.defaults in a running system.)

odhcpd config advice can be found at: https://github.com/sbyx/odhcpd

Just for reference, my 6in4 tunnel with odhcpd has these settings:

**** /etc/config/network :  ****
lan section has ip6assign, while tunnel section defines tunnel and public prefixes.

config interface 'lan'
        option ifname 'eth0.1'
        option type 'bridge'
        option proto 'static'
        option ipaddr '192.168.1.1'
        option netmask '255.255.255.0'
        option ip6assign '60'

config interface 'wan'
        option ifname 'eth1'
        option proto 'dhcp'

config interface 'wan6'
        option ifname '@wan'
        option proto 'dhcpv6'

config interface 'sixxs'
        option proto '6in4'
        option mtu '1424'
        option peeraddr '62.78.96.38'
        option ip6addr '2001:tunnel:only:prefix::2/64'
        option ip6prefix '2001:public:prefix::/48'

**** /etc/config/dhcp: ****
odhcpd has added 3 lines to lan section and new odhcpd and wan6 sections.

config 'dhcp' 'lan'
        option 'interface' 'lan'
        option 'start' '100'
        option 'limit' '150'
        option 'leasetime' '12h'
        option dhcpv6 'hybrid'
        option ra 'hybrid'
        option ndp 'hybrid'

config 'dhcp' 'wan'
        option 'interface' 'wan'
        option 'ignore' '1'

config 'dhcp' 'sixxs'
        option 'ignore' '1'
        option 'interface' 'sixxs'
        option 'dynamicdhcp' '0'

config odhcpd 'odhcpd'
        option maindhcp '0'
        option leasefile '/tmp/hosts/odhcpd'
        option leasetrigger '/usr/sbin/odhcpd-update'

config dhcp 'wan6'
        option dhcpv6 'hybrid'
        option ra 'hybrid'
        option ndp 'hybrid'
        option master '1'

/etc/config/network was identical both with 6relayd and odhcpd.
I used unmodified default /etc/config/6relayd.

This was sixxs tunnel config. Henet config is otherwise similar, but has also the tunnelid, username and password lines.

(As far as I know, the wan6 interface is not actually needed for 6in4. It is there by default waiting for ISP to start giving out proper ipv6, and I have not removed it.)

comment:5 Changed 4 years ago by Petr.Titera

So I've regenerated the testing firmware using the default compiler and with odhcpd included, and unfortunately the system behaves the same as before.

Right after I save my 6in4 config, /sbin/netifd stops and I see several instances of the 6in4 startup scripts running:

  397 root       888 S    /sbin/ubusd
  398 root       768 S    /sbin/askfirst ttyS0 /bin/ash --login
  562 root         0 SW<  [cfg80211]
  771 root      1308 S    /sbin/logd
  799 root      1184 S    /usr/sbin/odhcpd
  848 root         0 SW<  [kworker/0:1H]
  994 root      1472 S    /usr/sbin/telnetd -F -l /bin/login.sh
 1001 root      1488 S    udhcpc -p /var/run/udhcpc-wan.pid -s /lib/netifd/dhcp.script -f -t 0 -i wan -C
 1042 root      1552 S    /usr/sbin/uhttpd -f -h /www -r OpenWrt -x /cgi-bin -u /ubus -t 60 -T 30 -k 20 -A 1 -n 3 -N 100 -R -p 0.0.0.0 80
 1063 root      1100 S    /usr/sbin/ntpclient -i 600 -s -l -D -p 123 -h 0.openwrt.pool.ntp.org
 1077 root         0 SWN  [jffs2_gcd_mtd3]
 1106 root         0 SW   [kworker/u2:2]
 1121 root      1476 S    /usr/sbin/ntpd -n -p 0.openwrt.pool.ntp.org 1.openwrt.pool.ntp.org 2.openwrt.pool.ntp.org 3.openwrt.pool.ntp.org
 1475 root      1148 S    /usr/sbin/dropbear -F -P /var/run/dropbear.1.pid -p 22
 1481 root      1216 R    /usr/sbin/dropbear -F -P /var/run/dropbear.1.pid -p 22
 1482 root      1484 S    -ash
 1884 root         0 SW   [kworker/0:1]
 1932 root      1488 S    udhcpc -p /var/run/udhcpc-wan.pid -s /lib/netifd/dhcp.script -f -t 0 -i wan -C
 1951 root      1624 S    {6in4.sh} /bin/sh ./6in4.sh 6in4 setup wan6 {"proto":"6in4","peeraddr":"tunnel.provider.ip","ip6addr":"2001:tunnel.prefix","ip6prefix":"2001:7
 2000 root      1188 S    ubus call network.interface notify_proto { "action": 0, "ifname": "6in4-wan6", "link-up": true, "tunnel": { "mode": "sit", "mtu": 1280, "ttl
 2037 nobody     960 S    /usr/sbin/dnsmasq -C /var/etc/dnsmasq.conf -k
 2117 root      1488 S    udhcpc -p /var/run/udhcpc-wan.pid -s /lib/netifd/dhcp.script -f -t 0 -i wan -C
 2134 root      1624 S    {6in4.sh} /bin/sh ./6in4.sh 6in4 setup wan6 {"proto":"6in4","peeraddr":"tunnel.provider.ip","ip6addr":"2001:tunnel.prefix","ip6prefix":"2001:7
 2191 root      1188 S    ubus call network.interface notify_proto { "action": 0, "ifname": "6in4-wan6", "link-up": true, "tunnel": { "mode": "sit", "mtu": 1280, "ttl
 2291 root      1488 S    udhcpc -p /var/run/udhcpc-wan.pid -s /lib/netifd/dhcp.script -f -t 0 -i wan -C
 2312 root      1624 S    {6in4.sh} /bin/sh ./6in4.sh 6in4 setup wan6 {"proto":"6in4","peeraddr":"tunnel.provider.ip","ip6addr":"2001:tunnel.prefix","ip6prefix":"2001:7
 2363 root      1188 S    ubus call network.interface notify_proto { "action": 0, "ifname": "6in4-wan6", "link-up": true, "tunnel": { "mode": "sit", "mtu": 1280, "ttl
 2405 root      1524 S    /sbin/netifd
 2478 root      1480 S    udhcpc -p /var/run/udhcpc-wan.pid -s /lib/netifd/dhcp.script -f -t 0 -i wan -C
 2535 root      1528 S    {hotplug-call} /bin/sh /sbin/hotplug-call iface
 2546 root      1528 S    {hotplug-call} /bin/sh /sbin/hotplug-call iface
 2547 root      1524 R    {qos} /bin/sh /etc/rc.common /etc/init.d/qos enabled
 2548 root      1476 R    ps

And after a while everything stops in the state described above:

  397 root       888 S    /sbin/ubusd
  398 root       768 S    /sbin/askfirst ttyS0 /bin/ash --login
  562 root         0 SW<  [cfg80211]
  771 root      1308 S    /sbin/logd
  799 root      1184 S    /usr/sbin/odhcpd
  848 root         0 SW<  [kworker/0:1H]
  994 root      1472 S    /usr/sbin/telnetd -F -l /bin/login.sh
 1001 root      1488 S    udhcpc -p /var/run/udhcpc-wan.pid -s /lib/netifd/dhcp.script -f -t 0 -i wan -C
 1042 root      1552 S    /usr/sbin/uhttpd -f -h /www -r OpenWrt -x /cgi-bin -u /ubus -t 60 -T 30 -k 20 -A 1 -n 3 -N 100 -R -p 0.0.0.0 80
 1063 root      1100 S    /usr/sbin/ntpclient -i 600 -s -l -D -p 123 -h 0.openwrt.pool.ntp.org
 1077 root         0 SWN  [jffs2_gcd_mtd3]
 1106 root         0 SW   [kworker/u2:2]
 1121 root      1480 S    /usr/sbin/ntpd -n -p 0.openwrt.pool.ntp.org 1.openwrt.pool.ntp.org 2.openwrt.pool.ntp.org 3.openwrt.pool.ntp.org
 1475 root      1148 S    /usr/sbin/dropbear -F -P /var/run/dropbear.1.pid -p 22
 1481 root      1216 S    /usr/sbin/dropbear -F -P /var/run/dropbear.1.pid -p 22
 1482 root      1484 S    -ash
 1884 root         0 SW   [kworker/0:1]
 1932 root      1488 S    udhcpc -p /var/run/udhcpc-wan.pid -s /lib/netifd/dhcp.script -f -t 0 -i wan -C
 2117 root      1488 S    udhcpc -p /var/run/udhcpc-wan.pid -s /lib/netifd/dhcp.script -f -t 0 -i wan -C
 2291 root      1488 S    udhcpc -p /var/run/udhcpc-wan.pid -s /lib/netifd/dhcp.script -f -t 0 -i wan -C
 2478 root      1488 S    udhcpc -p /var/run/udhcpc-wan.pid -s /lib/netifd/dhcp.script -f -t 0 -i wan -C
 2839 root      1488 S    udhcpc -p /var/run/udhcpc-wan.pid -s /lib/netifd/dhcp.script -f -t 0 -i wan -C
 3048 nobody     960 S    /usr/sbin/dnsmasq -C /var/etc/dnsmasq.conf -k
 3117 root      1476 R    ps

The only thing I can see in the logs (apart from lots of messages described in #14202) is this:

 Sun Jan 26 12:44:23 2014 user.emerg syslog: Instance network::instance1 s in a crash loop 6 crashes, 7 seconds since last crash

This is my /etc/config/network:

config interface 'loopback'
        option ifname 'lo'
        option proto 'static'
        option ipaddr '127.0.0.1'
        option netmask '255.0.0.0'

config globals 'globals'
        option ula_prefix 'fdd5:3168:b520::/48'

config interface 'eth'
        option ifname 'eth0'
        option proto 'none'

config interface 'lan'
        option ifname 'lan1 lan2 lan3 lan4'
        option type 'bridge'
        option proto 'static'
        option ipaddr '192.168.1.1'
        option netmask '255.255.255.0'
        option ip6assign '60'

config interface 'wan'
        option ifname 'wan'
        option proto 'dhcp'

config interface 'wan6'
        option _orig_ifname '@wan'
        option _orig_bridge 'false'
        option proto '6in4'
        option peeraddr 'tunnel.provider.ip'
        option ip6addr '2001:tunnel:prefix'
        option ip6prefix '2001:network:prefix/48'

and /etc/config/dhcp:

config dnsmasq
        option domainneeded '1'
        option boguspriv '1'
        option filterwin2k '0'
        option localise_queries '1'
        option rebind_protection '1'
        option rebind_localhost '1'
        option local '/lan/'
        option domain 'lan'
        option expandhosts '1'
        option nonegcache '0'
        option authoritative '1'
        option readethers '1'
        option leasefile '/tmp/dhcp.leases'
        option resolvfile '/tmp/resolv.conf.auto'

config dhcp 'lan'
        option interface 'lan'
        option start '100'
        option limit '150'
        option leasetime '12h'
        option dhcpv6 'hybrid'
        option ra 'hybrid'
        option ndp 'hybrid'

config dhcp 'wan'
        option interface 'wan'
        option ignore '1'

config odhcpd 'odhcpd'
        option maindhcp '0'
        option leasefile '/tmp/hosts/odhcpd'
        option leasetrigger '/usr/sbin/odhcpd-update'

config dhcp 'wan6'
        option dhcpv6 'hybrid'
        option ra 'hybrid'
        option ndp 'hybrid'
        option master '1'

As you can see, nothing special; I did a reset to defaults before I started my test.

comment:6 Changed 4 years ago by hnyman <hannu.nyman@…>

I think it goes wrong when you are trying to re-use "wan" as "wan6" and use it for the 6in4 tunnel. 6in4 traffic is ipv6 packets encapsulated into ipv4 and then sent via wan. Having the "6in4 wan6" and the normal "wan" pointing to the same interface might be the reason for the netifd crash. These two lines are the strange looking ones for a 6in4 tunnel:

option _orig_ifname '@wan'
option _orig_bridge 'false'

And on the dhcp side there is also "hybrid" dhcpv6 config defined for it, when the tunnel does not need it.

You might try having a completely separate interface for the 6in4 tunnel. Leave wan6 alone and define a new interface for the tunnel (and add that interface to firewall's wan zone).
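
The suggestion above, a separate interface for the tunnel, might be scripted roughly as follows. This is only a sketch under assumptions: add_6in4_iface is a hypothetical helper, the interface name and addresses in the usage comment are placeholders from the documentation ranges, and the firewall zone change is left as a manual step because zone layouts differ between setups.

```shell
# Sketch: define a dedicated 6in4 interface instead of reusing 'wan6'.
# Arguments: <name> <peer IPv4> <local tunnel address with /len> <routed prefix>
add_6in4_iface() {
    name=$1
    peer=$2
    addr=$3
    prefix=$4
    uci set "network.$name=interface"
    uci set "network.$name.proto=6in4"
    uci set "network.$name.peeraddr=$peer"
    uci set "network.$name.ip6addr=$addr"      # must include the prefix length
    uci set "network.$name.ip6prefix=$prefix"
    uci commit network
}

# Possible use on the router (placeholder values, not real tunnel data):
#   add_6in4_iface henet 192.0.2.1 2001:db8:1::2/64 2001:db8:100::/48
#   /etc/init.d/network reload
# Afterwards the new interface still has to be added to the wan zone
# in /etc/config/firewall.
```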

comment:7 Changed 4 years ago by Petr.Titera

I might have found the cause of all of this. It might be my fault: if you look at my posted config you will notice that I did not include /64 in the configuration of the local tunnel address (ip6addr). If I add the missing /64 everything seems to work normally. Even odhcpd seems to be working correctly now. Sorry for the noise.

The only thing left to explain is why the old release was able to run with this kind of config error.
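
A missing prefix length like this is easy to catch before saving the config. The following is a minimal sketch, not part of netifd or the 6in4 scripts: has_prefix_len is a hypothetical helper that only checks the shape of the string (a trailing /<number> no larger than 128), not whether the address itself is valid.

```shell
# Return 0 if the given address carries a /<length> suffix, 1 otherwise.
has_prefix_len() {
    case "$1" in
        */*) len=${1##*/} ;;          # text after the last slash
        *)   return 1 ;;              # no slash at all
    esac
    case "$len" in
        ''|*[!0-9]*) return 1 ;;      # empty or not purely numeric
    esac
    [ "$len" -le 128 ]
}

# Possible use on the router, assuming the tunnel section is named 'wan6':
#   has_prefix_len "$(uci -q get network.wan6.ip6addr)" \
#       || echo "ip6addr has no prefix length"
```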

comment:8 Changed 4 years ago by cyrus

  • Resolution set to fixed
  • Status changed from new to closed

Fixed in r39586.

comment:9 Changed 4 years ago by jow

  • Milestone changed from Attitude Adjustment 12.09 to Barrier Breaker 14.07

Milestone Attitude Adjustment 12.09 deleted

comment:10 Changed 3 years ago by sachin0235@…

  • Resolution fixed deleted
  • Status changed from closed to reopened

Hi,
I am getting the same issue while running BB 14.07 on a TP-Link TL-WR841ND v9. These logs start appearing as soon as the hotspot (CoovaChilli) is switched on:

Mon Feb 23 00:16:22 2015 local6.notice coova-chilli[1266]: chilli.c: 5005: Client MAC=C4-6E-1F-F4-B9-50 assigned IP 192.168.182.4
Mon Feb 23 00:16:25 2015 daemon.info dnsmasq-dhcp[1101]: DHCPDISCOVER(br-lan) 60:fa:cd:d1:45:ce
Mon Feb 23 00:16:25 2015 daemon.info dnsmasq-dhcp[1101]: DHCPOFFER(br-lan) 192.168.2.158 60:fa:cd:d1:45:ce
Mon Feb 23 00:16:25 2015 daemon.info dnsmasq-dhcp[1101]: DHCPREQUEST(br-lan) 192.168.182.3 60:fa:cd:d1:45:ce
Mon Feb 23 00:16:25 2015 daemon.info dnsmasq-dhcp[1101]: DHCPNAK(br-lan) 192.168.182.3 60:fa:cd:d1:45:ce wrong server-ID
Mon Feb 23 00:16:25 2015 daemon.warn odhcpd[751]: A default route is present but there is no public prefix on br-lan thus we don't announce a default route!
Mon Feb 23 00:16:29 2015 daemon.warn odhcpd[751]: A default route is present but there is no public prefix on br-lan thus we don't announce a default route!
Mon Feb 23 00:17:14 2015 local6.notice coova-chilli[1266]: chilli.c: 5873: Successful UAM login from username=919810135802 IP=192.168.182.3
Mon Feb 23 00:18:29 2015 local6.warn coova-chilli[1501]: redir.c: 54: Client process timed out: 1
Mon Feb 23 00:18:30 2015 local6.warn coova-chilli[1506]: redir.c: 54: Client process timed out: 1
Mon Feb 23 00:18:41 2015 daemon.warn odhcpd[751]: A default route is present but there is no public prefix on br-lan thus we don't announce a default route!
Mon Feb 23 00:19:13 2015 daemon.warn odhcpd[751]: A default route is present but there is no public prefix on br-lan thus we don't announce a default route!
Mon Feb 23 00:21:45 2015 authpriv.info dropbear[1581]: Child connection from 192.168.2.111:57698
Mon Feb 23 00:21:47 2015 authpriv.notice dropbear[1581]: Password auth succeeded for 'root' from 192.168.2.111:57698
Mon Feb 23 00:22:04 2015 daemon.info dnsmasq-dhcp[1101]: DHCPINFORM(br-lan) 192.168.2.111 f0:de:f1:b6:73:ff
Mon Feb 23 00:22:04 2015 daemon.info dnsmasq-dhcp[1101]: DHCPACK(br-lan) 192.168.2.111 f0:de:f1:b6:73:ff DELsyada685650
Mon Feb 23 00:22:34 2015 daemon.warn odhcpd[751]: A default route is present but there is no public prefix on br-lan thus we don't announce a default route!

comment:11 Changed 3 years ago by sachin0235@…

Furthermore, if I check the box in Network -> Interface -> br-lan -> (lower on the page) IPv6 settings -> "Announce as default router even if no public prefix is available", the problem goes away. Maybe it has something to do with the configuration. My apologies if I sound stupid for reopening the defect, but I am a newbie here.
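
For reference, the same setting can probably be applied from the command line. This is a sketch under an assumption: as far as I know, that checkbox maps to odhcpd's ra_default option in /etc/config/dhcp, but please verify this against the odhcpd version in your build. force_ra_default is a hypothetical helper name.

```shell
# Sketch: announce the router as default router even without a public
# prefix. Assumes the odhcpd option is called 'ra_default' (value 1).
# Argument: dhcp section name, defaulting to 'lan'.
force_ra_default() {
    section=${1:-lan}
    uci set "dhcp.$section.ra_default=1"
    uci commit dhcp
}

# Possible use on the router:
#   force_ra_default lan
#   /etc/init.d/odhcpd restart
```

Note that this only silences the symptom. If the LAN is supposed to have a public prefix, the missing prefix is still worth investigating.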
