Opened 5 years ago

Last modified 18 months ago

#12888 reopened defect

ipv6 connectivity lost after some time

Reported by: anonymous Owned by: cyrus
Priority: normal Milestone: Chaos Calmer 15.05
Component: packages Version: Trunk
Keywords: Cc:

Description

r35310, ipv6-support, dhcpv6 used.

It seems odhcp6c fails to refresh the lease, as br-lan loses its global prefix (the ULA remains and works). I haven't been able to time it exactly, but it seems to happen ~24 hours after bringing up the interface (which is also my ISP's lease period).

odhcp6c also gives some strange log messages (4294967295s timeout) when acquiring the lease:

Jan 25 17:21:17 xxxx daemon.notice netifd: wan6 (10641): odhcp6c[10641]: State for pppoe-wan changed to started
Jan 25 17:21:17 xxxx daemon.warn odhcp6c[10641]: State for pppoe-wan changed to started
Jan 25 17:21:17 xxxx daemon.notice netifd: wan6 (10641): odhcp6c[10641]: Sending SOLICIT (timeout 4294967295s)
Jan 25 17:21:17 xxxx  daemon.notice odhcp6c[10641]: Sending SOLICIT (timeout 4294967295s)
Jan 25 17:21:17 xxxx daemon.notice odhcp6c[10641]: Got a valid reply after 24ms
Jan 25 17:21:17 xxxx daemon.notice netifd: wan6 (10641): odhcp6c[10641]: Got a valid reply after 24ms

and if run manually:

root@xxxx:~# odhcp6c -s /lib/netifd/dhcpv6.script -P0 pppoe-wan
odhcp6c[15196]: State for pppoe-wan changed to started
odhcp6c[15196]: Sending SOLICIT (timeout 4294967295s)
odhcp6c[15196]: Got a valid reply after 24ms
6relayd[15199]: No relays enabled or no slave interfaces specified. stopped.
odhcp6c[15196]: Sending REQUEST (timeout 4294967295s)
odhcp6c[15196]: Got a valid reply after 25ms
odhcp6c[15196]: State for pppoe-wan changed to bound
odhcp6c[15196]: Sending <POLL> (timeout 4294967295s)
Command failed: Not found
^C
odhcp6c[15196]: State for pppoe-wan changed to unbound
odhcp6c[15196]: Sending RELEASE (timeout 3s)
odhcp6c[15196]: State for pppoe-wan changed to stopped
root@xxxx:~# Command failed: Not found

I've run tcpdump to verify the lease times. My ISP uses the following:
Preferred lifetime: 43200
Valid lifetime: 86400

Change History (24)

comment:1 Changed 5 years ago by cyrus

Cannot really reproduce this error. Restarting wan6 or wan works fine for me. Maybe it is something specific to your ISP?

Are there any more odhcp6c-related log entries when the reconnect happens?
Is there any REQUEST / REPLY exchange after the SOLICIT and the received reply?

The high timeout value for SOLICIT is OK, by the way.
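
As a quick arithmetic check, the timeout value in the logs is exactly 2^32 - 1, the largest unsigned 32-bit integer. Reading it as a "no timeout" sentinel is an assumption that is consistent with the remark above; it has not been checked against the odhcp6c source. The sketch assumes a shell with 64-bit arithmetic.

```shell
# Largest unsigned 32-bit value; equals the SOLICIT/REQUEST timeout in the logs.
UINT32_MAX=$(( (1 << 32) - 1 ))
echo "$UINT32_MAX"    # 4294967295
```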

comment:2 Changed 5 years ago by cyrus

  • Owner changed from developers to cyrus
  • Status changed from new to assigned

comment:3 Changed 5 years ago by anonymous

Today I got home to find IPv6 connectivity lost again. This time br-lan still holds my valid global prefix, yet neither the router itself nor the LAN clients can reach the IPv6 internet.

Previous times (although after waiting longer), I've seen br-lan also lose its global prefix.

The most informative output I can find is the following:

#ip -6 route show
2axx:x80:xxxx:xxxx::/64 dev br-lan  proto kernel  metric 256  expires 86195sec
unreachable 2axx:x80:xxxx:xxxx::/56 dev lo  proto static  metric 2147483647  error -128
fdax::/64 dev br-lan  proto kernel  metric 256
unreachable fdax::/48 dev lo  proto static  metric 2147483647  error -128
fe80::/64 dev eth0  proto kernel  metric 256
fe80::/64 dev eth0.2  proto kernel  metric 256
fe80::/64 dev br-lan  proto kernel  metric 256
fe80::/64 dev wlan1.sta1  proto kernel  metric 256
fe80::/10 dev pppoe-wan  metric 1
fe80::/10 dev pppoe-wan  proto kernel  metric 256

Look at the routing metric and error.

comment:4 Changed 5 years ago by anonymous

I haven't been able to make a packet capture when things go wrong. I don't know how to reproduce it and I never seem to be home when it happens. The interval of about 24 hours between interface up and the problem occurring does seem to remain consistent, though.

comment:5 Changed 5 years ago by anonymous

Additionally, when attempting to fix this, running "/etc/init.d/network restart" once didn't bring IPv6 connectivity back. After running it a second time, however, it works. Strange things...

comment:6 Changed 5 years ago by anonymous

Actually, running "/etc/init.d/network restart" twice only fixed IPv6 connectivity on the router itself; LAN clients are still broken.

comment:7 Changed 5 years ago by anonymous

I'll leave ipv6 connectivity in broken state for now and maybe you can tell me what I can do to help debug this.

comment:8 Changed 5 years ago by anonymous

It turns out that if I just replug the WAN port, the result is the same. After plugging the WAN port in again, the PPP connection is restored with working IPv4, but for IPv6 the default route has been lost, so there is no IPv6 access.

comment:9 Changed 5 years ago by cyrus

Should be fixed now (r35345).

comment:10 Changed 5 years ago by cyrus

  • Resolution set to fixed
  • Status changed from assigned to closed

comment:11 Changed 5 years ago by GrimDemon <fithox@…>

  • Resolution fixed deleted
  • Status changed from closed to reopened

I have a normal DHCP/DHCPv6 connection and I have the same problem as the OP (r35412).

comment:12 Changed 5 years ago by cyrus

  • Resolution set to fixed
  • Status changed from reopened to closed

There have been major changes to the mentioned subsystems after your revision. Please retry with trunk, and if the problem persists, please post the output of the addresses on the router, the routing table and any relevant logs.
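
The information requested above (addresses, routing table, relevant logs) could be gathered in one go with a small helper like the sketch below. The function name and the log filter pattern are assumptions for illustration; `ip` and `logread` are the same tools used elsewhere in this ticket.

```shell
# Hypothetical helper: print IPv6 addresses, routes and related log lines.
collect_ipv6_diag() {
    echo "== addresses =="
    ip -6 addr show
    echo "== routes =="
    ip -6 route show
    echo "== odhcp6c / netifd log =="
    # grep exits non-zero when nothing matches; do not treat that as an error.
    logread | grep -E 'odhcp6c|netifd' || true
}
```

Running `collect_ipv6_diag` while connectivity is broken, and again after `ifup wan6`, would give two snapshots that can be compared side by side.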

comment:13 Changed 5 years ago by Viper <viper0508@…>

  • Resolution fixed deleted
  • Status changed from closed to reopened

Hello, same problem here, using trunk r36641. About 24 hours (maybe less) after the first connect, IPv6 connectivity is lost on the clients and on the router itself. Issuing ifdown/ifup on wan6 restores connectivity:

root@gw-rds:~# ip -6 route
2axx:2fxx:xxe0:8b::/64 dev br-lan  proto kernel  metric 256
unreachable 2axx:2fxx:xxe0:8b::/64 dev lo  proto static  metric 2147483647  error -128
fe80::/64 dev br-lan  proto kernel  metric 256
fe80::/64 dev eth1  proto kernel  metric 256
fe80::/10 dev pppoe-wan  metric 1
fe80::/10 dev pppoe-wan  proto kernel  metric 256
root@gw-rds:~# ifdown wan6
root@gw-rds:~# ip -6 route
2axx:2fxx:xxe0:8b::/64 dev br-lan  proto kernel  metric 256
fe80::/64 dev br-lan  proto kernel  metric 256
fe80::/64 dev eth1  proto kernel  metric 256
fe80::/10 dev pppoe-wan  metric 1
fe80::/10 dev pppoe-wan  proto kernel  metric 256
root@gw-rds:~# ifup wan6
root@gw-rds:~# ip -6 route
2axx:2fxx:xxe0:8b::/64 dev br-lan  proto kernel  metric 256
unreachable 2axx:2fxx:xxe0:8b::/64 dev lo  proto static  metric 2147483647  error -128
fe80::/64 dev br-lan  proto kernel  metric 256
fe80::/64 dev eth1  proto kernel  metric 256
fe80::/10 dev pppoe-wan  metric 1
fe80::/10 dev pppoe-wan  proto kernel  metric 256
root@gw-rds:~# ping6 google.com
PING google.com (2a00:1450:400d:804::1005): 56 data bytes
64 bytes from 2a00:1450:400d:804::1005: seq=0 ttl=55 time=39.544 ms
64 bytes from 2a00:1450:400d:804::1005: seq=1 ttl=55 time=38.867 ms

May 17 10:22:40 gw-rds daemon.warn 6relayd[1484]: Termination requested by signal.
May 17 10:22:44 gw-rds daemon.notice netifd: Interface 'wan6' is now down
May 17 10:22:44 gw-rds daemon.info dnsmasq[1516]: reading /tmp/resolv.conf.auto
May 17 10:22:44 gw-rds daemon.info dnsmasq[1516]: using nameserver x.x.x.x#53
May 17 10:22:44 gw-rds daemon.info dnsmasq[1516]: using nameserver x.x.x.x#53
May 17 10:22:44 gw-rds daemon.info dnsmasq[1516]: using local addresses only for domain lan
May 17 10:22:53 gw-rds daemon.notice odhcp6c[6723]: (re)starting transaction on pppoe-wan
May 17 10:22:53 gw-rds daemon.notice odhcp6c[6723]: Sending SOLICIT (timeout 4294967295s)
May 17 10:22:53 gw-rds daemon.notice odhcp6c[6723]: Got a valid reply after 1ms
May 17 10:22:54 gw-rds daemon.notice odhcp6c[6723]: Sending REQUEST (timeout 4294967295s)
May 17 10:22:54 gw-rds daemon.notice odhcp6c[6723]: Got a valid reply after 1ms
May 17 10:22:54 gw-rds daemon.notice odhcp6c[6723]: entering stateful-mode on pppoe-wan
May 17 10:22:54 gw-rds daemon.notice odhcp6c[6723]: Sending <POLL> (timeout 2147483647s)
May 17 10:22:55 gw-rds daemon.notice netifd: Interface 'wan6' is now up
May 17 10:22:55 gw-rds user.notice firewall: Reloading firewall due to ifup of wan6 (pppoe-wan)
May 17 10:22:56 gw-rds daemon.warn 6relayd[6706]: Termination requested by signal.

comment:14 Changed 5 years ago by Viper <viper0508@…>

It looks OK now. Maybe it was some ISP issue, or it had something to do with the fact that when it first connected the time was off by about 4 days and only synced after the connect, but I couldn't reproduce it so far. If no one else has this issue anymore, I think it can be closed. Thank you.

comment:15 Changed 4 years ago by nbd

  • Resolution set to worksforme
  • Status changed from reopened to closed

comment:16 Changed 4 years ago by neutronscott

  • Resolution worksforme deleted
  • Status changed from closed to reopened

14.07-rc1 (r41580). IPv6 works out-of-box but lasts perhaps 30min.

logread shows nothing until I did "ifdown wan6 && ifup wan6" which restores 3 default routes. ip -6 addr shows a valid_lft of 501219sec. This is much more than 30min...

Thu Jul 17 21:02:46 2014 daemon.notice odhcp6c[2037]: Starting RELEASE transaction (timeout 4294967295s, max rc 5)
Thu Jul 17 21:02:46 2014 daemon.notice odhcp6c[2037]: Send RELEASE message (elapsed 0ms, rc 0)
Thu Jul 17 21:02:46 2014 daemon.notice netifd: Interface 'wan6' is now down
Thu Jul 17 21:02:46 2014 daemon.notice netifd: Interface 'wan6' is disabled
Thu Jul 17 21:02:46 2014 daemon.notice netifd: Interface 'wan6' is enabled
Thu Jul 17 21:02:46 2014 daemon.notice netifd: Interface 'wan6' is setting up now
Thu Jul 17 21:02:46 2014 daemon.notice odhcp6c[2177]: (re)starting transaction on eth1
Thu Jul 17 21:02:46 2014 daemon.notice odhcp6c[2177]: Starting SOLICIT transaction (timeout 4294967295s, max rc 0)
Thu Jul 17 21:02:47 2014 daemon.notice odhcp6c[2177]: Got a valid reply after 139ms
Thu Jul 17 21:02:48 2014 daemon.notice odhcp6c[2177]: Starting REQUEST transaction (timeout 4294967295s, max rc 10)
Thu Jul 17 21:02:48 2014 daemon.notice odhcp6c[2177]: Send REQUEST message (elapsed 0ms, rc 0)
Thu Jul 17 21:02:48 2014 daemon.notice odhcp6c[2177]: Got a valid reply after 126ms
Thu Jul 17 21:02:48 2014 daemon.notice odhcp6c[2177]: entering stateful-mode on eth1
Thu Jul 17 21:02:48 2014 daemon.notice odhcp6c[2177]: Starting <POLL> transaction (timeout 250854s, max rc 0)
Thu Jul 17 21:02:48 2014 daemon.notice netifd: Interface 'wan6' is now up
Thu Jul 17 21:02:48 2014 user.notice firewall: Reloading firewall due to ifup of wan6 (eth1)

comment:17 Changed 4 years ago by anonymous

Could it be that IPv6 forwarding=1, so the kernel doesn't send router solicitations, the 1800 sec timer is never refreshed, and the routes expire?

I can't tell how the routes are added from the shell script (eventually it makes its way to netifd, I'm guessing?).

Should odhcp6c be in charge of soliciting? I'm currently keeping a tcpdump open and watching. I will probably just run rdisc6 from crontab for the time being.
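
The crontab idea above could look roughly like the following sketch. It assumes rdisc6 is installed and that eth1 is the upstream interface (as in the log in comment 16); `WAN_IF`, `RUN` and the function name are illustrative, not netifd or odhcp6c settings. Whether the kernel acts on the resulting RA still depends on its accept_ra setting, so this is a stopgap rather than a fix.

```shell
#!/bin/sh
# Sketch only: WAN_IF and RUN are illustrative assumptions.
WAN_IF="${WAN_IF:-eth1}"   # upstream interface
RUN="${RUN:-}"             # set RUN=echo for a dry run

# Reads `ip -6 route show` output on stdin. If no line starts with
# "default", solicit a fresh router advertisement with rdisc6.
refresh_ra_if_needed() {
    if grep -q '^default'; then
        return 0
    fi
    $RUN rdisc6 "$WAN_IF"
}
```

From cron, it would be fed with `ip -6 route show | refresh_ra_if_needed` at an interval shorter than the 1800 sec timer mentioned above.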

comment:18 Changed 3 years ago by anonymous

I have the same problem with the latest trunk as of this date.

comment:19 Changed 22 months ago by anonymous

I have this problem, too. In CC 15.05!

comment:20 Changed 20 months ago by anonymous

It looks like a pppd restart (in PPPoE mode) permanently disables the ability to bring up IPv6 interfaces to the outside world.

After a reboot I have pppoe-wan and I have 6in4-he (static tunnel to Hurricane Electric) and 6to4-airnet (6to4 on static address) on that interface. Killing pppd brings all of them down and then reinstates pppoe-wan. 6in4-he and 6to4-airnet are gone, and I cannot bring them up with ifup or /etc/init.d/network restart.

Log excerpt:

Thu Jun 9 20:26:31 2016 daemon.notice netifd: Interface 'wan' is now down
Thu Jun 9 20:26:31 2016 daemon.notice netifd: Interface 'he' has lost the connection
Thu Jun 9 20:26:31 2016 daemon.notice netifd: Interface 'airnet' has lost the connection
Thu Jun 9 20:26:31 2016 daemon.notice netifd: Interface 'wan' is disabled
Thu Jun 9 20:26:31 2016 daemon.notice netifd: Interface 'wan' is enabled
Thu Jun 9 20:26:31 2016 daemon.notice netifd: Interface 'wan' is setting up now
Thu Jun 9 20:26:31 2016 daemon.notice netifd: IP tunnel '6in4-he' link is down
Thu Jun 9 20:26:31 2016 daemon.notice netifd: IP tunnel '6to4-airnet' link is down
Thu Jun 9 20:26:31 2016 daemon.notice netifd: Interface 'he' is now down
Thu Jun 9 20:26:31 2016 daemon.notice netifd: Interface 'he' is setting up now
Thu Jun 9 20:26:31 2016 daemon.notice netifd: Interface 'airnet' is now down
Thu Jun 9 20:26:31 2016 daemon.notice netifd: Interface 'airnet' is setting up now
Thu Jun 9 20:26:32 2016 daemon.info pppd[2578]: Plugin rp-pppoe.so loaded.
Thu Jun 9 20:26:32 2016 daemon.info pppd[2578]: RP-PPPoE plugin version 3.8p compiled against pppd 2.4.7
Thu Jun 9 20:26:32 2016 daemon.notice pppd[2578]: pppd 2.4.7 started by root, uid 0
Thu Jun 9 20:26:32 2016 daemon.info pppd[2578]: PPP session is 7178
Thu Jun 9 20:26:32 2016 daemon.warn pppd[2578]: Connected to 00:26:48:00:18:46 via interface eth0.127
Thu Jun 9 20:26:32 2016 kern.info kernel: [ 126.797654] pppoe-wan: renamed from ppp0
Thu Jun 9 20:26:32 2016 daemon.info pppd[2578]: Using interface pppoe-wan
Thu Jun 9 20:26:32 2016 daemon.notice pppd[2578]: Connect: pppoe-wan <--> eth0.127
Thu Jun 9 20:26:32 2016 daemon.notice netifd: Interface 'airnet' is now down
Thu Jun 9 20:26:32 2016 daemon.notice netifd: Interface 'he' is now down
Thu Jun 9 20:26:32 2016 daemon.notice pppd[2578]: CHAP authentication succeeded
Thu Jun 9 20:26:32 2016 daemon.notice pppd[2578]: peer from calling number 00:26:48:00:18:46 authorized
Thu Jun 9 20:26:32 2016 daemon.notice pppd[2578]: local LL address fe80::c026:65a7:8973:49ba
Thu Jun 9 20:26:32 2016 daemon.notice pppd[2578]: remote LL address fe80::0000:0000:00f0:1f19
Thu Jun 9 20:26:32 2016 daemon.notice pppd[2578]: local IP address 91.232.49.216
Thu Jun 9 20:26:32 2016 daemon.notice pppd[2578]: remote IP address 10.0.19.6
Thu Jun 9 20:26:32 2016 daemon.notice pppd[2578]: primary DNS address 91.232.50.10
Thu Jun 9 20:26:32 2016 daemon.notice pppd[2578]: secondary DNS address 91.232.52.10
Thu Jun 9 20:26:32 2016 daemon.notice netifd: Network device 'pppoe-wan' link is up
Thu Jun 9 20:26:32 2016 daemon.notice netifd: Interface 'wan' is now up
Thu Jun 9 20:26:33 2016 user.notice firewall: Reloading firewall due to ifup of wan (pppoe-wan)

The build is CC 15.05 or possibly newer (I built it a few months back). I will try to build from the top of the tree and see if this helps.

comment:21 Changed 18 months ago by c.prevotaux@…

I have this problem too in the latest trunk.

The interface loses its IPv6 prefix after valid_lft expires, which in theory should never happen: when an RA comes in, odhcp6c should refresh the value and not let it expire.

comment:22 Changed 18 months ago by nighty

comment:23 Changed 18 months ago by nighty

The workaround for this is:

echo 0 > /sys/devices/virtual/net/br-lan/bridge/multicast_snooping

It is an old bug that is still present in kernel 4.4.15 (at least).
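
A slightly more defensive form of the workaround above is sketched here: it only writes to the knob if it exists and is writable. The function name and the optional second argument (an alternative sysfs root, useful for trying it out off the router) are assumptions for illustration.

```shell
# Hypothetical wrapper around the multicast_snooping workaround.
# $1: bridge name (e.g. br-lan)
# $2: sysfs root, defaults to /sys (overridable for testing; an assumption)
disable_mcast_snooping() {
    bridge="$1"
    sysroot="${2:-/sys}"
    knob="$sysroot/devices/virtual/net/$bridge/bridge/multicast_snooping"
    if [ -w "$knob" ]; then
        echo 0 > "$knob"
    else
        echo "no writable multicast_snooping knob for $bridge" >&2
        return 1
    fi
}
```

Calling `disable_mcast_snooping br-lan` has the same effect as the one-liner. The setting does not survive a reboot, so it would need to be reapplied at startup, for example from /etc/rc.local (one option, not something this ticket prescribes).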
