Opened 5 years ago

Last modified 3 years ago

#13442 reopened defect

No link on eth1 (wan) if ethernet is connected after boot

Reported by: Robert Grønning <slimg@…>
Owned by: developers
Priority: high
Milestone: Barrier Breaker 14.07
Component: kernel
Version: Trunk
Keywords: link boot wan
Cc: it@…

Description

When I boot my D-Link DIR-825, it checks for link on eth1 (wan). If there is link, it continues and all is well in the world.

If however, it does not get link during boot, it seems to disable eth1.

To get it working again, I either have to:

  1. Unplug the ethernet cable, and then plug it in again.

or

  1. Run "mii-tool --restart eth1"

I have 28 of these boxes, and many of them do not come back to life after power failures because the switch they are connected to boots more slowly than the DIR-825, so it is not ready to provide link when the DIR-825 negotiates for it during boot.
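
For unattended boxes, the second workaround could be automated by polling for link after boot and re-triggering autonegotiation while none is seen. The following is only a minimal sketch in Python, assuming Python is available on the device (a stock OpenWrt image would more likely use a small shell script) and that the kernel exposes link state in /sys/class/net/eth1/carrier. The names `ensure_link`, `read_carrier` and `restart_autoneg`, and the retry count and delay, are hypothetical; the only command taken from this report is "mii-tool --restart eth1".

```python
import subprocess
import time


def read_carrier(iface):
    """Return True if sysfs reports carrier (link) on iface."""
    try:
        with open(f"/sys/class/net/{iface}/carrier") as f:
            return f.read().strip() == "1"
    except OSError:
        # Reading carrier fails if the interface is down or missing.
        return False


def restart_autoneg(iface):
    """Re-trigger autonegotiation, as in the manual workaround."""
    subprocess.run(["mii-tool", "--restart", iface], check=False)


def ensure_link(iface, attempts=10, delay=30.0,
                has_link=read_carrier, restart=restart_autoneg,
                sleep=time.sleep):
    """Poll for link; restart autonegotiation whenever none is seen.

    Returns the number of restarts issued before link appeared,
    or -1 if link never came up.
    """
    restarts = 0
    for _ in range(attempts):
        if has_link(iface):
            return restarts
        restart(iface)
        restarts += 1
        sleep(delay)
    # One last check so a link that came up after the final restart counts.
    return restarts if has_link(iface) else -1
```

Calling `ensure_link("eth1")` from a boot script would keep retrying for roughly five minutes with these defaults, which is meant to cover a switch that boots more slowly than the router. The check, restart and sleep functions are parameters so the loop can be exercised without real hardware.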

I've tried connecting eth1 to a Netgear Switch, HP Procurve Switch and my Intel desktop NIC without any difference in behavior.

I've tried the following firmwares, and all of them have the same behavior as described above.

  • OpenWRT Backfire 10.03.1 for DIR 825 without any config changes.
  • OpenWRT AA 12.09 for DIR 825 without any config changes.
  • OpenWRT BB R36083 for DIR 825 with config changes.

Attached is the dmesg output after booting without the ethernet cable plugged in. The last two lines appeared once I had plugged in the ethernet cable for the second time.

I found something related to ethernet speed when looking at ethtool output while plugging the cables in and out.

Here is the output of "ethtool eth1" after booting without eth1 connected:

Settings for eth1:
        Supported ports: [ TP MII ]
        Supported link modes:   10baseT/Half 10baseT/Full 
                                100baseT/Half 100baseT/Full 
                                1000baseT/Half 1000baseT/Full 
        Supported pause frame use: No
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full 
                                100baseT/Half 100baseT/Full 
                                1000baseT/Half 1000baseT/Full 
        Advertised pause frame use: No
        Advertised auto-negotiation: Yes
        Speed: 10Mb/s
        Duplex: Half
        Port: MII
        PHYAD: 4
        Transceiver: external
        Auto-negotiation: on
        Current message level: 0x000000ff (255)
                               drv probe link timer ifdown ifup rx_err tx_err
        Link detected: no

Here is the output after eth1 has been connected:

Settings for eth1:
        Supported ports: [ TP MII ]
        Supported link modes:   10baseT/Half 10baseT/Full 
                                100baseT/Half 100baseT/Full 
                                1000baseT/Half 1000baseT/Full 
        Supported pause frame use: No
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full 
                                100baseT/Half 100baseT/Full 
                                1000baseT/Half 1000baseT/Full 
        Advertised pause frame use: No
        Advertised auto-negotiation: Yes
        Speed: 100Mb/s
        Duplex: Full
        Port: MII
        PHYAD: 4
        Transceiver: external
        Auto-negotiation: on
        Current message level: 0x000000ff (255)
                               drv probe link timer ifdown ifup rx_err tx_err
        Link detected: no

The output after eth1 has been disconnected is the same as the first output after boot without ethernet connected.

Here is the output after eth1 has been connected the second time:

Settings for eth1:
        Supported ports: [ TP MII ]
        Supported link modes:   10baseT/Half 10baseT/Full 
                                100baseT/Half 100baseT/Full 
                                1000baseT/Half 1000baseT/Full 
        Supported pause frame use: No
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full 
                                100baseT/Half 100baseT/Full 
                                1000baseT/Half 1000baseT/Full 
        Advertised pause frame use: No
        Advertised auto-negotiation: Yes
        Speed: 1000Mb/s
        Duplex: Full
        Port: MII
        PHYAD: 4
        Transceiver: external
        Auto-negotiation: on
        Current message level: 0x000000ff (255)
                               drv probe link timer ifdown ifup rx_err tx_err
        Link detected: yes

What I can see is that "Speed", "Duplex" and "Link detected" are the only values that change.
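
The comparison can also be done mechanically instead of by eye. Below is a minimal sketch in Python that parses "ethtool eth1" output into fields and lists the ones that differ. The function names `parse_ethtool` and `diff_fields` are made up for illustration, and the two sample strings are trimmed copies of the first and last outputs above.

```python
def parse_ethtool(text):
    """Parse `ethtool <iface>` output into a {field: value} dict.

    Continuation lines (no colon, e.g. extra link modes) are appended
    to the previous field's value.
    """
    fields = {}
    last = None
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("Settings for"):
            continue
        key, sep, value = stripped.partition(":")
        if sep:
            last = key.strip()
            fields[last] = value.strip()
        elif last is not None:
            fields[last] += " " + stripped
    return fields


def diff_fields(before, after):
    """Return {field: (old, new)} for every field whose value differs."""
    keys = before.keys() | after.keys()
    return {k: (before.get(k), after.get(k))
            for k in sorted(keys) if before.get(k) != after.get(k)}


# Trimmed samples: after boot without cable, and after the second plug-in.
BOOT_NO_CABLE = """\
Settings for eth1:
        Speed: 10Mb/s
        Duplex: Half
        Auto-negotiation: on
        Link detected: no
"""

SECOND_PLUG = """\
Settings for eth1:
        Speed: 1000Mb/s
        Duplex: Full
        Auto-negotiation: on
        Link detected: yes
"""

changes = diff_fields(parse_ethtool(BOOT_NO_CABLE), parse_ethtool(SECOND_PLUG))
for field, (old, new) in changes.items():
    print(f"{field}: {old} -> {new}")
```

Run against the full outputs, the same comparison would also pick up the first plug-in case, where the speed is 100Mb/s with full duplex while "Link detected" still says no.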

The ethernet port I'm connecting to at the other end is an auto-negotiating 100Mb/1000Mb port.

Attachments (1)

dmesg (11.9 KB) - added by Robert Grønning <slimg@…> 5 years ago.


Change History (19)

Changed 5 years ago by Robert Grønning <slimg@…>

comment:1 Changed 5 years ago by raver@…

I have the same problem with my Netgear WNDR3700v2, tested with Trunk r36500.

comment:2 Changed 5 years ago by raver@…

I noticed that this issue only happens with gigabit devices.

comment:3 Changed 5 years ago by Robert Grønning <slimg@…>

The driver for this NIC seems to be "ag71xx":

$ ethtool -i eth1
driver: ag71xx
version: 0.5.35
firmware-version: 
bus-info: ag71xx.1
supports-statistics: no
supports-test: no
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: no

comment:4 follow-up: Changed 5 years ago by Robert Grønning <slimg@…>

It seems possible to compile OpenWRT with debugging enabled in the ag71xx driver:

https://dev.openwrt.org/browser/trunk/target/linux/ar71xx/files/drivers/net/ag71xx/Kconfig?rev=19032

But I do not know how to set this particular flag, can someone instruct me on where to find this in "make menuconfig", or what line in the makefile I need to edit before compiling?

comment:5 in reply to: ↑ 4 Changed 5 years ago by raver@…

Replying to Robert Grønning <slimg@…>:

But I do not know how to set this particular flag, can someone instruct me on where to find this in "make menuconfig", or what line in the makefile I need to edit before compiling?

I think you must run 'make kernel_menuconfig' and go to Device Drivers -> Network device support -> Ethernet driver support.
BTW do you have issues connecting to 10/100 ports?

comment:6 Changed 5 years ago by Robert Grønning <slimg@…>

I have no issue like this with the four 10/100 ports.

The Atheros AR7161 seems to have two built-in Gbit ports. I assume one of them (eth0?) is permanently connected to the internal RTL8366S switch in the DIR-825, and thus won't have the problem of no link, and the other (eth1?) is the port marked "WAN" on the DIR-825 box.

I've compiled a new firmware with the debugging flags AG71XX_DEBUG and AG71XX_DEBUG_FS. I'll try this image tomorrow and output whatever I discover here.

comment:7 Changed 5 years ago by raver@…

Replying to Robert Grønning <slimg@…>:

I have no issue like this with the four 10/100 ports.

I guess I was too vague about gigabit and non-gigabit ports. The problem only occurs when connecting the router WAN port (eth1) to another gigabit port.
Good luck with the debugging. I'll try to do some too.

comment:8 follow-up: Changed 5 years ago by Robert Grønning <slimg@…>

I can't see any extra output from AG71XX_DEBUG in dmesg, and AG71XX_DEBUG_FS seems to only output NIC statistics.

Anyone got any clues or help?

comment:9 in reply to: ↑ 8 Changed 5 years ago by raver@…

Replying to Robert Grønning <slimg@…>:

I can't see any extra output from AG71XX_DEBUG in dmesg, and AG71XX_DEBUG_FS seems to only output NIC statistics.

Got stuck here too.

comment:10 Changed 5 years ago by Robert Grønning <slimg@…>

I still struggle with having to fix the routers that were connected to 1Gb switches during power cuts. Can anyone help?

comment:11 Changed 5 years ago by Robert Grønning <slimg@…>

Bump

comment:12 Changed 5 years ago by Robert Grønning <slimg@…>

I still struggle with having to fix the routers that were connected to 1Gb switches during power cuts. Can anyone help?

comment:13 Changed 4 years ago by Robert Grønning <slimg@…>

Bump

comment:14 Changed 4 years ago by jow

  • Milestone changed from Attitude Adjustment 12.09 to Barrier Breaker 14.07

Milestone Attitude Adjustment 12.09 deleted

comment:15 Changed 3 years ago by chipmanly@…

I was experiencing a similar issue with a different router (TP-Link Archer C7 AC1750) that has the same NIC driver (ag71xx). This thread helped me resolve the issue. I took your second workaround (in my case: mii-tool --restart eth0) and included it in my /etc/rc.local. Thanks for pointing me in the right direction.

comment:16 Changed 3 years ago by nbd

please try current trunk (with a clean kernel)

comment:17 Changed 3 years ago by nbd

  • Resolution set to no_response
  • Status changed from new to closed

comment:18 Changed 3 years ago by roysjosh@…

  • Resolution no_response deleted
  • Status changed from closed to reopened

I see this on the latest BB, clean build. WRT400N. This leads to dmesg like:

[  667.200000] ar71xx: pll_reg 0xb8050014: 0x1099
[  667.200000] eth1: link up (100Mbps/Full duplex)
[  669.200000] eth1: link down
[  670.200000] ar71xx: pll_reg 0xb8050014: 0x1099
[  670.200000] eth1: link up (100Mbps/Full duplex)
[  672.200000] eth1: link down
[  677.200000] ar71xx: pll_reg 0xb8050014: 0x1099
[  677.200000] eth1: link up (100Mbps/Full duplex)
[  679.200000] ar71xx: pll_reg 0xb8050014: 0x991099
[  679.200000] eth1: link up (10Mbps/Full duplex)

You can see it tries everything but 1000Mbps...
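
To check a longer log for the same pattern, the link messages can be extracted and the negotiated speeds collected. This is a rough sketch in Python; the regular expressions are written against the message format shown in the dmesg excerpt above and may not match other kernel versions, and the names `link_events` and `speeds_seen` are made up for illustration.

```python
import re

# Patterns follow the "ethX: link up (...)" / "ethX: link down" lines above.
LINK_UP = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+(\w+): link up \((\d+)Mbps/(\w+) duplex\)")
LINK_DOWN = re.compile(r"\[\s*(\d+\.\d+)\]\s+(\w+): link down")


def link_events(dmesg, iface="eth1"):
    """Return [(timestamp, speed_mbps)] for iface; speed None means link down."""
    events = []
    for line in dmesg.splitlines():
        up = LINK_UP.search(line)
        if up and up.group(2) == iface:
            events.append((float(up.group(1)), int(up.group(3))))
            continue
        down = LINK_DOWN.search(line)
        if down and down.group(2) == iface:
            events.append((float(down.group(1)), None))
    return events


def speeds_seen(events):
    """Return the sorted set of speeds at which link came up."""
    return sorted({speed for _, speed in events if speed is not None})


# The excerpt from this comment, copied verbatim.
DMESG = """\
[  667.200000] ar71xx: pll_reg 0xb8050014: 0x1099
[  667.200000] eth1: link up (100Mbps/Full duplex)
[  669.200000] eth1: link down
[  670.200000] ar71xx: pll_reg 0xb8050014: 0x1099
[  670.200000] eth1: link up (100Mbps/Full duplex)
[  672.200000] eth1: link down
[  677.200000] ar71xx: pll_reg 0xb8050014: 0x1099
[  677.200000] eth1: link up (100Mbps/Full duplex)
[  679.200000] ar71xx: pll_reg 0xb8050014: 0x991099
[  679.200000] eth1: link up (10Mbps/Full duplex)
"""

events = link_events(DMESG)
print("speeds negotiated:", speeds_seen(events))
print("reached 1000Mbps:", 1000 in speeds_seen(events))
```

Feeding it the output of a full `dmesg` run instead of the embedded excerpt would show whether a gigabit link is ever reported on the interface.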
