Opened 7 years ago

Last modified 2 years ago

#8701 new defect

bridging two VLANs "swallows" IP packets on bridge

Reported by: sborilla Owned by: developers
Priority: high Milestone: Chaos Calmer 15.05
Component: packages Version: Backfire 10.03.1 RC4
Keywords: VLAN bridge TL-WR1043ND IP packet loss Cc:

Description

I'm trying to bridge two VLANs on a TL-WR1043ND.
The /etc/config/network file looks like this:

config 'interface' 'loopback'
    option 'ifname' 'lo'
    option 'proto' 'static'
    option 'ipaddr' '127.0.0.1'
    option 'netmask' '255.0.0.0'

config 'interface' 'lan'
    option 'ifname' 'eth0.1'
    option 'type' 'bridge'
    option 'proto' 'static'
    option 'ipaddr' '192.168.1.1'
    option 'netmask' '255.255.255.0'

config 'interface' 'wan'
    option 'ifname' 'eth0.2'
    option 'proto' 'dhcp'

config 'switch'
    option 'name' 'rtl8366rb'
    option 'reset' '1'
    option 'enable_vlan' '1'

config 'switch_vlan'
    option 'device' 'rtl8366rb'
    option 'vlan' '1'
    option 'ports' '3 4 5t'

config 'switch_vlan'
    option 'device' 'rtl8366rb'
    option 'vlan' '2'
    option 'ports' '0 5t'

config 'switch_vlan'
    option 'device' 'rtl8366rb'
    option 'vlan' '3'
    option 'ports' '1 5t'

config 'switch_vlan'
    option 'device' 'rtl8366rb'
    option 'vlan' '4'
    option 'ports' '2 5t'

config 'interface' 'sniffer'
    option 'type' 'bridge'
    option 'ifname' 'eth0.3 eth0.4'
    option 'defaultroute' '0'
    option 'proto' 'none'
    option 'peerdns' '0'

The defect:
While STP packets and all other broadcasts (including ARP broadcasts) flow correctly between both VLANs (on eth0.3 and eth0.4), IP packets that are directed at specific IP addresses (like DHCP requests or ICMP ping responses) are stranded on the bridge.

A tcpdump shows that they arrive correctly at the bridge (br-sniffer), but they never leave the device again and never reach their destination.

Also adding a rule of

iptables -A FORWARD -i eth0.3 -o eth0.4 -j ACCEPT
iptables -A FORWARD -i eth0.4 -o eth0.3 -j ACCEPT

does not change the situation at all.
For whatever reason, those packets never leave the bridge again.
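
A note on why these rules may have no effect: iptables FORWARD rules only see bridged frames when the kernel's bridge netfilter hook is switched on (the bridge-nf-call-iptables sysctl), and no iptables rule can help if the frames are lost inside the switch chip rather than in the kernel. A minimal sketch for checking that setting follows. The function name bridge_nf_state is made up for illustration, and the default path is the usual sysctl location, assumed to exist on the build in use.

```shell
#!/bin/sh
# bridge_nf_state [FILE]: report whether iptables is consulted for bridged IPv4 frames.
# FILE defaults to the usual sysctl path (an assumption about the running kernel);
# the file is absent when bridge netfilter support is not loaded.
bridge_nf_state() {
    f=${1:-/proc/sys/net/bridge/bridge-nf-call-iptables}
    if [ ! -r "$f" ]; then
        echo unavailable
    elif [ "$(cat "$f")" = 1 ]; then
        echo enabled
    else
        echo disabled
    fi
}
```

Calling bridge_nf_state with no argument on the router prints one of enabled, disabled or unavailable. Only in the enabled case would the FORWARD rules above be evaluated for frames crossing br-sniffer.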

I was at first assuming a configuration mistake, but must now assume a defect in the system.
See https://forum.openwrt.org/viewtopic.php?id=28218

Change History (17)

comment:1 Changed 7 years ago by sborilla

Clarification:
The phrase "(like DHCP requests or ICMP ping responses)" should read "(like DHCP responses or ICMP ping responses)".

The DHCP requests are certainly broadcasts as well, and like all broadcasts, they do work using the above configuration. It's just that they never get a response from the DHCP server, because, as said, the bridge "swallows" them.

comment:2 Changed 7 years ago by sborilla

One more addition:

This seems to be caused by the switch driver of the rtl8366rb.
I tried the very same configuration on a Broadcom device with Broadcom's BCM5325 roboswitch, and there it worked.

This convinces me that the configuration described above should work, and that the failure is caused by buggy Realtek switch drivers.

comment:3 Changed 7 years ago by sKAApGIF <skaapgif@…>

I can confirm that bridging two VLANs does not work. PPP frames also do not cross the bridge. This also did not work on a DIR-300 with an IC+ IP175C switch, similar to #8653.

comment:4 Changed 7 years ago by sKAApGIF <skaapgif@…>

#7955 sounds like a similar problem.

comment:5 follow-up: Changed 7 years ago by m+openwrt@…

Same problem on a WZR-HP-G300NH with an rtl8366s switch.

comment:6 in reply to: ↑ 5 Changed 7 years ago by anonymous

Same problem on a WZR-HP-G301NH with a rtl8366rb switch.

comment:7 Changed 6 years ago by martin.faecknitz@…

Your problem is not a bug. The solution is:

config 'switch_vlan'
    option 'device' 'rtl8366rb'
    option 'fid' '0'
    option 'vlan' '3'
    option 'ports' '1 5t'

config 'switch_vlan'
    option 'device' 'rtl8366rb'
    option 'fid' '1'
    option 'vlan' '4'
    option 'ports' '2 5t'

Needed patch:

--- package/swconfig/files/switch.sh 2011-03-17 02:12:16.000000000 +0100
+++ package/swconfig/files/switch.sh 2012-02-14 19:06:54.000000000 +0100
@@ -6,6 +6,19 @@
        name="${name:-$1}"
        [ -d "/sys/class/net/$name" ] && ifconfig "$name" up
        swconfig dev "$name" load network
+
+       config_foreach fid_switch_vlan switch_vlan
+}
+
+fid_switch_vlan() {
+       local fid vlan
+
+       config_get fid "$1" fid
+       config_get vlan "$1" vlan
+
+       if [[ -n "$fid" ]]; then
+               swconfig dev $name vlan $vlan set fid $fid
+       fi
 }
 
 setup_switch() {

Rebuild OpenWrt and everything should work now.

comment:8 follow-up: Changed 6 years ago by mwarning

@martin: I have the same problem and tried your patch on my TL-WR841ND (with a slightly different switch setup), but it didn't work. Is the 'fid' maybe related to the 'vlan' value?

These are the swconfig calls done by your patch:

swconfig dev eth0 vlan 1 set fid 0
swconfig dev eth0 vlan 2 set fid 1

comment:9 in reply to: ↑ 8 Changed 6 years ago by anonymous

Replying to mwarning:
I guess something has changed in OpenWrt in the meantime. The correct way should be

fid_switch_vlan() {
        local fid vlan device

        # Read the per-VLAN options from the switch_vlan section "$1"
        config_get fid "$1" fid
        config_get vlan "$1" vlan
        config_get device "$1" device

        # Only touch VLANs that explicitly set a fid
        if [ -n "$fid" ]; then
                swconfig dev "$device" vlan "$vlan" set fid "$fid"
        fi
}

which calls swconfig dev rtl8366rb vlan 1 set fid 0 ...

comment:10 Changed 4 years ago by stefan@…

Is it possible by now to specify the "fid" using /etc/config/network?

Are there any downsides to the patches above that prevent them from being included in OpenWrt?

comment:11 Changed 4 years ago by jow

  • Milestone changed from Backfire 10.03.2 to Chaos Calmer (trunk)

Milestone Backfire 10.03.2 deleted

comment:12 Changed 2 years ago by mnlipp

The problem is not solved.

I have installed 15.05 on a Buffalo WZR-HP-G450H and experience the problem described here. I can reliably reproduce that DHCP responses and ARP replies are lost when I add a second VLAN to the bridge.

I tried to use the mentioned workaround by setting "fid"s, but

swconfig dev switch0 vlan 1 set fid 0

just outputs: Unknown attribute "fid"
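
Before relying on the workaround, it may help to check whether the switch driver exposes a "fid" attribute at all. The sketch below assumes that "swconfig dev switch0 help" lists the attributes the driver supports; the helper name has_attr is hypothetical and simply searches that text for the attribute name as a whole word.

```shell
#!/bin/sh
# has_attr NAME: succeed if the text on stdin mentions NAME as a whole word.
# Intended to be fed the output of "swconfig dev <device> help"
# (assumption: that output lists the attributes the driver supports).
has_attr() {
    grep -qw -- "$1"
}
```

On the router this could be used as: swconfig dev switch0 help | has_attr fid && echo "fid is available". If nothing is printed, the "set fid" call can be expected to fail as shown above.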

comment:13 follow-up: Changed 2 years ago by jow

The problem cannot be solved except on switches having multiple forward tables. Yours does not.
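
One plausible way to picture this limitation is a toy model of the switch's MAC learning. It is not driver code and it ignores VLAN port membership; the host names a and b, the function names and the MODE variable are all made up for illustration. With a single shared table, the switch relearns host a's MAC on the CPU port when the bridge re-injects a's broadcast, so the unicast reply later appears to be destined for the port it just arrived on and is dropped. With one table per VLAN (what "fid" selects), the reply is delivered.

```shell
#!/bin/sh
# Toy model of a switch MAC table. Entries live in variables fdb_<fid>_<mac>.

# fid_of VLAN: which forwarding table a VLAN uses.
# MODE=shared: every VLAN uses table 0. MODE=independent: one table per VLAN.
fid_of() {
    if [ "$MODE" = shared ]; then echo 0; else echo "$1"; fi
}

# switch_rx VLAN IN_PORT SRC DST: learn SRC on IN_PORT, print the egress decision.
switch_rx() {
    fid=$(fid_of "$1")
    eval "fdb_${fid}_$3=\$2"          # learn source MAC on the ingress port
    eval "out=\${fdb_${fid}_$4:-}"    # look up the destination MAC
    if [ -z "$out" ]; then
        echo flood                     # unknown or broadcast destination
    elif [ "$out" = "$2" ]; then
        echo drop                      # never send a frame back out its ingress port
    else
        echo "port$out"
    fi
}

# Host a on port 1 (VLAN 3), host b on port 2 (VLAN 4), CPU bridge on port 5.
run() {
    MODE=$1
    unset fdb_0_a fdb_0_b fdb_3_a fdb_3_b fdb_4_a fdb_4_b
    switch_rx 3 1 a bcast >/dev/null   # a's broadcast enters on VLAN 3
    switch_rx 4 5 a bcast >/dev/null   # CPU bridge re-injects it on VLAN 4
    switch_rx 4 2 b a >/dev/null       # b's unicast reply is sent towards the CPU
    switch_rx 3 5 b a                  # CPU bridge re-injects the reply on VLAN 3
}

run shared        # prints: drop
run independent   # prints: port1
```

In the shared case the broadcasts still get through because they are flooded rather than looked up, which matches the symptom in the description: broadcasts work while unicast replies vanish after reaching the bridge.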

comment:14 in reply to: ↑ 13 Changed 2 years ago by mnlipp

Replying to jow:

The problem cannot be solved except on switches having multiple forward tables. Yours does not.

Okay, but then it would be a good idea to mention that on the router's page (https://wiki.openwrt.org/toh/buffalo/wzr-hp-g450h). After all, my switch hardware IS mentioned on the page that explains swconfig (https://wiki.openwrt.org/doc/techref/swconfig). I refreshed some knowledge and gained some more, but nevertheless it took me the best part of the last two days to finally understand the problem and get to this bug report.

Don't get me wrong, I'm very grateful for the great work people have put into OpenWrt, but some hint would have been really helpful here.

comment:15 follow-up: Changed 2 years ago by jow

The model-specific wiki page would be the wrong place to put such a notice, as this hardware limitation affects most models.

comment:16 in reply to: ↑ 15 Changed 2 years ago by mnlipp

Replying to jow:

The model-specific wiki page would be the wrong place to put such a notice, as this hardware limitation affects most models.

So, where can I find the information? For example, if I wanted to buy a new router that supports bridging VLANs?

comment:17 Changed 2 years ago by Brain2000

My WNDR3700 supports setting the fid with swconfig. However, it appears that "option fid" in /etc/config/network is not supported yet as of 15.05. I manually applied the patch above, and it set the fid properly.
