
Opened 2 years ago

Closed 2 years ago

#20313 closed defect (worksforme)

mac80211 from 2015-07-21 causes DUPs

Reported by: anonymous Owned by: developers
Priority: normal Milestone:
Component: packages Version: Trunk
Keywords: Cc:

Description

Version r46435 works fine; the problem started with r46436.

An adhoc + batman-adv setup (the problem also occurs with other routing protocols) was tested with the new version on different ath9k-based hardware (ar933x/ar934x). Packet loss was higher than before, latency increased a lot from time to time, and duplicated packets appeared at the same time (this can be tested with simple pings).
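
For anyone trying to reproduce this, here is a minimal sketch for counting duplicates in a ping run. It assumes the ping implementation marks duplicate replies with "DUP!" in its output (iputils and busybox ping do, as far as I know). count_dups is a hypothetical helper name, and 192.168.2.3 is just the meshnet address of node 2 from the configs further down.

```shell
# Hypothetical helper: count ping replies that are flagged as duplicates.
# Reads ping output on stdin and prints the number of lines containing "DUP!".
count_dups() {
    # grep -c exits non-zero when nothing matches; "|| true" keeps the
    # helper usable in scripts running under "set -e".
    grep -c 'DUP!' || true
}

# Example (run on a node or on the laptop):
#   ping -c 100 192.168.2.3 | count_dups
```

Comparing this count between r46435 and r46436 with the same number of pings gives a rough number instead of an impression.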

I suspect that the problem is related to the patch package/kernel/mac80211/patches/303-ath9k-add-fast-xmit-support.patch, which was hidden in the same OpenWrt commit as the upgrade to mac80211 2015-07-21. It is still possible that a different change causes this problem, but at least this change enabled a completely different tx path.

Could such extra patches be added in separate commits next time to make bisecting easier? This also applies to patches like:

  • package/kernel/mac80211/patches/303-ath9k-add-fast-xmit-support.patch
  • package/kernel/mac80211/patches/304-ath9k-remove-struct-ath_atx_ac.patch
  • package/kernel/mac80211/patches/305-ath9k-remove-the-sched-field-in-struct-ath_atx_tid.patch
  • package/kernel/mac80211/patches/306-mac80211-Deinline-rate_control_rate_init-rate_contro.patch
  • package/kernel/mac80211/patches/307-mac80211-Deinline-drv_sta_state.patch
  • package/kernel/mac80211/patches/308-ath9k-Fix-NF-CCA-limits-for-AR9287-and-AR9227.patch

A similar setup was also tested, which may be related to the same problem:

Laptop (eth0; MAC 00:aa:aa:aa:aa:01)
        |
    ethernet
        |
1. AR933x device with OpenWrt, eth0, MAC 00:aa:aa:aa:aa:99; wlan0: 00:aa:aa:aa:aa:02
        |
      adhoc
        |
2. AR933x device with OpenWrt, wlan0: 00:aa:aa:aa:aa:99

This should not work because the eth0 of the laptop has a different MAC address than the adhoc interface on the first AR933x device. Still, packets were transmitted (DUPs seen) with r46436. Packets were correctly blocked (as expected) with r46435.

Attachments (1)

.config (93.4 KB) - added by anonymous 2 years ago.
r46436 openwrt config


Change History (8)

comment:1 Changed 2 years ago by nbd

Can you try to isolate whether the problems are caused by the transmitter or the receiver side (by updating them individually)?

comment:2 Changed 2 years ago by anonymous

Same problem here. Only the sender device was updated and the problem still happened.

comment:3 Changed 2 years ago by nbd

Please show me your wireless config. Also please test if the problems still occur if you remove patches 303-305
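
One possible way to run the test without patches 303-305 is to move them out of the patch directory before rebuilding. This is only a sketch: stash_patches is a hypothetical helper, not part of the OpenWrt build system, and the stash directory is an arbitrary choice. Later patches in the series may no longer apply once these are gone.

```shell
# Hypothetical helper: move patches with the given numeric prefixes
# out of a patch directory so the build no longer applies them.
# Usage: stash_patches <patch-dir> <stash-dir> <prefix>...
stash_patches() {
    patch_dir=$1
    stash_dir=$2
    shift 2
    mkdir -p "$stash_dir" || return 1
    for prefix in "$@"; do
        for f in "$patch_dir/$prefix"-*.patch; do
            # An unmatched glob stays literal; skip it.
            [ -e "$f" ] || continue
            mv "$f" "$stash_dir/" || return 1
        done
    done
    return 0
}
```

From the top of the OpenWrt tree this would be called as: stash_patches package/kernel/mac80211/patches /tmp/mac80211-stash 303 304 305, followed by make package/kernel/mac80211/clean and make package/kernel/mac80211/compile. Moving the files back restores the original state.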

comment:4 Changed 2 years ago by nbd

also, please try r47042

Changed 2 years ago by anonymous

r46436 openwrt config

comment:5 Changed 2 years ago by anonymous

I am currently trying to reproduce the report from Krishna on the battlemesh
without success. Maybe it was easier to trigger there because of all other
devices which used the same channels.

But this doesn't explain the claim that r46436 introduced a problem which
allowed the devices to receive packets which were not for them. But I can
clearly see that they send packets which are not from them (00:aa:aa:aa:aa:02
sends packets from 00:aa:aa:aa:aa:01). But this is also the case for r46435.

I've configured two OM2P as explained below. The used versions are:

wireless node 1 (trelay):

config wifi-device  radio0
        option type     mac80211
        option channel  11
        option hwmode   11g
        option path     'pci0000:00/0000:00:00.0'
        option htmode   HT20
        option txpower 5

config wifi-iface
        option device   radio0
        option network  meshnet
        option mode     adhoc
        option ssid 'mesh'
        option encryption none
        option bssid '02:CA:FE:CA:CA:40'
        option macaddr '00:aa:aa:aa:aa:02'

wireless node 2 (receiver):

config wifi-device  radio0
        option type     mac80211
        option channel  11
        option hwmode   11g
        option path     'pci0000:00/0000:00:00.0'
        option htmode   HT20
        option txpower 5

config wifi-iface
        option device   radio0
        option network  meshnet
        option mode     adhoc
        option ssid 'mesh'
        option encryption none
        option bssid '02:CA:FE:CA:CA:40'
        option macaddr '00:aa:aa:aa:aa:99'

network node 1 (trelay):

config interface 'loopback'
        option ifname 'lo'
        option proto 'static'
        option ipaddr '127.0.0.1'
        option netmask '255.0.0.0'

config interface 'lan'
        option ifname 'eth0'
        option force_link '1'
        option proto 'static'
        option macaddr '00:aa:aa:aa:aa:99'

config interface 'meshnet'
        option ifname ''
        option force_link '1'
        option proto 'static'

config interface 'wan'
        option ifname 'eth1'
        option proto 'dhcp'

config interface 'wan6'
        option ifname 'eth1'
        option proto 'dhcpv6'

network node 2 (receiver):

config interface 'loopback'
        option ifname 'lo'
        option proto 'static'
        option ipaddr '127.0.0.1'
        option netmask '255.0.0.0'

config interface 'lan'
        option ifname 'eth0'
        option force_link '1'
        option proto 'static'

config interface 'meshnet'
        option ifname ''
        option force_link '1'
        option proto 'static'
        option ipaddr '192.168.2.3'
        option netmask '255.255.255.0'

config interface 'wan'
        option ifname 'eth1'
        option proto 'dhcp'

config interface 'wan6'
        option ifname 'eth1'
        option proto 'dhcpv6'

trelay was started manually on node 1 after boot:

echo "eth0-wlan0,eth0,wlan0" > /sys/kernel/debug/trelay/add
# must be removed before wlan0 is removed: echo 1 > /sys/kernel/debug/trelay/eth0-wlan0/remove

Later I dumped packets on both nodes using wlan0 and mon0:

# just to test if adhoc would allow to receive packets with different dest (no, didn't work. even with mon0 enabled): ifconfig wlan0 promisc

tcpdump -nxxxxxx -i wlan0
iw phy phy0 interface add mon0 type monitor
ifconfig mon0 up
tcpdump -nxxxxxx -i mon0

comment:6 Changed 2 years ago by anonymous

Just for reference: http://lxr.free-electrons.com/source/net/mac80211/rx.c#L3304

This is the code which should prevent the reception of frames addressed to other MACs. The IFF_PROMISC flag is not checked here, so the setup mentioned in the ticket description should not be possible at all. The only way to still get it to work is to manually patch the driver

--- a/net/mac80211/rx.c
+++ b/net/mac80211/rx.c
@@ -3306,7 +3306,7 @@ static bool ieee80211_accept_frame(struct ieee80211_rx_data *rx)
 		if (!ieee80211_bssid_match(bssid, sdata->u.ibss.bssid))
 			return false;
 		if (!multicast &&
-		    !ether_addr_equal(sdata->vif.addr, hdr->addr1))
+		    0)
 			return false;
 		if (!rx->sta) {
 			int rate_idx;

and to run on the trelay node in the middle

iw phy phy0 interface add mon0 type monitor
ifconfig mon0 up
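
To make the effect of the unpatched check easier to follow, here is a rough shell model of it. It is an illustration only, not the kernel code: accepts_frame is a hypothetical helper that mimics just the destination address comparison for unicast frames and ignores the BSSID match and everything else ieee80211_accept_frame() does.

```shell
# Hypothetical model of the IBSS destination check:
# a frame is accepted if its destination is a group address
# (multicast/broadcast) or equals the interface's own address.
# Usage: accepts_frame <own-mac> <dest-mac>   (exit status 0 = accepted)
accepts_frame() {
    own=$(printf '%s' "$1" | tr 'A-F' 'a-f')
    dest=$(printf '%s' "$2" | tr 'A-F' 'a-f')
    first_octet=${dest%%:*}
    # Group addresses have the lowest bit of the first octet set.
    if [ $((0x${first_octet} & 1)) -eq 1 ]; then
        return 0
    fi
    [ "$own" = "$dest" ]
}

# Reply from node 2 to the laptop, as seen by wlan0 on node 1:
#   accepts_frame 00:aa:aa:aa:aa:02 00:aa:aa:aa:aa:01   -> rejected
```

With the addresses from the ticket description, unicast frames addressed to the laptop (00:aa:aa:aa:aa:01) are rejected by wlan0 on node 1 (00:aa:aa:aa:aa:02), which is why the relay setup cannot work without the patch.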

I would therefore say that the test setup mentioned in the ticket description is not useful for reproducing the DUP problems seen during the battlemesh v8.

comment:7 Changed 2 years ago by nbd

  • Resolution set to worksforme
  • Status changed from new to closed
