Opened 7 years ago

Closed 7 years ago

Last modified 4 years ago

#9646 closed defect (fixed)

WZR-HP-G300NH: deauthenticated due to local deauth request

Reported by: p.f.j.geraedts@… Owned by: nbd
Priority: normal Milestone: Barrier Breaker 14.07
Component: base system Version: Trunk
Keywords: Cc:

Description

Specific test setup:

AP: OpenWrt r27153 / Buffalo WZR-HP-G300NH / Atheros AR9001-3NG;
STA: Ubuntu 11.04 / MSI Wind U100 / Realtek RTL8187SE.

This access point has been serving all my stations without problems for a while now, except for this specific setup. While previously using Backfire 10.03.1-rc4 I already encountered a freeze of the connection every few hours, with the characteristic 'deauthenticated due to local deauth request' messages popping up in the log. Afterwards I need to reestablish the connection manually. Recently I upgraded to trunk (r27153) and observed a significant increase in these connection drops: they now occur every few minutes. I'm not sure what is causing this increase.

During testing I ran a split-screen session from another (ethernet-connected) computer:

hostapd -dd -t /var/run/hostapd-phy0.conf 2>&1 | tee /mnt/sda1/tmp/hostapd.log | grep -v "Add randomness:"
tcpdump -ni wifi0.mon0 -s 3000 -w /mnt/sda1/tmp/hostapd.pcap

Some things that struck me as odd:

  • I added grep to get rid of the lack of entropy logging. The kernel seems to have a serious lack of entropy. (Should I file another ticket for this?);
  • The four-way handshake happens not once but twice for some reason;
  • The station only sends all-zero nonces;
  • The reaction of the station during the group key handshake seems to be the main problem.

Although various things seem to play a role in the observed behavior, after some more extensive testing I think I can at least distinguish 3 different reproducible situations: 1) a connection is established but not used, while no other stations are connected; 2) a connection is established and actively used, while no other stations are connected; 3) a connection is established and actively used, while other stations are connected.

I haven't observed a single connection drop in case of 1). In case of 2) the first group key handshake seems to drop the connection (this is the situation for which I attached the log files). In case of 3) the connection is typically dropped at a later group key handshake. During the failing group key handshake the station does not send EAPOL frames, but I have seen it send QoS packets. Could I be seeing artifacts from the QoS patch that resulted from ticket #8830?

Some remaining observations:

  • the reported behavior also occurs with this (ath5k) AP: OpenWrt r27153 / FON 2202 / Atheros AR2315;
  • the reported behavior does not occur with this AP: OpenWrt r27153 / Linksys WRT54GL / Broadcom 5352;
  • the stock Windows XP SP3 driver is working without problems.

Unrelated to all this: I'm planning to replace the Realtek RTL8187SE with an Atheros AR9287 (mostly to speed up the connection). I'm happy to do some further testing in the short term. Afterwards I could donate the mini-PCIe card for further testing.

I think that's about it. Let me finish by saying that I'm amazed by the high quality of the current state of the OpenWrt distribution. It really is an impressive piece of FOSS!

Thanks,

Paul

Attachments (3)

hostapd.log.tgz (227.5 KB) - added by p.f.j.geraedts@… 7 years ago.
hostapd.log.zip (3.7 KB) - added by Dennis Oberhoff <dennisoberhoff@…> 7 years ago.
Hostapd Logs
hostapd.log (124.3 KB) - added by AndreZ 6 years ago.
A new log.

Change History (61)

Changed 7 years ago by p.f.j.geraedts@…

comment:1 Changed 7 years ago by p.f.j.geraedts@…

P.S. Please email me to receive the hostapd.pcap file; it is too big to attach it here.

comment:2 Changed 7 years ago by jow

  • Owner changed from developers to nbd
  • Status changed from new to assigned

comment:3 in reply to: ↑ description Changed 7 years ago by anonymous

Replying to p.f.j.geraedts@…:

Some things that struck me as odd:

  • I added grep to get rid of the lack of entropy logging. The kernel seems to have a serious lack of entropy. (Should I file another ticket for this?);

I was concerned about lack of entropy as well (Increase entropy sources for ar71xx devices Ticket: #9631 )

Support for entropy generation in network drivers is apparently being deprecated (see the linux-netdev thread), though some Broadcom devices might still support IRQF_SAMPLE_RANDOM.

timer_entropyd might provide a workaround, though it uses about 10% CPU.

comment:4 Changed 7 years ago by nbd

Please send the capture file to nbd@…

Also, considering this weird client behavior I'd like to know what the Broadcom AP does differently, not triggering the disconnect issues. Did you figure out any differences there by capturing traffic?

Thanks

comment:5 Changed 7 years ago by anonymous

OK, time for an update. First of all: in my original post situation 1) should have read 'a connection is established but not used', i.e. regardless of whether other stations are connected. Anyhow, I think I have tried a bit too hard to find a (too) simple pattern behind the connection drops, while being a bit sloppy in the process.

That is, during initial testing I checked both the .log and the .pcap files to figure out a system. Later on I mostly only relied on the .log file as it seemed to contain all the information. This now turns out to be not true in general. More is going on..

@nbd: The specific .pcap file I planned to attach here (and will send you in a minute) is a good example of this. It seems that the AP does not send out EAPOL packets when the .log file says it initiates a group key handshake (or at least Wireshark's EAPOL filter doesn't catch them).

Furthermore, have you got any idea about these things:

  • More (all?) of my STAs only send all-zero nonces. Does hostapd hide them in the log or are they really zero? If so, is the security based only on the nonces produced by the AP?
  • What could be the reason of the second four-way handshake? In that handshake steps 3 and 4 are actually in reverse-order!? A really buggy STA driver or something?

Up to now I have not had a detailed look at the Linksys AP behavior, mainly because I seem unable to use its uSD slot when the wifi is up. I'll figure out a small-footprint way to get the data off the device and will perform some tests afterwards. (Maybe AoE can do the job if I think about it..) Will get back to you on this one..

Anyhow, from what I've seen you're a real expert on this stuff. So far I don't feel I have gotten very far with my testing, so please feel free to give me specific instructions to test for you. I think that's the most efficient way.

@anonymous: thanks for the info! I'm hesitant to believe that something like timer_entropyd can provide a good solution though. It seems to me that such a solution measures either the EMC-induced spread-spectrum behavior of the clocks involved or, at best, the thermal-runaway-induced long-term jitter of clocks; both can contain strong harmonic content. I simply don't understand why hardware manufacturers do not add a good analog hardware source of entropy to their products in general. I'm pretty sure the cost can be near-zero in chip area, power consumption and especially CPU usage.

Hostapd seems to take care of its own entropy, although it does not seem to feed it back to the kernel. I can't help worrying about the rest of user-land (e.g. OpenSSL)..

comment:6 Changed 7 years ago by nbd

From what I can see in the pcap file, EAPOL frames for GTK rekeying are sent (and acknowledged by the client). The reason why they are hard to spot is that GTK rekey related EAPOL frames are encrypted, so wireshark does not attempt to decode the data part of the frames when the header has the encryption flag.
So far I haven't seen the AP do anything wrong in that file...
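
To illustrate the point about the encryption flag: a rough sketch of how one could test for it in a raw 802.11 MAC header. It assumes the bytes start at the Frame Control field (no radiotap header in front); the helper name and the sample bytes are made up for illustration and are not taken from the capture.

```python
def is_protected_data_frame(frame: bytes) -> bool:
    """Return True if `frame` is an 802.11 data frame with the Protected bit set."""
    if len(frame) < 2:
        raise ValueError("frame too short to contain a Frame Control field")
    # Frame Control is a 16-bit little-endian field at the start of the MAC header.
    fc = int.from_bytes(frame[:2], "little")
    frame_type = (fc >> 2) & 0x3   # 0 = management, 1 = control, 2 = data
    protected = bool(fc & 0x4000)  # bit 14: Protected Frame
    return frame_type == 2 and protected


# Hypothetical sample headers (only the two Frame Control bytes are shown).
qos_data_protected = bytes([0x88, 0x42])  # QoS data, From DS, Protected
qos_data_plain = bytes([0x88, 0x02])      # QoS data, From DS, not protected
beacon = bytes([0x80, 0x00])              # management frame

print(is_protected_data_frame(qos_data_protected))
print(is_protected_data_frame(qos_data_plain))
print(is_protected_data_frame(beacon))
```

A frame for which this returns True carries an encrypted payload, so a dissector without the key cannot tell whether it contains an EAPOL-Key message.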

comment:7 Changed 7 years ago by nbd

please try copying http://nbd.name/760-eapol_qos_high_priority.patch to package/hostapd/patches and check if it improves stability.

comment:8 Changed 7 years ago by p.f.j.geraedts@…

@nbd: thanks for clearing that up! You're right, looking more carefully at the timestamps frames are sent/received at the time of rekeying. I'm glad the AP does function as it should there.

Could you also have a look at the failing reconnection attempts after the unsuccessful GTK rekeying? The STA does seem to reply, but the reply is encrypted at that time. Is that normal? (BTW, I forgot to send you the key last time.. sorry about that.. I'll send it to you in a minute. Feel free to use it whenever you see fit; it's only there to keep the neighbors off the internet.. :)

I hope to find some time this weekend to do some more in-depth testing with the Linksys router. Just odd that I haven't observed the disconnect issue with that one. (Really hope I tested for sufficiently long..)

Thanks for the patch. Applying it is next on my todo list.

comment:9 Changed 7 years ago by anonymous

I'm also affected by this problem with my WNDR3700 (27571). AFAIK your patch has already been committed to trunk and my version should already include it. This behaviour mostly happens with my iPhone 4.

Jul  9 10:35:38 WNDR3700 daemon.info hostapd: wlan0: STA xx:xx:xx:xx:xx:xx IEEE 802.11: deauthenticated due to local deauth request

comment:10 Changed 7 years ago by nbd

This message is useless without the context surrounding it. Please make a proper logfile from the output of hostapd -dd /var/run/hostapd-phy0.conf (after killing the previous hostapd instance).

comment:11 Changed 7 years ago by p.f.j.geraedts@…

Today I did some more testing on the Linksys device. Turns out it triggers the disconnect issue just like the other 2 devices. What seems to have happened is that the original testing took place before I observed that a certain amount of traffic is necessary to trigger the disconnect at GTK rekeying. (Actually, it now seems I only accidentally triggered it on the ath5k device.. : ) Anyhow, I'm glad that that's all sorted out now..

Tomorrow I'll find some time to apply the patch and test if it solves the problem with this specific STA. I'll start with the Linksys device.

Changed 7 years ago by Dennis Oberhoff <dennisoberhoff@…>

Hostapd Logs

comment:12 Changed 7 years ago by Dennis Oberhoff <dennisoberhoff@…>

5c:59:48:19:40:2c is the Device that keeps disconnecting.

comment:13 Changed 7 years ago by nbd

dennis, what kind of device is this, what openwrt version, with or without my patch?

comment:14 Changed 7 years ago by Dennis Oberhoff <dennisoberhoff@…>

Well, I was the anonymous poster some posts above. It's an iPhone 4. The router is a WNDR3700v1 with r27570. AFAIK it should already contain your patch?

comment:15 follow-up: Changed 7 years ago by p.f.j.geraedts@…

@Dennis: Have you tried the latest IOS 4.3.3? I don't get any disconnects with an iPad running 4.3.3. Just like the iPhone 4 it contains the BCM4329, although actually it seems to be some sort of dual-band variant of it.. (BCM4329 variants) Anyhow, it may be worth a try..

STA: IOS 4.3.3 / Apple iPad / Broadcom BCM4329

comment:16 in reply to: ↑ 15 Changed 7 years ago by Dennis

Replying to p.f.j.geraedts@…:

@Dennis: Have you tried the latest IOS 4.3.3? I don't get any disconnects with an iPad running 4.3.3. Just like the iPhone 4 it contains the BCM4329, although actually it seems to be some sort of dual-band variant of it.. (BCM4329 variants) Anyhow, it may be worth a try..

STA: IOS 4.3.3 / Apple iPad / Broadcom BCM4329

It's 4.3.3. This also happens with an iPad. I have also noticed stuttering while using AirPlay from my MacBook over the network. Furthermore, Time Machine seems to kill all connectivity from my MacBook.

comment:17 Changed 7 years ago by Paul Geraedts <p.f.j.geraedts@…>

First of all: sorry for the delay. I'm afraid the patch did not solve the connection drops.

I've recompiled OpenWrt r27153, but now including 760-eapol_qos_high_priority.patch and flashed the Linksys WRT54GL with it. No observable changes in hostapd behavior though..

comment:18 Changed 7 years ago by suspend@…

Hi,
please help me install this patch. I haven't done this before. I don't understand the manual at http://wiki.openwrt.org/doc/devel/patches and need step-by-step instructions.
Thanks

I have the same issues as described above. I have about 15 clients (Ovislink 5460), and some of them are sometimes logged as going down, while at other times they stay connected without problems. I found this discussion, https://forum.openwrt.org/viewtopic.php?id=28278, where the solution is to set a longer time with eapol_key_timeout_first, but when I tried this with the option max_listen_int (from the OpenWrt docs) I got the same 'deauthenticated due to local deauth request' problem. Does the patch above work?

Jul 12 17:58:48 OpenWrt daemon.info hostapd: wlan0: STA 00:4f:62:14:03:37 IEEE 802.11: authenticated
Jul 12 17:58:48 OpenWrt daemon.info hostapd: wlan0: STA 00:4f:62:14:03:37 IEEE 802.11: associated (aid 5)
Jul 12 17:58:53 OpenWrt daemon.info hostapd: wlan0: STA 00:4f:62:14:03:37 IEEE 802.11: deauthenticated due to local deauth request
Jul 12 17:59:06 OpenWrt daemon.info hostapd: wlan0: STA 00:4f:62:14:03:37 IEEE 802.11: authenticated
Jul 12 17:59:06 OpenWrt daemon.info hostapd: wlan0: STA 00:4f:62:14:03:37 IEEE 802.11: associated (aid 5)
Jul 12 17:59:06 OpenWrt daemon.info hostapd: wlan0: STA 00:4f:62:14:03:37 WPA: received EAPOL-Key 2/4 Pairwise with unexpected replay counter
Jul 12 17:59:09 OpenWrt daemon.info hostapd: wlan0: STA 00:4f:62:14:03:37 IEEE 802.11: deauthenticated due to local deauth request
Jul 12 17:59:10 OpenWrt daemon.info hostapd: wlan0: STA 00:4f:62:14:03:37 IEEE 802.11: authenticated
Jul 12 17:59:10 OpenWrt daemon.info hostapd: wlan0: STA 00:4f:62:14:03:37 IEEE 802.11: associated (aid 5)
Jul 12 17:59:11 OpenWrt daemon.info hostapd: wlan0: STA 00:4f:62:14:03:37 WPA: pairwise key handshake completed (WPA)
Jul 12 17:59:11 OpenWrt daemon.info hostapd: wlan0: STA 00:4f:62:14:03:37 WPA: group key handshake completed (WPA)
Jul 12 18:00:18 OpenWrt daemon.info hostapd: wlan0: STA 00:4f:62:14:03:37 IEEE 802.11: authenticated
Jul 12 18:00:18 OpenWrt daemon.info hostapd: wlan0: STA 00:4f:62:14:03:37 IEEE 802.11: associated (aid 5)
Jul 12 18:00:22 OpenWrt daemon.info hostapd: wlan0: STA 00:4f:62:14:03:37 IEEE 802.11: deauthenticated due to local deauth request
Jul 12 18:00:22 OpenWrt daemon.info hostapd: wlan0: STA 00:4f:62:14:03:37 IEEE 802.11: authenticated
Jul 12 18:00:29 OpenWrt daemon.info hostapd: wlan0: STA 00:4f:62:14:03:37 IEEE 802.11: associated (aid 5)
Jul 12 18:00:32 OpenWrt daemon.info hostapd: wlan0: STA 00:4f:62:14:03:37 IEEE 802.11: deauthenticated due to local deauth request
Jul 12 18:00:33 OpenWrt daemon.info hostapd: wlan0: STA 00:4f:62:14:03:37 IEEE 802.11: authenticated
Jul 12 18:00:33 OpenWrt daemon.info hostapd: wlan0: STA 00:4f:62:14:03:37 IEEE 802.11: associated (aid 5)
Jul 12 18:00:35 OpenWrt daemon.info hostapd: wlan0: STA 00:4f:62:14:03:37 WPA: received EAPOL-Key 2/4 Pairwise with unexpected replay counter
Jul 12 18:00:36 OpenWrt daemon.info hostapd: wlan0: STA 00:4f:62:14:03:37 WPA: received EAPOL-Key 2/4 Pairwise with unexpected replay counter
Jul 12 18:00:38 OpenWrt daemon.info hostapd: wlan0: STA 00:4f:62:14:03:37 IEEE 802.11: deauthenticated due to local deauth request
Jul 12 18:00:40 OpenWrt daemon.info hostapd: wlan0: STA 00:4f:62:14:03:37 IEEE 802.11: authenticated
Jul 12 18:00:40 OpenWrt daemon.info hostapd: wlan0: STA 00:4f:62:14:03:37 IEEE 802.11: associated (aid 5)
Jul 12 18:00:41 OpenWrt daemon.info hostapd: wlan0: STA 00:4f:62:14:03:37 WPA: pairwise key handshake completed (WPA)
Jul 12 18:00:41 OpenWrt daemon.info hostapd: wlan0: STA 00:4f:62:14:03:37 WPA: group key handshake completed (WPA)

comment:19 Changed 7 years ago by nbd

Please try r27607 or newer. I've received reports that it improves stability with intel clients a bit.

comment:20 Changed 7 years ago by sam.right@…

According to this link

http://translate.google.com/translate?u=http%3A%2F%2Fwww.openwrt.org.cn%2Fbbs%2Fviewthread.php%3Ftid%3D1937&sl=zh-CN&tl=en&hl=&ie=UTF-8

The deauthentication happens because the client does not support group rekeying properly, and it only affects specific wifi clients.

The resolution would be to add the following in /etc/config/wireless

option 'wpa_group_rekey' '0 ' 

One of my clients is a wgt634u with an old Atheros AR5212 chipset, and I have been getting deauthentication messages in the log all the time since I upgraded my router to Backfire RC5. After the change mentioned above, i.e. disabling WPA group rekeying, it has been all good for half an hour. Will leave it for a while and report back.

The interesting thing is that the poster in that link mentioned that the original TP-Link firmware also disables WPA group rekeying. If disabling wpa_group_rekey works, should it be the default?

comment:21 Changed 7 years ago by Paul Geraedts <p.f.j.geraedts@…>

@nbd: At the moment I'm only able to test with the Linksys WRT54GL, so I'm not able to test any ath9k related fixes. Anyhow, as far as I've observed the disconnect behavior seems driver independent. I've compiled & flashed OpenWrt r27687 on the Linksys device and the disconnect behavior of the Realtek station seems to be the same. (I've not observed any differences in behavior.)

@sam: That's interesting to know. If you ask me, that wireless option can only serve as a temporary workaround at best.

@suspend: Thanks for the info. Also interesting to know. My gut feeling is that increasing the (first) GTK rekey timeout could well turn out to be a (not too ugly) workaround. Something like increasing it from 100ms to 1s, like the later 3 tries. It should not affect other (properly coded) stations too much, I'd say. Very little time at the moment, but I hope to test this change soon.

BTW, max_listen_int seems to change different behavior than the aforementioned GTK rekey timeout. Not the option you're looking for, I think..

comment:22 Changed 7 years ago by nbd

please try copying http://nbd.name/770-group_key_timeout.patch to package/hostapd/patches and test with that. This sets the timeout for GTK rekey packets to 1s instead of 100ms, as suggested.
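
For a feel of what the change means in practice, here is a back-of-the-envelope sketch. It only uses the figures quoted in this thread (a 100ms first timeout followed by 3 tries at 1s); the function name and defaults are illustrative and not read from the hostapd source.

```python
def total_wait_ms(first_timeout_ms: int,
                  later_timeout_ms: int = 1000,
                  later_tries: int = 3) -> int:
    """Total time the AP waits for a reply before giving up on the station."""
    return first_timeout_ms + later_timeout_ms * later_tries


before = total_wait_ms(100)    # first timeout as described in this thread
after = total_wait_ms(1000)    # first timeout raised to 1s by the patch
print(before, after)           # 3100 4000
```

Under these assumptions the overall budget grows only from about 3.1s to 4s; the main effect is that a slow station gets ten times longer to answer the first message before a retransmission with a new replay counter goes out.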

comment:23 Changed 7 years ago by sam.right@…

Yesterday, over a 16-hour period (00:00 till 16:00), "deauthenticated due to local deauth request" appeared in the log 1162 times, and since I disabled the WPA group rekey (it's over 17 hours now) there has not been a single "deauthenticated due to local deauth request" entry in the log.

I agree that disabling WPA group rekey may only be a temporary workaround, but it does pinpoint where the problem may lie, and hopefully nbd can get it fixed.

My setup:

wifi client:

wgt634u running in client wds mode

Openwrt Backfire 10.03:
wpad-mini - 20100309-1
kmod-madwifi - 2.6.32.10+r3314-4
kernel - 2.6.32.10-1
busybox - 1.15.3-2

wifi AP:

TL-WR841ND running Openwrt Backfire 10.03.1-RC5, r27608

Here is the sample log when wpa group rekey is enabled

Jul 19 15:56:47 firewall hostapd: wlan0: STA [MAC ADDRESS OF THE CLIENT] IEEE 802.11: authenticated
Jul 19 15:56:48 firewall hostapd: wlan0: STA [MAC ADDRESS OF THE CLIENT] IEEE 802.11: associated (aid 1)
Jul 19 15:56:51 firewall hostapd: wlan0: STA [MAC ADDRESS OF THE CLIENT] IEEE 802.11: deauthenticated due to local deauth request
Jul 19 15:56:59 firewall hostapd: wlan0: STA [MAC ADDRESS OF THE CLIENT] IEEE 802.11: authenticated
Jul 19 15:57:00 firewall hostapd: wlan0: STA [MAC ADDRESS OF THE CLIENT] IEEE 802.11: associated (aid 1)
Jul 19 15:57:03 firewall hostapd: wlan0: STA [MAC ADDRESS OF THE CLIENT] IEEE 802.11: deauthenticated due to local deauth request
Jul 19 15:57:11 firewall hostapd: wlan0: STA [MAC ADDRESS OF THE CLIENT] IEEE 802.11: authenticated
Jul 19 15:57:12 firewall hostapd: wlan0: STA [MAC ADDRESS OF THE CLIENT] IEEE 802.11: associated (aid 1)
Jul 19 15:57:15 firewall hostapd: wlan0: STA [MAC ADDRESS OF THE CLIENT] IEEE 802.11: deauthenticated due to local deauth request
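
A count like the one above can be reproduced from a syslog dump with a few lines of script. This is only a sketch: it assumes log lines in the format shown in this ticket, and the function name is made up.

```python
import re
from collections import Counter

# Matches the station identifier in front of the deauth message.
PATTERN = re.compile(
    r"STA (.+?) IEEE 802\.11: deauthenticated due to local deauth request"
)


def count_local_deauths(lines):
    """Count 'local deauth request' events per station."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Three lines taken from a log posted earlier in this ticket.
sample = [
    "Jul 12 17:58:48 OpenWrt daemon.info hostapd: wlan0: STA 00:4f:62:14:03:37 IEEE 802.11: authenticated",
    "Jul 12 17:58:48 OpenWrt daemon.info hostapd: wlan0: STA 00:4f:62:14:03:37 IEEE 802.11: associated (aid 5)",
    "Jul 12 17:58:53 OpenWrt daemon.info hostapd: wlan0: STA 00:4f:62:14:03:37 IEEE 802.11: deauthenticated due to local deauth request",
]
print(count_local_deauths(sample))
```

To run it on a real log, pass an open file object instead of the sample list, e.g. `count_local_deauths(open("/var/log/messages"))`; the path depends on how logging is set up on the router.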

comment:24 Changed 7 years ago by nbd

it doesn't surprise me that your wgt634u is having trouble with the group rekey - i think there are some madwifi bugs that trigger this.
maybe you could try ath5k on the client side to see if it's more reliable, but only after upgrading to 10.03.1-rc5 there.

comment:25 Changed 7 years ago by sam.right@…

@nbd

Shouldn't hostapd be smart enough about WPA group rekeying? I.e. if it finds that a particular client is not able to handle group rekeying, then it's probably better to disable it for that particular client.

Will try the latest 10.03.1-rc5 with ath5k later this week.

comment:26 Changed 7 years ago by nbd

the point of wpa group rekey process is to allow the AP to change its group key.

the group key is global for the entire BSS, so skipping it for only one client would mean breaking multicast packet decryption for that client.

comment:27 Changed 7 years ago by AndreZ

I tried option 'wpa_group_rekey' '0 ' in my case (see /ticket/9561.html), but it didn't help.

Andre

comment:28 Changed 7 years ago by sam.right@…

@AndreZ

Did you restart the wifi (or reboot the router) after the changes?

comment:29 Changed 7 years ago by AndreZ

Yes, I did wifi down; wifi up.

Andre

comment:30 Changed 7 years ago by Paul Geraedts <p.f.j.geraedts@…>

@nbd: Thanks for the patch. Sorry, gut feeling was wrong. I've recompiled OpenWrt r27687, but now including 770-group_key_timeout.patch and flashed the Linksys WRT54GL with it. No observable changes in disconnect behavior..

Actually, I can't remember having issues with earlier Kamikaze releases. Maybe I should do something like a git-bisect? Any ideas on that?

comment:31 Changed 7 years ago by nbd

It would probably be useful to test hostapd and the driver separately. hostapd has fewer revisions, so you could just check the log there and try a few older revisions of it first, keeping the rest at the latest version

comment:32 Changed 7 years ago by Paul Geraedts <p.f.j.geraedts@…>

I think it's time for another update..

Some details about the test setup:

  • tests are carried out under loaded conditions (continuously streaming video);
  • added 'option wpa_group_rekey 60' to the 'wireless' UCI file to quickly test the equivalent behavior of several hours of normal operation.

First tested stable releases:

  • OpenWrt 10.03 up to and including 10.03.1-rc4: no disconnects*
  • OpenWrt 10.03.1-rc5: disconnects

* I have to mention here that I did not observe any disconnects this time, but I know there are still some (at least in rc4; see my initial post). Their rate of occurrence is *much* lower though..

In summary: the patch that's triggering the disconnects is somewhere in between r24045 (rc4) and r27608 (rc5). Or actually in between r24045 (rc4) and r27153 (see my initial post).

Then tested some combinations with latest trunk:

Conclusion: the disconnects are not triggered by hostapd, but by something else. Sane conclusion?

I'm now suspecting mac80211 as all my 3 APs show the same disconnect behavior and all have different mac80211-based drivers: b43, ath5k and ath9k.. I'm thinking about leaving the core at r24045 and apply a sort of binary search algorithm to the mac80211 patches in between r24045 and r27153. I would assume equal likelihood, but any additional information is welcome here..

Is this a sane test strategy, or should I not limit the search to mac80211?

Thanks, Paul
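
The binary search proposed above could be organised roughly as follows. This is a sketch under assumptions: the candidate list and the `is_bad` check are placeholders (in reality each check means building, flashing and watching for disconnects), and the revision used as the "first bad" one in the example is invented.

```python
def first_bad_revision(revisions, is_bad):
    """Return the first revision for which is_bad() is True.

    Assumes `revisions` is sorted, the first entry is known good and the
    last entry is known bad.
    """
    lo, hi = 0, len(revisions) - 1   # revisions[lo] good, revisions[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid
        else:
            lo = mid
    return revisions[hi]


# r24045 (good) and r27153 (bad) are from this ticket; the revisions in
# between and the threshold below are made up for the example.
candidates = [24045, 25000, 26000, 26500, 27153]
pretend_first_bad = 26000
print(first_bad_revision(candidates, lambda r: r >= pretend_first_bad))
```

With N candidate revisions this needs about log2(N) build-and-test cycles, which matters when every cycle includes a flash and a long test run.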

comment:33 Changed 7 years ago by nbd

I think that's a good test strategy. Once you've narrowed it down, I will do some more code review.

comment:34 Changed 7 years ago by Paul Geraedts <p.f.j.geraedts@…>

I've found three distinguishable behaviors:

  • before r26795: no disconnects (apart from the infrequent disconnects reported in my initial post);
  • between r26795 and X: disconnects after about the 5th GTK rekeying;
  • after X: disconnects at the first GTK rekeying;

with X somewhere in between r26795 and r27153. I propose to first focus on r26795 and #9327 though (read: I'm tired of testing for the moment : )

comment:35 Changed 7 years ago by nbd

Thanks, so we should probably also do a test run with the latest version and 420-mac80211_ignore_invalid_ccmp_rx_pn.patch removed.

comment:36 Changed 7 years ago by Paul Geraedts <p.f.j.geraedts@…>

Good point! (I've regained some sanity due to food and sleep.. : )

OpenWrt r27760, excluding 420-mac80211_ignore_invalid_ccmp_rx_pn.patch, does not disconnect after the equivalent of several hours of normal operation.

I only had a very brief look, but could the behavior I observed in rc4 (see my original post) and #9327 be somehow related?

comment:37 Changed 7 years ago by Paul Geraedts <p.f.j.geraedts@…>

@suspend: please first have a look at this wiki page on building OpenWrt.

You could have a look if you also get rid of the disconnects without the patch above.

After checking out the latest trunk with svn, remove the specific patch:

rm trunk/package/mac80211/patches/420-mac80211_ignore_invalid_ccmp_rx_pn.patch

Then build the thing with

make menuconfig (and choose your preferences)
make

In the bin directory you can now find the images to flash..

Good luck..

comment:38 Changed 7 years ago by nbd

Paul, thanks for your thorough testing - without it I would not have suspected this particular patch.

#9327 describes a platform specific issue on cns3xxx, for which this patch was added as a workaround. This issue was not observed on any other platform so far, and the underlying issue was probably fixed with the kernel update in trunk, so I will simply remove that patch in trunk and backfire.

comment:39 Changed 7 years ago by Paul Geraedts <p.f.j.geraedts@…>

You're welcome. I'll test for another week to see if all connection drops are now gone (also the 'rc4' ones). After that I will switch to the Atheros AR9287 card.

BTW, did you take into account that you updated the original 420 patch in r27552? Still OK to delete it?

comment:40 Changed 7 years ago by nbd

Yes, it's still OK to delete it, and in fact I've already deleted it in SVN today.

comment:41 Changed 7 years ago by anonymous

Performance is great now. It also fixed the problems with Time Machine backups killing the connectivity. Furthermore there seems to be a massive speedup. Great.

comment:42 Changed 7 years ago by nbd

  • Resolution set to fixed
  • Status changed from assigned to closed

comment:43 Changed 7 years ago by anonymous

I'm still seeing the same issue after installing trunk (r27787).

Andre

comment:44 Changed 7 years ago by nbd

Andre, what kind of client device are you using, and what's your wifi configuration like?

comment:45 Changed 7 years ago by Paul Geraedts <p.f.j.geraedts@…>

I think the origin of the specific behavior that you observed (and still observe) is significantly different from what is reported in this ticket. In other words: I think your original ticket (#9561) should not have been marked as duplicate..

comment:46 Changed 7 years ago by anonymous

I'm using a Vivotek wifi camera IP7134.

Andre

comment:47 Changed 7 years ago by AndreZ

The config is pretty standard:

config 'wifi-device' 'radio0'
    option 'type' 'mac80211'
    option 'channel' '11'
    option 'macaddr' '94:0c:6d:xx:xx:xx'
    list 'ht_capab' 'SHORT-GI-40'
    list 'ht_capab' 'DSSS_CCK-40'
    option 'txpower' '27'
    option 'country' 'US'
    option 'disabled' '0'
    option 'hwmode' '11ng'
    option 'htmode' 'HT20'

config 'wifi-iface'
    option 'device' 'radio0'
    option 'network' 'lan'
    option 'mode' 'ap'
    option 'ssid' 'blabla'
    option 'key' '*'
    option 'encryption' 'psk2'

comment:48 Changed 7 years ago by AndreZ

It's an Alpha Networks WMP-G11 Mini PCI board with a Ralink RT2561/RT5225 dual-band B/G chip.

Andre

comment:49 Changed 6 years ago by Paul Geraedts <p.f.j.geraedts@…>

As mentioned I would do some final testing with trunk excluding the 420 patch.

I found that nearly all connection drops are solved now. Only under heavy connection conditions I sometimes trigger a disconnect, i.e. it mostly happens when multiple stations connect / disconnect at the same time. Another interesting thing was that the MSI netbook (see original post for details) still triggered multiple 'WPA: received EAPOL-Key 2/2 Group with unexpected replay counter' log messages. This happened with the original Realtek card and later also with the replacement Atheros card. The interesting bit is that they were exclusively triggered while the netbook was on battery power!

This weekend I went back to the Realtek card to see if the updated mac80211 in r27958 has changed something. And yes it does: all these 'WPA: received EAPOL-Key 2/2 Group with unexpected replay counter' messages are out now! I'm left with a very infrequent 'WPA: received EAPOL-Key 2/4 Pairwise with unexpected replay counter'.

Possibly also interesting for #9899?

comment:50 Changed 6 years ago by AndreZ

I'm on r27963 and it's still the same. I cannot exclude patch 420 as I'm not building myself.

Andre

Changed 6 years ago by AndreZ

A new log.

comment:51 Changed 6 years ago by Paul Geraedts <p.f.j.geraedts@…>

OK, now it makes sense. The updated mac80211 has not fixed the 'WPA: received EAPOL-Key 2/2 Group with unexpected replay counter' behavior, but the patch in r27822 has (I missed that one being applied). So it's all about too small timeouts: stations simply do not get the time to properly reply. The same seems to apply to the 'WPA: received EAPOL-Key 2/4 Pairwise with unexpected replay counter' behavior.

@nbd: I think it is a good idea to also increase the first timeouts of the 4-way handshake from 100ms to 1000ms (the same as the later timeouts) and backport it, just like you did for the group rekeying. I'm not sure why these were set so tight in the first place in hostapd (hostile environments maybe?). IMO those false positive replay attack warnings are a bad thing though.

P.S. The part in my earlier post about 'stations only send all-zero nonces' is rubbish of course. The confusion was caused by hostapd logging the (nonexistent, so all-zero) nonce sent in the 4th message of the 4-way handshake (not sure why). The real SNonce sent in the 2nd message is non-zero..

comment:52 Changed 6 years ago by nbd

I'm pretty sure the initial 4-way handshake timeout needs to be at 100ms for standard compliance reasons.

comment:53 Changed 6 years ago by Paul Geraedts <p.f.j.geraedts@…>

Ok, I did not know that.

Another detailed look at a typical hostapd log increased my understanding a bit more. Loosening the tight 4-way handshake and group rekeying timeouts a bit would definitely help, but there is more to it: the 'unexpected replay counter' messages are strictly speaking just wrong.

When hostapd times out a message with replay counter X and asks for one with replay counter X+1, it should ignore any (late) message with replay counter X and wait for the one with replay counter X+1.

Currently hostapd accepts X and files X+1 as unexpected, which it isn't (hostapd just asked for it). Maybe it's implemented like this to speed things up, not sure. IMO speeding up the transaction *reliably* would require ignoring message X, but at the same time increasing the timeout a little (a timeout of as little as 200-300ms would make a big difference here I think).

comment:54 Changed 6 years ago by Paul Geraedts <p.f.j.geraedts@…>

Ok, I just found out that *every* message sent by hostapd contains updated replay counter data to synchronize with (not only the first one). So the other reliable approach would be to accept either X or X+1 *as long as the other is being discarded* (XOR). In that way all the initial timeouts (both 4-way and group) can be kept at 100ms while no packets will be received containing an 'unexpected replay counter'.

This seems to be the approach of the current implementation of hostapd, albeit that it currently receives both instead of receiving one and discarding the other..

N.B. In total 4 timeouts can occur, so the general description is an XOR of X, X+1, X+2 and X+3.
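
The 'accept one, discard the others' idea can be written down as a small toy model. To be clear, this is not hostapd code: the class and method names are invented, and it only models the bookkeeping of replay counters for a single exchange.

```python
class ReplayWindow:
    """Toy model: accept exactly one reply among all outstanding counters."""

    def __init__(self):
        self.outstanding = set()   # counters sent and not yet answered
        self.answered = False

    def sent(self, counter: int) -> None:
        """Record a (re)transmission carrying `counter`."""
        self.outstanding.add(counter)

    def received(self, counter: int) -> bool:
        """Return True if this reply completes the exchange.

        Any later reply (or one with an unknown counter) is quietly
        discarded rather than reported as an unexpected replay counter.
        """
        if self.answered or counter not in self.outstanding:
            return False
        self.answered = True
        return True


window = ReplayWindow()
window.sent(10)              # first message, counter X
window.sent(11)              # retransmission after the 100ms timeout, X+1
print(window.received(10))   # late reply to X arrives first: accepted
print(window.received(11))   # reply to X+1: discarded, exchange already done
```

With up to 4 transmissions per exchange the set simply holds X, X+1, X+2 and X+3, and whichever valid reply arrives first wins.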

comment:55 Changed 6 years ago by Paul Geraedts <p.f.j.geraedts@…>

I just posted a summary of these last remaining inconveniences on the hostapd mailing list:

http://lists.shmoo.com/pipermail/hostap/2011-August/023813.html

comment:56 Changed 6 years ago by Kyle

Is this issue fixed now in the latest trunk?

Kyle

comment:57 Changed 6 years ago by Kyle

I encounter this issue as follows:

Client: Win7 + Netgear RangeMax Dual Band Wireless-N USB Adapter
AP: Atheros AR9283 mini PCIe

With AES encryption I see ping loss after every WPA2-PSK group key update, but there is no ping loss with WPA2-PSK/TKIP on group key update. I also see "deauthenticated due to local deauth request" when I use a WinXP client with an internal Intel wifi adapter.

Hostapd config file:
ctrl_interface=/var/run/hostapd-ath200
driver=atheros
interface=athxxx
#ieee80211n=1
channel=1
wpa_passphrase=12345678
auth_algs=1
wpa=2
wpa_pairwise=CCMP
#wpa_pairwise=TKIP
wpa_group_rekey=0
ssid=xxxxxx
bridge=br0

Hostapd log
root@SUPERWIFI:~# tail -500 /var/log/messages | grep hostapd
Mar 26 18:54:07 hostapd: ath200: STA 00:22:3f:07:d8:d4 IEEE 802.11: associated
Mar 26 18:54:07 hostapd: ath200: STA 00:22:3f:07:d8:d4 RADIUS: starting accounting session 4F704A6F-00000000
Mar 26 18:54:07 hostapd: ath200: STA 00:22:3f:07:d8:d4 WPA: pairwise key handshake completed (RSN)
Mar 26 18:54:32 hostapd: ath200: STA 00:22:3f:07:d8:d4 WPA: group key handshake completed (RSN)
Mar 26 18:55:33 hostapd: ath200: STA 00:22:3f:07:d8:d4 WPA: group key handshake completed (RSN)
Mar 26 18:55:33 hostapd: ath200: STA 00:22:3f:07:d8:d4 WPA: received EAPOL-Key 2/2 Group with unexpected replay counter
Mar 26 18:55:33 hostapd: ath200: STA 00:22:3f:07:d8:d4 WPA: received EAPOL-Key 2/2 Group with unexpected replay counter
Mar 26 18:55:54 hostapd: ath200: STA 00:22:3f:07:d8:d4 IEEE 802.11: disassociated
Mar 26 18:55:58 hostapd: ath200: STA 00:22:3f:07:d8:d4 IEEE 802.11: associated
Mar 26 18:55:58 hostapd: ath200: STA 00:22:3f:07:d8:d4 RADIUS: starting accounting session 4F704A6F-00000001
Mar 26 18:55:58 hostapd: ath200: STA 00:22:3f:07:d8:d4 WPA: pairwise key handshake completed (RSN)
Mar 26 18:56:31 hostapd: ath200: STA 00:22:3f:07:d8:d4 WPA: group key handshake completed (RSN)
Mar 26 18:56:31 hostapd: ath200: STA 00:22:3f:07:d8:d4 WPA: received EAPOL-Key 2/2 Group with unexpected replay counter
Mar 26 18:57:31 hostapd: ath200: STA 00:22:3f:07:d8:d4 WPA: group key handshake completed (RSN)
Mar 26 18:58:31 hostapd: ath200: STA 00:22:3f:07:d8:d4 WPA: group key handshake completed (RSN)
Mar 26 18:59:32 hostapd: ath200: STA 00:22:3f:07:d8:d4 WPA: group key handshake completed (RSN)
Mar 26 19:00:32 hostapd: ath200: STA 00:22:3f:07:d8:d4 WPA: group key handshake completed (RSN)
Mar 26 19:00:32 hostapd: ath200: STA 00:22:3f:07:d8:d4 WPA: received EAPOL-Key 2/2 Group with unexpected replay counter
Mar 26 19:01:31 hostapd: ath200: STA 00:22:3f:07:d8:d4 WPA: group key handshake completed (RSN)
Mar 26 19:02:31 hostapd: ath200: STA 00:22:3f:07:d8:d4 WPA: group key handshake completed (RSN)
Mar 26 19:03:32 hostapd: ath200: STA 00:22:3f:07:d8:d4 WPA: group key handshake completed (RSN)
Mar 26 19:03:32 hostapd: ath200: STA 00:22:3f:07:d8:d4 WPA: received EAPOL-Key 2/2 Group with unexpected replay counter
Mar 26 19:04:32 hostapd: ath200: STA 00:22:3f:07:d8:d4 WPA: group key handshake completed (RSN)
Mar 26 19:05:33 hostapd: ath200: STA 00:22:3f:07:d8:d4 WPA: group key handshake completed (RSN)
Mar 26 19:05:33 hostapd: ath200: STA 00:22:3f:07:d8:d4 WPA: received EAPOL-Key 2/2 Group with unexpected replay counter
Mar 26 19:05:33 hostapd: ath200: STA 00:22:3f:07:d8:d4 WPA: received EAPOL-Key 2/2 Group with unexpected replay counter
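
To see how often the 'unexpected replay counter' message accompanies a group rekey, a log like the one above can be tallied with a small script. This is only a sketch: the regular expression assumes the syslog line layout shown in this log, and the function name `tally` and the counter keys are made up for the example.

```python
import re
from collections import Counter

# Assumes lines shaped like: "<date> hostapd: <iface>: STA <mac> <module>: <message>"
LINE_RE = re.compile(
    r"hostapd: \S+ STA (?P<sta>[0-9a-f:]{17}) "
    r"(?P<module>WPA|IEEE 802\.11|RADIUS): (?P<msg>.*)$"
)


def tally(lines):
    """Count a few interesting hostapd events in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line.rstrip("\n"))
        if match is None:
            continue  # not a per-station hostapd line
        msg = match.group("msg")
        if "unexpected replay counter" in msg:
            counts["unexpected_replay"] += 1
        elif msg.startswith("group key handshake completed"):
            counts["group_rekey"] += 1
        elif msg == "disassociated":
            counts["disassociated"] += 1
    return counts


sample = [
    "Mar 26 18:55:33 hostapd: ath200: STA 00:22:3f:07:d8:d4 WPA: group key handshake completed (RSN)",
    "Mar 26 18:55:33 hostapd: ath200: STA 00:22:3f:07:d8:d4 WPA: received EAPOL-Key 2/2 Group with unexpected replay counter",
    "Mar 26 18:55:33 hostapd: ath200: STA 00:22:3f:07:d8:d4 WPA: received EAPOL-Key 2/2 Group with unexpected replay counter",
    "Mar 26 18:55:54 hostapd: ath200: STA 00:22:3f:07:d8:d4 IEEE 802.11: disassociated",
]
counts = tally(sample)
```

For a real log, `tally(open("/var/log/messages"))` would work the same way, since the function only needs an iterable of lines.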

comment:58 Changed 4 years ago by jow

  • Milestone changed from Attitude Adjustment 12.09 to Barrier Breaker 14.07

Milestone Attitude Adjustment 12.09 deleted
