Problem: Macintosh computers running Mac OS X 10.9.x (Mavericks) seem to have random or intermittent network drops on the Rice campus network.

Most Likely Cause: Apple's implementation of ARP Unicast Caching and Cisco's implementation of GLBP (Gateway Load Balancing Protocol).

NOTE: There are also reports of Fall 2013 MacBook Pro 13" models having wireless problems separate from this issue. Another solution for that particular problem may help if the GLBP patch listed here does not solve your problem.

DESCRIPTION: GLBP is an enabled feature on the Rice network infrastructure. Mavericks discovers and caches the actual gateway used in the route instead of the virtual gateway presented by Cisco's network gear. When that cached entry stops being valid, the client keeps retrying it (5 attempts by default) and connectivity stalls until the attempts time out. Apple regards this "problem" as the proper implementation of the standard. Regardless of which vendor bears responsibility, the problem remains and is quite pronounced. Waiting for this time-out manifests itself as extremely poor network performance and consistency on both wired and wireless connections on the Rice network.

SOLUTION:

DETAILS: Cisco regards this as a client issue, though Apple's configuration is within the defined standard (see section 2.3.2.1 of RFC1122). Apple, however, seems to be alone in its observance of the standard. The solution is to manually set the kernel state to time-out after 0 attempts (i.e., in practical terms, disabling the unicast ARP caching). The default value is 5 attempts.

    sudo sysctl -w net.link.ether.inet.arp_unicast_lim=0

The command to set the kernel state does not survive a reboot, so the setting must also be written to a configuration file in the underbelly of the OS. The file takes the bare key=value setting, and it is written through "sudo tee" because with "sudo echo ... > file" the redirect runs without root privileges and fails.

    echo "net.link.ether.inet.arp_unicast_lim=0" | sudo tee -a /etc/sysctl.conf

It is also recommended to set the owner and group to root and wheel with proper security permissions on the file.

    sudo chown root:wheel /etc/sysctl.conf
    sudo chmod 644 /etc/sysctl.conf

To restore the default settings, remove the /etc/sysctl.conf file and re-issue the command to set the kernel state back to time-out after 5 attempts:

    sudo sysctl -w net.link.ether.inet.arp_unicast_lim=5

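The manual steps above append to, and later delete, the whole /etc/sysctl.conf file. On a machine that already carries other sysctl settings, it may be safer to touch only the one line. The following is a minimal sketch of that idea, not part of the Rice patch: the function names (filter_out_key, set_sysctl_conf, unset_sysctl_conf) are hypothetical, and it assumes the file holds plain key=value lines.

```shell
#!/bin/sh
# Hypothetical helpers (not from the original article) for editing a
# sysctl.conf-style file of key=value lines without clobbering other settings.

# Print every line of FILE except assignments to KEY (exact name match).
filter_out_key() {
    awk -F= -v k="$2" '{
        name = $1
        gsub(/[ \t]/, "", name)    # ignore spaces/tabs around the key name
        if (name != k) print
    }' "$1"
}

# Ensure FILE ends up with exactly one "KEY=VALUE" line.
set_sysctl_conf() {
    conf_file=$1; key=$2; value=$3
    tmp_file=$(mktemp) || return 1
    if [ -f "$conf_file" ]; then
        filter_out_key "$conf_file" "$key" > "$tmp_file"
    fi
    printf '%s=%s\n' "$key" "$value" >> "$tmp_file"
    # Write through cat so an existing file keeps its owner and mode.
    cat "$tmp_file" > "$conf_file"
    rm -f "$tmp_file"
}

# Remove KEY from FILE, leaving every other line in place.
unset_sysctl_conf() {
    conf_file=$1; key=$2
    [ -f "$conf_file" ] || return 0
    tmp_file=$(mktemp) || return 1
    filter_out_key "$conf_file" "$key" > "$tmp_file"
    cat "$tmp_file" > "$conf_file"
    rm -f "$tmp_file"
}
```

Running set_sysctl_conf twice with the same arguments leaves a single line for the key, so the change is safe to repeat. Writing to /etc/sysctl.conf itself would still need root privileges, so it is worth trying the functions on a scratch copy of the file first.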
QUICK SOLUTION

Because this is a bit esoteric for the average user, here is a script that will automate the process.

How to apply:

1) Download the attached file: arp_glbp_patch.2.0.command.zip
2) Double-click the .ZIP file to unpack the payload (if your browser did not do so automatically).
3) Control+Click (or right click) the file called arp_glbp_patch.2.0.command
4) Select "OPEN" from the pop-up contextual menu.
5) Because the code is not signed by an Apple developer (as the default security setting requires), you'll be prompted with a security warning. Select "OPEN" at this time.
6) This will open the Terminal.app and execute the script.
7) At the prompt, enter your local password (i.e., whichever password you use to log into the computer or unlock the screensaver). As long as your account is set as a local administrator, this should proceed. If not, contact the helpdesk for assistance.

This script applies the commands listed in the details above to permanently disable ARP unicast caching. If the patch is already enabled (the /etc/sysctl.conf file is present), it will ask if you'd like to remove it and reset the system to defaults.

Here are the general actions of the script:

- Check for the proper permissions to run as "super user" (su)
- Confirm the operating system as being "Mavericks" (10.9)
- Check to see if the patch has already been installed or configured
- IF the patch is already present, the script will present an option to remove the patch and return the system to defaults
- IF the patch for the kernel state to disable the caching is not present, the script will automate the changes needed.

Any actions taken by the script are enumerated in the Terminal.

The script is as follows:

    #!/bin/sh
    # ARP Patch

    # Is this user a local admin on the Mac?
    # Checking for membership in group admin for sudo...
    GROUP=$(dsmemberutil checkmembership -U "$USER" -G "admin")
    case $GROUP in
        *not*)
            echo "You must be a local administrator to run this script."
            exit 1
            ;;
        *)
            echo "Please enter your local computer account password..."
            ;;
    esac

    # Only act on Mavericks (10.9.x).
    if sw_vers -productVersion | grep -q '^10\.9'
    then
        if [ -f /etc/sysctl.conf ]
        then
            if grep 'unicast' /etc/sysctl.conf > /dev/null 2>&1
            then
                echo "PATCH WAS PREVIOUSLY ENABLED"
                echo "Would you like to remove the patch and restore defaults? [ Y or N ]"
                read REMOVE
                # Accept an upper- or lower-case answer, as the prompt suggests.
                case $REMOVE in
                    [Yy]*)
                        sudo sysctl -w net.link.ether.inet.arp_unicast_lim=5 > /dev/null 2>&1
                        sudo rm /etc/sysctl.conf > /dev/null 2>&1
                        echo "Patch removed. Default settings restored."
                        exit 0
                        ;;
                    *)
                        echo "Exiting..."
                        exit 0
                        ;;
                esac
            fi
        fi
        # Apply the setting now, then persist it across reboots.
        sudo sysctl -w net.link.ether.inet.arp_unicast_lim=0 > /dev/null 2>&1
        echo "net.link.ether.inet.arp_unicast_lim=0" | sudo tee -a /etc/sysctl.conf > /dev/null 2>&1
        sudo chown root:wheel /etc/sysctl.conf
        sudo chmod 644 /etc/sysctl.conf
        echo "PATCH ENABLED"
    fi
    exit 0

-------- Original message --------
From: "Ashfield, Matt (NBCC)" <[email protected]>
Date: 25/09/2014 17:34 (GMT-06:00)
To: [email protected]
Subject: Re: [WIRELESS-LAN] Apple devices dropping on WPA2-PSK and WPA2-Ent SSIDs Aruba 6.3

ARP cache bug? Will have to dig into that one.

Jeff: if you've turned off band steering, have you done any other configuring to push devices to 5 GHz? What about CCKM? Not sure if Macs would play well with that either.

Sent from my BlackBerry 10 smartphone on the Bell network.

From: Danny Eaton
Sent: Thursday, September 25, 2014 7:25 PM
To: [email protected]
Reply To: The EDUCAUSE Wireless Issues Constituent Group Listserv
Subject: Re: [WIRELESS-LAN] Apple devices dropping on WPA2-PSK and WPA2-Ent SSIDs Aruba 6.3

We saw a lot of the same. The ARP cache bug (since we run GLBP on the gateways) has killed us too.

-------- Original message --------
From: Jeffrey Sessler
Date: 25/09/2014 16:40 (GMT-06:00)
To: [email protected]
Subject: Re: [WIRELESS-LAN] Apple devices dropping on WPA2-PSK and WPA2-Ent SSIDs Aruba 6.3

We noticed that our WLAN with band/load-steering enabled had a high report rate of Macintosh connectivity issues, and the WLAN that did not was trouble free. I suspect what was happening was this: a Mac would initially associate (Ent-WPA2), then the controller would force it to move to another band and/or AP. It's at this point (a roam) that the Apple certificate issue would kick in, and it was hit or miss as to the Mac re-associating or failing. This was especially problematic when a Mac client was equidistant from two APs.

Turning off band/load steering pretty much eliminated the bulk of the connectivity issues, and trusting the certificate solved the rest. Band/load steering is just problematic because you can never predict how a client will react to it.

Jeff

>>> On Wednesday, September 24, 2014 at 5:07 PM, in message
>>> <9b14e007db035b49b466f094e5a6ed3649346...@mailmb04.ad.adelaide.edu.au>,
>>> Jason Cook <[email protected]> wrote:

Cisco here, but we have had plenty of issues with Mac OS. Spent some time with TAC recently seeing what we can do about it, with no real fix. Our EAP timers had gotten a bit out of whack, and adjusting them made improvements for some clients, but ultimately OSX clients just don’t seem to like roaming.

Though we have seen rather large differences between devices. A 2014 MacBook Pro and an Air, both running 10.9.4, both with the same model Broadcom card, had different results. The Air continues to lose connectivity for 10+ seconds, sometimes requiring intervention to get it back, while the Pro was typically 4 seconds or less. Sometimes the Air is authenticating, other times it’s waiting for DHCP... or both.

For a stationary client, we have seen this issue occur when a client sits between 2 APs and gets a pretty similar signal from both.
As signal fluctuates, the client jumps AP and the above happens. Note I don’t see “Ptk Challenge Failed” in our logs.

--
Jason Cook
The University of Adelaide, AUSTRALIA 5005
Ph : +61 8 8313 4800
e-mail: [email protected]

From: The EDUCAUSE Wireless Issues Constituent Group Listserv [mailto:[email protected]] On Behalf Of Derek Johnson
Sent: Thursday, 25 September 2014 1:53 AM
To: [email protected]
Subject: Re: [WIRELESS-LAN] Apple devices dropping on WPA2-PSK and WPA2-Ent SSIDs Aruba 6.3

Likewise, I see the same "Ptk Challenge Failed" errors show up in logs. Sometimes I've seen it when a client's having temporary issues, other times I'll see it when a client is roaming rapidly. As an example, when someone is walking across campus with a smartphone in their pocket (which never happens..... cough) and it's trying to connect to APs as it moves along. It may move out of range of the AP before the key exchange completes, and I'll see this error.

When I spoke with Aruba support about these issues, they didn't seem concerned, though I never could get a straight answer why it would happen with a stationary client. I'd be very interested to hear what you learn about it. :)

FWIW, I'm running AOS 6.3.1.11 with AP-225s here. OKC disabled, PMKID enabled.

Derek Johnson | Data Communications Coordinator
FORT HAYS STATE UNIVERSITY
415 Lyman Dr. TH 101, Hays, KS 67601
(785) 628 - 5688 | [email protected]

From: "Wang, Yu" <[email protected]>
To: [email protected]
Date: 09/24/2014 10:19 AM
Subject: Re: [WIRELESS-LAN] Apple devices dropping on WPA2-PSK and WPA2-Ent SSIDs Aruba 6.3
Sent by: The EDUCAUSE Wireless Issues Constituent Group Listserv <[email protected]>
________________________________

I echo what Ryan described here.
Ryan alerted me to this issue, and after changing the user logging level to notification on our Aruba controllers, we got quite a number of “Ptk Challenge Failed” messages in our logs. We have both OKC and Validate PMKID enabled and have not changed any of the settings, as I saw Aruba engineers give conflicting statements.

Yu Wang
____________________________
Network Architect
Information Technology Services
The Florida State University
850-645-6810
[email protected]

From: The EDUCAUSE Wireless Issues Constituent Group Listserv [mailto:[email protected]] On Behalf Of Turner, Ryan H
Sent: Wednesday, September 24, 2014 10:29 AM
To: [email protected]
Subject: [WIRELESS-LAN] Apple devices dropping on WPA2-PSK and WPA2-Ent SSIDs Aruba 6.3

We’ve had complaints for a while that would come in sporadically, but didn’t pay them much mind as the problem was always difficult to reproduce. The complaint was with Apple devices (normally OSX) that would just drop connectivity and then reestablish moments later. People would complain that our secure SSID (our primary EAP-TLS WPA2-Ent SSID) was not stable. It was always from Apple users. Recently, however, one of our employees with an Apple running OSX (Yosemite) started to have the problem routinely on our PSK SSID. When I turned on debugging in the logs, the following message was logged every time he dropped:

Sep 5 10:53:48 :501105: <NOTI> |AP [email protected] stm| Deauth from sta: 48:d7:05:bf:28:e5: AP 172.28.65..99-00:1a:1e:52:dd:51-RB_House_016 Reason Ptk Challenge Failed

When I googled “Ptk Challenge Failed”, it turned up an Airheads forum thread that said that since OSX devices don’t support Opportunistic Key Caching, having this enabled on your controllers could cause drops on these devices when they roam from AP to AP. We disabled it on both our UNC-Secure and UNC-PSK SSIDs, and yet the user is still having disconnects, and we still see this message when his device drops.
We actually see a LOT of these messages in the logs now that I have turned on the proper notification logging, indicating that this error message is either a red herring, or a lot more prevalent in our environment than we had hoped.

I plan on opening a case with Aruba, but before I beat my head against a wall for the next couple of hours with a support engineer, have any of you seen this problem and tackled it?

Ryan H Turner
Senior Network Engineer
The University of North Carolina at Chapel Hill
CB 1150 Chapel Hill, NC 27599
+1 919 445 0113 Office
+1 919 274 7926 Mobile

**********
Participation and subscription information for this EDUCAUSE Constituent Group discussion list can be found at http://www.educause.edu/groups/.
