For what it is worth, I had a TAC case open around a year ago for issues with 
Cisco wifi phones.  Some of the issues we were having were attributed to delays 
at the AP caused by security features on the switch port (port security, DHCP 
Snooping, DAI, IP Verify Source) – Cisco 3560 switches.  We didn’t test each 
feature independently as we had no issues with disabling them.  We were told 
that each AP switch port would optimally have the following configuration:

interface GigabitEthernet0/0/0
 switchport access vlan XXX
 switchport mode access
 mls qos trust dscp
 spanning-tree portfast

I’m thinking Jeff may be on the right track.  I wonder if a delay caused by 
security features might be causing the AP to assume an outage occurred and a 
new DHCP lease is needed…


Jeremy Brake
Network Services Analyst, Information Technology
Angelo State University
Member, Texas Tech University System
ASU Station #11020
San Angelo, TX 76909-1020
Phone: (325) 942-2333 Fax: (325) 942-2109




From: The EDUCAUSE Wireless Issues Constituent Group Listserv 
[mailto:[email protected]] On Behalf Of Dan Brisson
Sent: Monday, February 06, 2012 3:01 PM
To: [email protected]
Subject: Re: [WIRELESS-LAN] Cisco APs losing CAPWAP session

Jeff,

Thanks for the email.  I have an email in to our DHCP folks to get them to 
extend the lease time to 24 hours; it's currently set to 8.  That being said, I 
looked in our DHCP server log and found that for an AP that lost its CAPWAP 
session, right at the time it was lost, the DHCP log shows 3 DHCP Release 
messages from the AP, then 2 seconds later a DHCP Discover message, and 
then 1 second later the DHCP Offer, Request, and ACK.  So the span of time from 
the last DHCP Release until the DHCP ACK is 3 seconds according to the DHCP log.
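
As a quick sanity check on that arithmetic, here is a minimal Python sketch of the sequence described above. The base timestamp and the event names are made up for illustration (the actual log entries are not reproduced here); only the relative offsets come from the description.

```python
from datetime import datetime, timedelta

# Hypothetical base time; only the relative offsets come from the description.
last_release = datetime(2012, 2, 6, 12, 0, 0)

events = [
    ("DHCPRELEASE", last_release),
    ("DHCPDISCOVER", last_release + timedelta(seconds=2)),
    ("DHCPOFFER", last_release + timedelta(seconds=3)),
    ("DHCPREQUEST", last_release + timedelta(seconds=3)),
    ("DHCPACK", last_release + timedelta(seconds=3)),
]


def span_seconds(events, start_type, end_type):
    """Seconds from the last start_type event to the first end_type event after it."""
    start = max(ts for kind, ts in events if kind == start_type)
    end = min(ts for kind, ts in events if kind == end_type and ts >= start)
    return (end - start).total_seconds()


print(span_seconds(events, "DHCPRELEASE", "DHCPACK"))
```

The same helper could be pointed at real parsed log entries to compare the release-to-ACK gap across several affected APs.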

The key here seems to be that the AP, for some reason, felt it had to release 
its IP and then go through the entire DHCP process again.  I can understand 
that throwing the CAPWAP session for a loop.  TAC has said that this is most 
likely a consequence of some sort of network disruption, rather than the actual 
cause.

All that being said, I would think the DHCP snooping database would show it 
just being renewed, but I'll check on the next one.

We do not write our DHCP snooping database to the local flash or an FTP server.

Thanks,
-dan




Dan Brisson
Network Engineer
University of Vermont
(Ph) 802.656.8111

On 2/6/2012 2:46 PM, Jeffrey Sessler wrote:
Dan,

If you extend the DHCP lease duration of the APs and re-enable DAI and IP SV, 
what happens to the interval between lost CAPWAP sessions? Perhaps at the DHCP 
renewal time, DAI and IP SV introduce enough of a delay to cause a problem 
with CAPWAP. With the students gone and AP activity low, CAPWAP 
is OK, but under load/activity, CAPWAP is catching it.

When an AP does the CAPWAP "dance" what does the lease time look like in the 
snooping database? Does it look like it was just renewed?

Last but not least, are you writing your database to the local flash of the 
switch or to an FTP server? Do writes of the database correspond with the CAPWAP 
loss? What does the switch CPU look like at the time, i.e., do you have ACLs 
running that are pushing work out of the ASICs and onto the CPU?
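
One way to check whether database writes line up with CAPWAP losses is to compare the two sets of timestamps. The following is only a rough Python sketch: the function name, the 30-second window, and all timestamps are assumptions for illustration, and real values would have to be pulled from the switch and controller logs.

```python
from datetime import datetime, timedelta


def losses_near_writes(loss_times, write_times, window=timedelta(seconds=30)):
    """Return the CAPWAP loss times that fall within `window` of any database write."""
    return [
        loss
        for loss in loss_times
        if any(abs(loss - write) <= window for write in write_times)
    ]


# Made-up timestamps purely for illustration.
writes = [datetime(2012, 2, 6, 10, 0, 0), datetime(2012, 2, 6, 11, 0, 0)]
losses = [datetime(2012, 2, 6, 10, 0, 12), datetime(2012, 2, 6, 10, 37, 5)]

print(losses_near_writes(losses, writes))
```

If most losses land inside the window, the writes deserve a closer look; if few or none do, they are probably unrelated.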

Jeff

>>> On Monday, February 06, 2012 at 6:47 AM, Dan Brisson wrote:
For those following this thread, while we still haven't determined the exact 
cause of this problem, we also have not had a drop from an AP where we turned 
off DAI and IP source verify.  Seems logical that one (or both) of those could 
be causing problems, although what is not entirely logical is how that would be 
related to load/activity, since the APs never drop when the students are gone 
on break.

TAC has been very responsive and at this point I've asked if we can get someone 
from the Switching side to look at the possibility that DAI and/or IP Source 
verify could be the cause.

-dan





On 2/1/2012 2:38 PM, Dan Brisson wrote:
Ah right, yes, 'mls qos' is NOT configured on any of the 3560X switches.

We used DAI and DHCP snooping b/c the majority of APs are actually in student 
rooms due to, well, no other place really to put them.  :)  Now that you've 
brought that up, though, we had to go in the ceiling in one of our newer, 
bigger complexes.  I'm going to turn off DAI and IP Source verify there and see 
if the drops stop.

Will let folks know what I find.

Thanks!
-dan





On 2/1/2012 12:41 PM, Garry Peirce wrote:
Dan,
A small but important point to verify re: QoS.
By ‘no QoS in place’, do you mean that the global ‘mls qos’ is NOT configured on 
the ResHall switches, or that no specific QoS config has been applied?
E.g., if the global ‘mls qos’ is enabled, all ports (including AP ports) would be 
untrusted by default, with all packet markings re-marked to 0.
Also, any QoS/service policies on the relevant router’s interfaces?
Given the L2 functions you mention are unique to the ResHall side, I’d disable 
them on the ports used by the APs.
I wouldn’t expect these L2 security functions to be needed on known AP ports 
and removing them might provide further insight on the issue.
Could DAI be disabling the AP ports due to the ARP pps threshold (odd)? Is errdisable 
auto-recovery enabled for DAI?
Any log data from the switch of an affected AP?
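
To make the QoS point above concrete, here is a small Python sketch of the trust behavior as described. It is a simplified model, not switch code: the function and parameter names are hypothetical, and it ignores policy maps, CoS trust, and DSCP mutation maps. The pass-through case for a switch without global 'mls qos' reflects my understanding of the platform default.

```python
def dscp_after_ingress(mls_qos_enabled: bool, port_trusts_dscp: bool, dscp_in: int) -> int:
    """Simplified model of the DSCP value a packet carries after switch ingress."""
    if not mls_qos_enabled:
        # QoS globally disabled: markings pass through unchanged.
        return dscp_in
    if port_trusts_dscp:
        # 'mls qos trust dscp' on the port: the incoming marking is kept.
        return dscp_in
    # QoS enabled, port left at the untrusted default: re-marked to 0.
    return 0


# Example: a voice packet marked EF (DSCP 46) arriving on an AP port.
for enabled, trusted in [(False, False), (True, False), (True, True)]:
    print(enabled, trusted, dscp_after_ingress(enabled, trusted, 46))
```

The middle case is the one to watch for: enabling 'mls qos' globally without adding a trust statement on the AP ports would silently zero the markings.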

From: The EDUCAUSE Wireless Issues Constituent Group Listserv 
[mailto:[email protected]] On Behalf Of Dan Brisson
Sent: Wednesday, February 01, 2012 11:35 AM
To: [email protected]
Subject: Re: [WIRELESS-LAN] Cisco APs losing CAPWAP session

OK, thanks for validating.  It also seemed a bit strange to me and yes, I 
checked a bunch of APs that haven't dropped recently and they all showed 10-12ms.

One thing that occurred to me is we are doing DHCP snooping and Dynamic ARP 
Inspection on the 3560Xs.  That is unique to this part of campus as we haven't 
yet rolled that out to the entire admin side.

Thanks,
-dan






On 2/1/2012 10:23 AM, Mike King wrote:
On Wed, Feb 1, 2012 at 10:03 AM, Dan Brisson wrote:

It's been a while since I've read these, but if I interpret this correctly, it 
took 1m 15s for the AP to associate to the controller.  That's a very long 
time.  Mine usually associate in the 10-20 second range.
Not sure what that indicates, but it's an anomaly from my point of view.
I don't think it's a rouge herring.  (See Backhoe joke from earlier in the 
thread)
Mike
********** Participation and subscription information for this EDUCAUSE 
Constituent Group discussion list can be found at 
http://www.educause.edu/groups/.
