On 5/24/06, Peter Memishian <[EMAIL PROTECTED]> wrote:
>
>  > I find that when I do cable pulls (pull bge0, replace bge0, pull bge1,
>  > replace bge1) that a zone IP address has been transitioned to a test
>  > address.  Yuck!
>
> A test address?  It actually ends up with IFF_NOFAILOVER set?

I misstated the exact sequence. Here is sanitized output from the console and from an ssh session I had to the machine.  Note that I have removed all lo interfaces, bge0, bge1, bge2, bge50000, and bge50001.  bge0 and bge1 behave just fine (they also had a failover address configured in /etc/hostname.bge0).  bge50000 and bge50001 were configured the same as bge49000 and bge49001 and had identical problems.  Hopefully this improves the signal-to-noise ratio.


Step 1: A fresh boot

Rebooting with command: boot                                          
Boot device: disk1:a  File and args:
SunOS Release 5.10 Version Generic_118822-25 64-bit
Copyright 1983-2005 Sun Microsystems, Inc.  All rights reserved.
Use is subject to license terms.
Hardware watchdog enabled
Hostname: myhostname
Apr 27 12:52:31 in.mpathd[150]: No test address configured on interface bge49001; disabling probe-based failure detection on it
Apr 27 12:52:31 in.mpathd[150]: No test address configured on interface bge49000; disabling probe-based failure detection on it
Apr 27 12:52:31 in.mpathd[150]: No test address configured on interface bge1; disabling probe-based failure detection on it
Apr 27 12:52:31 in.mpathd[150]: No test address configured on interface bge0; disabling probe-based failure detection on it

myhostname console login:

Step 2: Record /etc/hostname.*

Notice that there are no IP addresses configured on the VLAN interfaces, but the bge0/bge1 group is configured with an IP address.  As mentioned above, there are no problems with bge0 and bge1.

# /var/tmp/show-etc-hostname
==== Begin /etc/hostname.bge0 ====
group 10.62.48.0 up myhostname netmask + broadcast + up
==== End /etc/hostname.bge0 ====
==== Begin /etc/hostname.bge1 ====
group 10.62.48.0
==== End /etc/hostname.bge1 ====
==== Begin /etc/hostname.bge49000 ====
group 10.62.49.0
==== End /etc/hostname.bge49000 ====
==== Begin /etc/hostname.bge49001 ====
group 10.62.49.0
==== End /etc/hostname.bge49001 ====
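
The /var/tmp/show-etc-hostname script itself is not included in this message. For anyone who wants to collect the same listing, a minimal sketch that produces output in the same shape might look like the following. The function name and the optional directory argument are assumptions for illustration, not the original script.

```shell
# Hypothetical stand-in for /var/tmp/show-etc-hostname (the original is not shown).
# Prints every hostname.* file in a directory (default: /etc) between
# Begin/End markers.
show_etc_hostname() {
    dir=${1:-/etc}
    for f in "$dir"/hostname.*; do
        # If nothing matches, the pattern is left unexpanded; skip it.
        [ -f "$f" ] || continue
        echo "==== Begin $f ===="
        cat "$f"
        echo "==== End $f ===="
    done
}
```

Calling show_etc_hostname with no argument reads /etc; passing a directory makes it easy to try against a copy of the files.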

Step 3: Observe ifconfig -a

Notice that booting the zones placed the zones' failover IP addresses on virtual (logical) interfaces, bge49000:1 and bge49001:1.

# ifconfig -a
. . .
bge49000: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 5
        inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
        groupname 10.62.49.0
        ether 0:3:ba:99:17:17
bge49000:1: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 5
        zone lab-test-dev-01
        inet 10.62.49.101 netmask ffffff00 broadcast 10.62.49.255
bge49001: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 6
        inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
        groupname 10.62.49.0
        ether 0:3:ba:99:17:18
bge49001:1: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 6
        zone lab-test-dev-02
        inet 10.62.49.102 netmask ffffff00 broadcast 10.62.49.255

Step 4: Pull bge0

Console messages:

Apr 27 13:01:12 myhostname bge: NOTICE: bge0: link down
Apr 27 13:01:12 myhostname in.mpathd[150]: The link has gone down on bge49000
Apr 27 13:01:12 myhostname in.mpathd[150]: NIC failure detected on bge49000 of group 10.62.49.0
Apr 27 13:01:12 myhostname in.mpathd[150]: Successfully failed over from NIC bge49000 to NIC bge49001
Apr 27 13:01:12 myhostname in.mpathd[150]: The link has gone down on bge0
Apr 27 13:01:12 myhostname in.mpathd[150]: NIC failure detected on bge0 of group 10.62.48.0
Apr 27 13:01:12 myhostname in.mpathd[150]: Successfully failed over from NIC bge0 to NIC bge1

Interface configuration:

# ifconfig -a
bge49000: flags=219000802<BROADCAST,MULTICAST,IPv4,NOFAILOVER,FAILED,CoS> mtu 0 index 5
        inet 0.0.0.0 netmask 0
        groupname 10.62.49.0
        ether 0:3:ba:99:17:17
bge49001: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 6
        inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
        groupname 10.62.49.0
        ether 0:3:ba:99:17:18
bge49001:1: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 6
        zone lab-test-dev-02
        inet 10.62.49.102 netmask ffffff00 broadcast 10.62.49.255
bge49001:2: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 6
        inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
bge49001:3: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 6
        zone lab-test-dev-01
        inet 10.62.49.101 netmask ffffff00 broadcast 10.62.49.255

Step 4: Replace bge0

Console messages:

Notice that bge49001 seems to think it has a test address now.

Apr 27 13:05:36 myhostname bge: NOTICE: bge0: link up 1000Mbps Full-Duplex
Apr 27 13:05:36 myhostname in.mpathd[150]: The link has come up on bge49000
Apr 27 13:05:36 myhostname in.mpathd[150]: NIC repair detected on bge49000 of group 10.62.49.0
Apr 27 13:05:36 myhostname in.mpathd[150]: Successfully failed back to NIC bge49000
Apr 27 13:05:36 myhostname in.mpathd[150]: The link has come up on bge0
Apr 27 13:05:36 myhostname in.mpathd[150]: NIC repair detected on bge0 of group 10.62.48.0
Apr 27 13:05:36 myhostname in.mpathd[150]: Successfully failed back to NIC bge0
Apr 27 13:05:36 myhostname in.mpathd[150]: Test address 0.0.0.0 is not unique; disabling probe-based failure detection
Apr 27 13:05:36 myhostname last message repeated 1 time
Apr 27 13:05:56 myhostname in.mpathd[150]: Test address now configured on interface bge49001; enabling probe-based failure detection on it

Interface configuration:

However, there is no real IP address or IFF_NOFAILOVER flag set on bge49001.

# ifconfig -a
. . .
bge49000: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 5
        zone lab-test-dev-01
        inet 10.62.49.101 netmask ffffff00 broadcast 10.62.49.255
        groupname 10.62.49.0
        ether 0:3:ba:99:17:17
bge49001: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 6
        inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
        groupname 10.62.49.0
        ether 0:3:ba:99:17:18
bge49001:1: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 6
        zone lab-test-dev-02
        inet 10.62.49.102 netmask ffffff00 broadcast 10.62.49.255
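
To double-check flag claims like the one about IFF_NOFAILOVER, the hex flags= value can be decoded by hand. Below is a rough sketch in POSIX-style shell. The function name decode_flags is made up for illustration, and the bit values are inferred only from the flags/name pairs printed in this message, so any flag that does not appear here is simply not decoded.

```shell
# Decode the hexadecimal flags= value printed by ifconfig into flag names.
# Bit values are inferred from the flags=<...> pairs shown in this message;
# this is not a complete table of interface flags.
decode_flags() {
    value=$((0x$1))
    names=
    for pair in \
        1:UP 2:BROADCAST 40:RUNNING 800:MULTICAST \
        1000000:IPv4 8000000:NOFAILOVER 10000000:FAILED 200000000:CoS
    do
        bit=$((0x${pair%%:*}))          # hex bit value before the colon
        if [ $((value & bit)) -ne 0 ]; then
            names="${names:+$names,}${pair#*:}"   # flag name after the colon
        fi
    done
    echo "$names"
}
```

For example, decode_flags 201000843 prints UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS, which has no NOFAILOVER bit, while decode_flags 219000802 (bge49000 while the cable was pulled) prints BROADCAST,MULTICAST,IPv4,NOFAILOVER,FAILED,CoS.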

Step 5: Pull bge1

Console messages:

Apr 27 13:10:05 myhostname bge: NOTICE: bge1: link down
Apr 27 13:10:05 myhostname in.mpathd[150]: The link has gone down on bge49001
Apr 27 13:10:05 myhostname in.mpathd[150]: NIC failure detected on bge49001 of group 10.62.49.0
Apr 27 13:10:05 myhostname in.mpathd[150]: The link has gone down on bge1
Apr 27 13:10:05 myhostname in.mpathd[150]: NIC failure detected on bge1 of group 10.62.48.0
Apr 27 13:10:05 myhostname in.mpathd[150]: Successfully failed over from NIC bge1 to NIC bge0

Interface configuration:

# ifconfig -a
. . .
bge49000: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 5
        zone lab-test-dev-01
        inet 10.62.49.101 netmask ffffff00 broadcast 10.62.49.255
        groupname 10.62.49.0
        ether 0:3:ba:99:17:17
bge49001: flags=211000803<UP,BROADCAST,MULTICAST,IPv4,FAILED,CoS> mtu 1500 index 6
        inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
        groupname 10.62.49.0
        ether 0:3:ba:99:17:18
bge49001:1: flags=211000803<UP,BROADCAST,MULTICAST,IPv4,FAILED,CoS> mtu 1500 index 6
        zone lab-test-dev-02
        inet 10.62.49.102 netmask ffffff00 broadcast 10.62.49.255

Step 6: Replace bge1

Console messages:


Apr 27 13:13:04 myhostname bge: NOTICE: bge1: link up 1000Mbps Full-Duplex
Apr 27 13:13:04 myhostname in.mpathd[150]: The link has come up on bge49001
Apr 27 13:13:04 myhostname in.mpathd[150]: The link has come up on bge1
Apr 27 13:13:04 myhostname in.mpathd[150]: NIC repair detected on bge1 of group 10.62.48.0
Apr 27 13:13:04 myhostname in.mpathd[150]: Successfully failed back to NIC bge1
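
One way to spot an interface whose link came back without a matching repair is to compare the "link has come up" and "NIC repair detected" messages. The following is only a sketch: the function name is hypothetical, it assumes the message wording shown in this message, and it reads the log lines from standard input.

```shell
# List NICs for which in.mpathd logged "The link has come up" but no
# "NIC repair detected" in the log lines read from stdin.
# Assumes the message wording shown in this message.
link_up_without_repair() {
    awk '
        /The link has come up on / { up[$NF] = 1 }
        /NIC repair detected on / {
            # The NIC name is the field right after the word "on".
            for (i = 1; i < NF; i++)
                if ($i == "on") { repaired[$(i + 1)] = 1; break }
        }
        END {
            for (nic in up)
                if (!(nic in repaired)) print nic
        }
    ' | sort
}
```

Fed the five console messages above, it prints bge49001; fed the messages from the bge0 replacement, it prints nothing.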

Interface configuration:

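Notice that bge49001 and bge49001:1 are still FAILED even though the link is up and RUNNING is set.  in.mpathd never logged a repair for bge49001.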

# ifconfig -a
. . .
bge49000: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 5
        zone lab-test-dev-01
        inet 10.62.49.101 netmask ffffff00 broadcast 10.62.49.255
        groupname 10.62.49.0
        ether 0:3:ba:99:17:17
bge49001: flags=211000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,FAILED,CoS> mtu 1500 index 6
        inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
        groupname 10.62.49.0
        ether 0:3:ba:99:17:18
bge49001:1: flags=211000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,FAILED,CoS> mtu 1500 index 6
        zone lab-test-dev-02
        inet 10.62.49.102 netmask ffffff00 broadcast 10.62.49.255
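
The stuck state can also be picked out of ifconfig output mechanically by looking for interfaces that report RUNNING and FAILED together. This is a sketch under stated assumptions: the function name is invented for illustration, and it relies only on the ifconfig output layout shown in this message.

```shell
# Read "ifconfig -a" output on stdin and print every interface (physical or
# logical) whose flag list contains both RUNNING and FAILED.
failed_but_running() {
    awk '
        # Interface lines start in column one: "name: flags=...<LIST> ..."
        /^[^ \t]/ && /flags=/ {
            name = $1
            sub(/:$/, "", name)          # drop the trailing colon
            list = $0
            sub(/^[^<]*</, "", list)     # keep only the text inside <...>
            sub(/>.*$/, "", list)
            gsub(/ /, "", list)          # tolerate stray spaces in the list
            list = "," list ","
            if (index(list, ",RUNNING,") && index(list, ",FAILED,"))
                print name
        }
    '
}
```

Run as ifconfig -a | failed_but_running, it would be expected to list bge49001 and bge49001:1 for the output above, and nothing for the output captured while the bge1 cable was still pulled, because RUNNING is clear there.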



--
Mike Gerdts
http://mgerdts.blogspot.com/
_______________________________________________
zones-discuss mailing list
zones-discuss@opensolaris.org