> Two- and three-node clusters with SC 3.2 and S10u3 (120011-14).
> If a node is rebooted when using SCSI-3 PGR, the node is not able to take
> the zpool via HAStoragePlus due to a reservation conflict.
> SCSI-2 PGRE is okay.
> Using the same SAN LUNs in a metaset (SVM) with HAStoragePlus works okay
> with both PGR and PGRE (both SMI- and EFI-labeled disks).
> 
> If I use scshutdown and then restart all nodes, it works.
> Also (interesting): if I reboot a node and then run "update_drv -f ssd",
> the node is able to take SCSI-3 PGR zpools.
> 
> Is this a storage issue or a Solaris/Cluster issue?
> What are the differences between SVM and ZFS from the ssd point of view
> in this case?
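
For reference, the reboot workaround described above would look roughly like
this; the resource-group and node names are placeholders, not taken from the
poster's setup, and I haven't verified that it clears the reservation
conflict in every case:

    # after the rebooted node has rejoined the cluster
    update_drv -f ssd                          # force ssd to re-read its configuration
    scswitch -z -g <zpool-rg> -h <this-node>   # bring the HAStoragePlus group online here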

I had a similar problem with a two-node x86 cluster running s10u3 (no extra 
patches) + SC 3.2 (HA-NFS + ZFS). When I rebooted the node that owned the quorum 
device, the surviving node would panic with:

panic[cpu0]/thread=ffffffff9cec71a0: CMM: Cluster lost operational quorum; aborting.

fffffe80021f5b50 genunix:vcmn_err+13 ()
fffffe80021f5b60 cl_runtime:__1cZsc_syslog_msg_log_no_args6FpviipkcpnR__va_list_element__nZsc_syslog_msg_status_enum__+24 ()
fffffe80021f5c40 cl_runtime:__1cCosNsc_syslog_msgDlog6MiipkcE_nZsc_syslog_msg_status_enum__+9d ()
fffffe80021f5e20 cl_haci:__1cOautomaton_implbAstate_machine_qcheck_state6M_nVcmm_automaton_event_t__+3bc ()
fffffe80021f5e60 cl_haci:__1cIcmm_implStransitions_thread6M_v_+de ()
fffffe80021f5e70 cl_haci:__1cIcmm_implYtransitions_thread_start6Fpv_v_+b ()
fffffe80021f5ed0 cl_orb:cllwpwrapper+106 ()
fffffe80021f5ee0 unix:thread_start+8 ()
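
If it is the same problem, it may help to check which node currently holds
the quorum device before rebooting anything; "scstat -q" lists the quorum
devices and vote counts. That only avoids rebooting the quorum owner first,
it isn't a fix.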

It appears you have SPARC clusters (based on the patch mentioned above), so 
this may not be the same problem. Once s10u4 was released, I reinstalled each 
node and failed to reproduce the problem. 

I didn't try the SVM + ZFS combo.
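
For comparison, the two setups differ mainly in which HAStoragePlus extension
property hands over the storage; the resource and group names below are made
up, and SUNW.HAStoragePlus is assumed to be registered already:

    # ZFS: HAStoragePlus imports/exports the pool itself
    scrgadm -a -j zfs-hasp-rs -g nfs-rg -t SUNW.HAStoragePlus -x Zpools=tank

    # SVM: the metaset device is mounted from vfstab by the resource
    scrgadm -a -j svm-hasp-rs -g nfs-rg -t SUNW.HAStoragePlus \
        -x FilesystemMountPoints=/global/nfs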

Rob
 
 