Re: [Xen-API] Recovery of lost NFS VHD storage

George Shuklin Tue, 16 Apr 2013 15:16:21 -0700

The simplest way is to reboot all hosts in the pool.

If you don't want to do this, there is a very slow and annoying way to fix the problem by hand without a global reboot.

You need to kill all domains (xc.domain_shutdown()), and then kill every tapdisk process related to VHD files from NFS. This is going to be very hard because of the lagging NFS, but eventually they all die. After that you can unplug/plug every PBD for the damaged SR and reset the locks with /opt/xensource/sm/resetvdis.py
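
A rough sketch of the tapdisk and PBD steps in Python, for illustration only. The function names are made up for this example, and the mount point /var/run/sr-mount/<SR uuid> is an assumption about where the NFS SR is mounted on your hosts. Reading the /proc/<pid>/fd symlinks should not touch the dead NFS mount itself, which is why the sketch avoids tools that stat the files. The PBD part only builds the xe pbd-unplug / xe pbd-plug command lines and does not run them. Domain shutdown and resetvdis.py are left out on purpose; check the script's own usage text for its arguments.

```python
import os


def scan_proc(proc_root="/proc"):
    """Map PID -> list of open file paths, read from /proc/<pid>/fd symlinks."""
    result = {}
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue
        fd_dir = os.path.join(proc_root, name, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission
        paths = []
        for fd in fds:
            try:
                paths.append(os.readlink(os.path.join(fd_dir, fd)))
            except OSError:
                pass  # descriptor was closed in the meantime
        result[int(name)] = paths
    return result


def pids_holding_sr_files(open_files_by_pid, sr_mount):
    """Return sorted PIDs that have at least one open file under sr_mount."""
    prefix = sr_mount.rstrip("/") + "/"
    pids = []
    for pid, paths in open_files_by_pid.items():
        for path in paths:
            if path.startswith(prefix):
                pids.append(pid)
                break
    return sorted(pids)


def pbd_replug_commands(pbd_uuids):
    """Build (but do not run) unplug-then-plug xe commands for each PBD."""
    commands = []
    for uuid in pbd_uuids:
        commands.append(["xe", "pbd-unplug", "uuid=" + uuid])
        commands.append(["xe", "pbd-plug", "uuid=" + uuid])
    return commands
```

On a host you would call pids_holding_sr_files(scan_proc(), "/var/run/sr-mount/<SR uuid>"), check that the listed PIDs really are tapdisk processes, and only then kill them. Review the generated xe commands before running them by hand.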


If this is too hard to do, reboot 'em all.

PS The most brutal way to reboot a host is to execute this command:

echo b >/proc/sysrq-trigger

no sync, no grace, no mercy, no shame, no delay. Just reboot.


On 15.04.2013 19:03, Michael Vistein wrote:

Hi,
this morning I encountered a problem with XCP 1.6 in relation to an NFS storage repository.
We are using a hardware pool with three identical servers, all of which access a shared NFS VHD storage repository on an external NFS server. This morning the NFS server crashed, so all VMs lost their hard drives and were more or less hanging.
What is the official recovery method in this case? XenCenter still showed the SR as “connected”, but a rescan of the SR failed. Directly on the console of the XCP host I could not cd into the mountpoint due to “Stale NFS handle”. I wasn’t able to unmount or remount the SR because of open files from the still-running VMs. Shutting down or migrating VMs of course wasn’t possible either.
The only solution I found was a hard reboot of all servers in the pool. Is there a better way to handle such a problem?
Thanks in advance,

Michael



_______________________________________________
Xen-api mailing list
[email protected]
http://lists.xen.org/cgi-bin/mailman/listinfo/xen-api
