2011-10-14 19:33, Tim Cook wrote:
With clustered VMFS on shared storage, VMware can
migrate VMs faster - it knows not to copy the HDD image
file in vain - it will be equally available to the "new host"
at the correct point in migration, just as it was accessible
to the "old host".
Again. NFS/iSCSI/IB = ok.
True, except that this is not an optimal solution for the use case
described here - a farm of server blades with relatively dumb, fast
raw storage (but NOT an intelligent storage server).
The idea is you would dedicate one of the servers in the chassis to be
a Solaris system, which then presents NFS out to the rest of the
hosts. From the chassis itself you would present every drive that
isn't being used to boot an existing server to this solaris host as
individual disks, and let that server take care of RAID and presenting
out the storage to the rest of the VMware hosts.
Yes, I wrote of that as an option - but a relatively poor one
(though for now we're limited to doing this). As I have written
several times, the major downsides are:
* probably increased latency due to another added hop
of processing delays, just as with extra switches and
routers in networks;
* probably reduced bandwidth of LAN as compared to
direct disk access; certainly it won't get increased ;)
Besides, the LAN may be (highly) utilized by servers
running in VMs or physical blades, so storage traffic
over LAN would compete with real networking and/or
add to latencies.
* in order for the whole chassis to provide HA services
and run highly-available VMs, the storage servers have
to be redundant - at least one other blade would have
to be provisioned for failover ZFS import and serving
for other nodes.
This is not exactly a showstopper - but the "spare" blade
would either have to run no VMs at all, or run fewer VMs
than the others, and in case of a pool failover event it would
probably have to migrate its running VMs away in order to
free RAM for a larger ARC and reduce storage latency for the
other servers.
That's doable, and automatable, but a hassle nonetheless.
Also I'm not certain how well other hosts can benefit from
caching in their local RAM when using NFS or iSCSI
resources. I think they might benefit more from local
ARCs if the pool were directly imported to each of them...
* this already works, and reliably, like any other ZFS NAS
solution. That's a certain "plus" :)
In this current case one or two out of six blades should be
dedicated to storage, leaving only 4 or 5 to VMs.
In case of shared pools, there is a new problem of
TXG-master failover to some other node (which would
probably be no slower than a pool reimport is now), but
otherwise all six servers' loads are balanced. And they
only cache what they really need. And they have faster
disk access times. And they don't use LAN superfluously
for storage access.
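
To put rough numbers on the blade count comparison above, here is
a minimal Python sketch. It only counts blades; the function name
vm_capacity is made up for illustration, and it assumes that all
blades are identical and that a dedicated storage blade runs no VMs
at all. It says nothing about latency or bandwidth.

```python
def vm_capacity(total_blades, storage_blades):
    """Return (blades left for VMs, fraction of the chassis they represent).

    Assumes all blades are identical and that a dedicated storage blade
    runs no VMs at all (a simplification; a standby blade could run a few).
    """
    if not 0 <= storage_blades <= total_blades:
        raise ValueError("storage_blades must be between 0 and total_blades")
    vm_blades = total_blades - storage_blades
    return vm_blades, vm_blades / total_blades


if __name__ == "__main__":
    # Six-blade chassis, as in the case discussed here.
    scenarios = [
        ("single NFS head", 1),
        ("NFS head + failover blade", 2),
        ("shared pool on every blade", 0),
    ]
    for label, storage in scenarios:
        blades, share = vm_capacity(6, storage)
        print(f"{label}: {blades} blades for VMs ({share:.0%} of chassis)")
```

With one or two storage heads this leaves 5 or 4 blades for VMs,
versus all 6 in the shared-pool case.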
PS: Anyway, I wanted to say this earlier - thanks to everyone
who responded, even (or especially) with criticism and
requests for more detail. If nothing else, you helped me
describe my idea better and less ambiguously, so that
some other thinkers can decide whether and how to
implement it ;)
PPS: When I earlier asked about getting ZFS under the
hood of RAID controllers, I guess I kinda wished to
replace the black box of Intel's firmware with a ZFS-aware
OS (FreeBSD probably) - the storage controller modules
must be some sort of computers running in a failover link...
These SCMs would then export datasets as SAS LUNs
to specific servers, as is done now, and possibly would
not require clustered ZFS - but might benefit from it too.
So my MFSYS illustration is partially relevant for that
question as well...
zfs-discuss mailing list