> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Tim Cook
>
> In my example - probably not a completely clustered FS.
> A clustered ZFS pool with datasets individually owned by
> specific nodes at any given time would suffice for such
> VM farms. This would give users the benefits of ZFS
> (resilience, snapshots and clones, shared free space)
> merged with the speed of direct disk access instead of
> lagging through a storage server accessing these disks.
I think I see a couple of points of disconnect.

#1 - You seem to be assuming storage is slower when it's on a remote storage server than on a local disk. While this is typically true over ethernet, it's not necessarily true over infiniband or fibre channel. That said, I don't want to assume everyone should be shoe-horned into infiniband or fibre channel; both have significant downsides, such as cost, centralization of the storage, a single point of failure, and so on. So there is some ground to be gained here: saving cost, and/or increasing workload distribution and scalability. One size doesn't fit all, and I like the fact that you're thinking of something different.

#2 - You're talking about a clustered FS, but the characteristics you describe are closer to a distributed filesystem. In a clustered FS, you have something like a LUN on a SAN: a raw device simultaneously mounted by multiple OSes. In a distributed FS, such as Lustre, data is spread with a configurable level of redundancy (maybe zero) across multiple systems (maybe all), and all hosts share the same namespace. So each system doing heavy IO works at local disk speed, but any system trying to access data that was created by another system must access that data remotely.

If the goal is to do something like VMotion, including the storage, then doing VMotion would be largely pointless if the VM storage still remained on the node that was previously the compute head. So let's imagine for a moment that you have two systems connected directly to each other over infiniband, or any bus whose remote performance is the same as local performance. You build a zpool mirror using the local disk and the remote disk. Then you should (theoretically) be able to do something like VMotion from one system to the other, and kill the original system.
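To make that two-node scenario concrete, here's a rough sketch of what it could look like on illumos/Solaris, using COMSTAR to export node B's disk and the Solaris iSCSI initiator on node A. All device names, the GUID-based LUN name, and the 192.168.1.2 address are made-up placeholders, not a tested recipe; over IB you'd want iSER or SRP rather than plain iSCSI to get the bus speed this thread assumes.

    # --- On node B: export a local disk as a LUN (COMSTAR) ---
    # itadm create-target
    # sbdadm create-lu /dev/rdsk/c0t1d0
    # stmfadm add-view <LU GUID>

    # --- On node A: discover the remote LUN, then mirror local + remote ---
    iscsiadm add discovery-address 192.168.1.2      # node B (assumed address)
    iscsiadm modify discovery --sendtargets enable

    # local disk c0t0d0 mirrored with the remote LUN (placeholder device name)
    zpool create vmpool mirror c0t0d0 c0t600144F0XXXXXXXXd0

    # --- Graceful migration: move the pool to node B ---
    zpool export vmpool      # on node A
    zpool import vmpool      # on node B

    # --- Node A died ungracefully: force the import on node B ---
    zpool import -f vmpool

Because every block is mirrored onto node B's disk, the surviving node has a complete copy of the pool, which is why the worst case in the ungraceful scenario is just an unclean reboot of the VM.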
Even if the original system dies ungracefully and the VM dies with it, you can still boot the VM on the second system, and the only loss you've suffered is an ungraceful reboot. If you do the same thing over ethernet, the performance will be degraded to ethernet speeds. So take it for granted: no matter what you do, you either need a bus that performs just as well remotely as locally... or else performance will be degraded... or else it's kind of pointless, because the VM storage lives only on the system that you want to VMotion away from.

_______________________________________________
zfs-discuss mailing list
email@example.com
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss