On Thu, Apr 26, 2012 at 12:37 PM, Richard Elling <richard.ell...@gmail.com> wrote:
> [...]
NFSv4 has had migration in the protocol (excluding protocols between servers) from the get-go, but a lot was missing (FedFS) and it was not implemented until recently. I've no idea which clients and servers support it adequately besides Solaris 11, though that's my own fault for not being informed. It has taken over a decade to get to where we have any implementations of NFSv4 migration.

>> For me one of the exciting things about Lustre was/is the idea that
>> you could just have a single volume where all new data (and metadata)
>> is distributed evenly as you go. Need more storage? Plug it in,
>> either to an existing head or via a new head, then flip a switch and
>> there it is. No need to manage allocation. Migration may still be
>> needed, both within a cluster and between clusters, but that's much
>> more manageable when you have a protocol where data locations can be
>> all over the place in a completely transparent manner.
>
> Many distributed file systems do this, at the cost of being not quite
> POSIX-ish.

Well, Lustre does POSIX semantics just fine, including cache coherency (as opposed to NFS' close-to-open coherency, which is decidedly non-POSIX).

> In the brave new world of storage vmotion, nosql, and distributed object
> stores, it is not clear to me that coding to a POSIX file system is a
> strong requirement.

Well, I don't quite agree. I'm very suspicious of eventually-consistent. I'm not saying that the enormous DBs that eBay and such run should sport SQL and ACID semantics -- I'm saying that I think we can do much better than eventually-consistent (and no-language) without paying the steep price that ACID requires. I'm not alone in this either.

The trick is to find the right compromise. Close-to-open semantics work out fine for NFS, but O_APPEND is too wonderful not to have (ditto O_EXCL, which NFSv2 did not have; v4 has O_EXCL, but not O_APPEND).

Whoever first delivers the right compromise in distributed DB semantics stands to make a fortune.
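
For readers less familiar with the two open(2) flags mentioned above, here is a minimal sketch in Python of what they buy you. It assumes a local POSIX file system; the helper names (create_exclusive, append_record, demo) are made up for illustration. O_EXCL makes "create only if absent" a single atomic step, and O_APPEND makes every write land at the current end of file. Over NFS the append guarantee does not hold across clients, since the protocol has no append operation, which is the gap being lamented.

```python
import os
import tempfile


def create_exclusive(path):
    """Return True if this call created the file, False if it already existed.

    O_CREAT | O_EXCL makes the existence check and the creation one atomic
    step, so two racing callers cannot both believe they created the file.
    """
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def append_record(path, record):
    """Append bytes to an existing file using O_APPEND.

    With O_APPEND the kernel positions each write at the current end of
    file, so the caller never has to seek or know the file size.
    """
    fd = os.open(path, os.O_WRONLY | os.O_APPEND)
    try:
        os.write(fd, record)
    finally:
        os.close(fd)


def demo():
    """Exercise both flags in a scratch directory (hypothetical helper)."""
    with tempfile.TemporaryDirectory() as scratch:
        path = os.path.join(scratch, "log")
        first = create_exclusive(path)   # file is new
        second = create_exclusive(path)  # file now exists
        append_record(path, b"a\n")
        append_record(path, b"b\n")
        with open(path, "rb") as f:
            return first, second, f.read()
```

Calling demo() on a local file system should show the second exclusive create being refused and the two records stored in the order they were appended.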
> Perhaps people are so tainted by experiences with v2 and v3 that we can
> explain the non-migration to v4 as being due to poor marketing? As a
> leader of NFS, Sun had unimpressive marketing.

Sun did not do too much to improve NFS in the 90s, not compared to the v4 work that only really started paying off recently. And since Sun had lost the client space by then, having the best server doesn't mean all that much if the clients can't take advantage of the server's best features for lack of client implementation.

Basically, Sun's ZFS, DTrace, SMF, NFSv4, Zones, and other amazing innovations came a few years too late to make up for the awful management that Sun was saddled with. But for all the decidedly awful things Sun management did (or didn't do), the worst was terminating Sun PS (yes, worse than all the non-marketing, poor marketing, poor acquisitions, poor strategy, and all the rest, including truly epic mistakes like icing Solaris on x86 a decade ago).

One of the worst outcomes of the Sun debacle is that now there's a bevy of senior execs who think the worst thing Sun did was to open source Solaris and Java -- which isn't to say that Sun should have open sourced as much as it did, or that open source is an end in itself, but that open sourcing these things was a legitimate business tool with very specific goals in mind in each case, and which had nothing to do with the sinking of the company. Or maybe that's one of the best outcomes, because the good news about it is that those who learn the right lessons (in this case: that open source is a legitimate business tool that is sometimes, often even, a great mind-share building tool) will be in the minority, and thus will have a huge advantage over their competition. That's another thing Sun did not learn until it was too late: mind-share matters enormously to a software company.
Nico