On Sep 14, 2007  08:52 -0600, Mark Maybee wrote:
> >Without knowing the details, it would seem at first glance that
> >having variable dnode size would be fairly complex.  Aren't the
> >dnodes just stored in a single sparse object and accessed by
> >dnode_size * objid?  This does seem desirable from the POV that
> >if you have an existing fs with the current dnode size you don't
> >want to need a reformat in order to use the larger size.
>
> I was referring here to supporting multiple dnode sizes within a
> *pool*, but the size would still remain fixed for a given dataset
> (see Bill's mail).  This is a much simpler concept to implement.

Ah, sure.  That would be a lot easier to implement.
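
As a rough sketch of why a per-dataset size keeps the lookup simple: the
offset calculation stays a single multiply (or shift), with the size read
from the dataset instead of a global constant.  The struct and function
names below are hypothetical and are not taken from the DMU code; the
512-byte minimum is an assumption based on the current dnode size.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-dataset setting; not an actual DMU structure. */
struct dataset_cfg {
    uint32_t dnode_shift;   /* log2 of the dnode size, e.g. 9 for 512 bytes */
};

/*
 * Byte offset of object `objid` within the dataset's sparse dnode object,
 * i.e. objid * dnode_size, where dnode_size is fixed for the whole dataset.
 */
static uint64_t
dnode_offset(const struct dataset_cfg *ds, uint64_t objid)
{
    /* Assumed bounds: at least 512 bytes, and a shift that cannot overflow. */
    assert(ds->dnode_shift >= 9 && ds->dnode_shift < 32);
    return objid << ds->dnode_shift;
}
```

With 512-byte dnodes, object 7 would sit at byte offset 3584; in a dataset
using 1024-byte dnodes the same object number would sit at 7168.  Existing
datasets keep their old shift, so no reformat is needed.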

> >That is true, and we discussed this internally, but one of the internal
> >requirements we have for DMU usage is that it create an on-disk layout
> >that matches ZFS so that it is possible to mount a Lustre filesystem
> >via ZFS or ZFS-FUSE (and potentially the reverse in the future).
> >This will allow us to do problem diagnosis and also leverage any ZFS
> >scanning/verification tools that may be developed.
>
> Ah, interesting, I was not aware of this requirement.  It would not be
> difficult to allow the ZPL to work with a larger dnode size (in fact
> it's pretty much a no-op as long as the ZPL is not trying to use any of
> the extra space in the dnode).

I agree, but I suspect large dnodes could also be of use to ZFS at
some point, either for fast EAs and/or small files, so we wanted to
get some buy-in from the ZFS developers on an approach that would
be suitable for ZFS also.  In particular, being able to use the larger
dnode space for a variety of reasons (more elements in dn_blkptr[],
small file data, fast EA space) is much more desirable than a Lustre-only
implementation.
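
To put rough numbers on that trade-off, here is a small sketch of the
space arithmetic.  The constants are my understanding of the current
layout (a 64-byte fixed dnode header and 128-byte block pointers) and
should be treated as assumptions; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout constants; check against the real on-disk format. */
#define DN_CORE_BYTES   64u    /* fixed dnode header */
#define BLKPTR_BYTES    128u   /* one block pointer */

/* Space left for bonus data (EAs, small file data) with `nblkptr` pointers. */
static uint32_t
bonus_space(uint32_t dnode_size, uint32_t nblkptr)
{
    uint32_t used = DN_CORE_BYTES + nblkptr * BLKPTR_BYTES;
    assert(nblkptr >= 1 && used <= dnode_size);
    return dnode_size - used;
}

/* Number of block pointers that fit once `bonus_len` bytes are reserved. */
static uint32_t
max_blkptrs(uint32_t dnode_size, uint32_t bonus_len)
{
    assert(DN_CORE_BYTES + bonus_len <= dnode_size);
    return (dnode_size - DN_CORE_BYTES - bonus_len) / BLKPTR_BYTES;
}
```

Under those assumptions a 512-byte dnode with one block pointer leaves
320 bytes of bonus space, while a 1024-byte dnode leaves 832 bytes.
Alternatively, a 1024-byte dnode that keeps a 320-byte bonus area has
room for 5 block pointers.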

Also, being able to access the EAs via the ZPL when the filesystem is
mounted as ZFS would be important for debugging/backup/restore/etc.

I suspect the Lustre development approach would be the same with ZFS
as it is with ext3, which has been quite successful to this point.
Namely, we're happy to develop new functionality in ZFS/DMU as needed
so long as we get buy-in from the ZFS team on the design and most
importantly the on-disk format.  We don't want to create a permanent
fork in the code or on-disk format that separates Lustre-ZFS from
Solaris-ZFS, which is the whole point of starting this discussion long
before we're going to start implementing anything.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

