Hello can,

Tuesday, December 11, 2007, 6:57:43 PM, you wrote:

>> Monday, December 10, 2007, 3:35:27 AM, you wrote:
>> 
>> >> and it made them slower
>> 
>> cyg> That's the second time you've claimed that, so you'll really at
>> cyg> least have to describe *how* you measured this even if the
>> cyg> detailed results of those measurements may be lost in the mists of time.
>> 
>> 
>> cyg> So far you don't really have much of a position to defend at
>> cyg> all:  rather, you sound like a lot of the disgruntled TOPS users
>> cyg> of that era.  Not that they didn't have good reasons to feel
>> cyg> disgruntled - but they frequently weren't very careful about aiming 
>> cyg> their ire accurately.
>> 
>> cyg> Given that RMS really was *capable* of coming very close to the
>> cyg> performance capabilities of the underlying hardware, your
>> cyg> allegations just don't ring true.  Not being able to jump into
>> 
>> And where is your "proof" that it "was capable of coming very close to
>> the..."?

cyg> It's simple:  I *know* it, because I worked *with*, and *on*, it
cyg> - for many years.  So when some bozo who worked with people with
cyg> a major known chip on their shoulder over two decades ago comes
cyg> along and knocks its capabilities, asking for specifics (not even
cyg> hard evidence, just specific allegations which could be evaluated
cyg> and if appropriate confronted) is hardly unreasonable.

Bill, you openly criticize people (and their work) who have worked on
ZFS for years... not that there's anything wrong with that, just please
realize that your having worked on something doesn't mean it is/was
perfect - just the same as with ZFS.
I know, everyone loves their baby...

Nevertheless, just because you were working on and with it, that's not
proof. The person you were replying to was also working with it (though
not on it, I guess). Not that I'm interested in such proof. I just
noticed that you're demanding proof while you yourself just write
statements about its performance without any actual proof.



>> Let me use your own words:
>> 
>> "In other words, you've got nothing, but you'd like people to believe it's 
>> something.
>> 
>> The phrase "Put up or shut up" comes to mind."
>> 
>> Where are your proofs on some of your claims about ZFS?

cyg> Well, aside from the fact that anyone with even half a clue
cyg> knows what the effects of uncontrolled file fragmentation are on
cyg> sequential access performance (and can even estimate those
cyg> effects within moderately small error bounds if they know what
cyg> the disk characteristics are and how bad the fragmentation is),
cyg> if you're looking for additional evidence that even someone
cyg> otherwise totally ignorant could appreciate there's the fact that

I've never said there are no fragmentation problems with ZFS.
Well, actually I've been hit by the issue in one environment.
Also, you haven't done your homework properly, as one of the ZFS
developers actually stated they are going to work on ZFS
de-fragmentation and disk removal (pool shrinking).
See http://www.opensolaris.org/jive/thread.jspa?messageID=139680&#139680
Lukasz happens to be a friend of mine who is working with that same
environment.

The point is, and you as a long-time developer (I guess) should know
it: you can't have everything done at once (lack of resources, and it
takes some time anyway), so you must prioritize. ZFS is open source,
and if someone thinks a given feature is more important than another,
he/she should try to fix it, or at least voice it here, so ZFS
developers can adjust their priorities if there's a good enough,
justified demand.

Now the important part - quite a lot of people are using ZFS: on
desktops, on laptops, in small to big production environments,
clustered environments, SAN environments, JBODs, entry-level to
high-end arrays, with different applications, workloads, etc. And
somehow you can't find many complaints about ZFS fragmentation. That
doesn't mean the problem doesn't exist (I know it first hand) - it
means that, for whatever reason, for most people using ZFS it's not a
big problem, if a problem at all. They do have other issues, however,
and many of those have already been addressed or are being addressed.
I would say that ZFS developers at least try to listen to the
community.

Why am I asking for proof? Well, given the constraints on resources, I
would say we (not that I'm a ZFS developer) should focus on the actual
problems people have with ZFS rather than theoretical ones (which will
show up in some environments/workloads and will sooner or later have
to be addressed too).

Then you find people like Pawel Jakub Dawidek (the guy who ported ZFS
to FreeBSD) who started experimenting with a RAID-5-like implementation
in ZFS - he even provided some numbers showing it might be worth
looking at. That's what community is about.

I don't see any point in complaining about ZFS over and over again -
have you actually run into the problem with ZFS yourself? I guess not.
You're just assuming (correctly for some usage cases). I guess your
message has been well heard. Since you're not interested in anything
more than bashing and complaining all the time about the same
theoretical "issues", rather than contributing somehow (even by
providing some test results which could be reproduced), I wouldn't
wait for any positive feedback if I were you - anyway, what kind of
feedback are you waiting for?


cyg> Last I knew, ZFS was still claiming that it needed nothing like
cyg> defragmentation, while describing write allocation mechanisms
cyg> that could allow disastrous degrees of fragmentation under
cyg> conditions that I've described quite clearly.

Well, I haven't talked to ZFS (yet) so I don't know what he claims :))
If you are talking about the ZFS developers, then you can actually find
some evidence that they do see the problem and want to work on it.
Again, see for example:
http://www.opensolaris.org/jive/thread.jspa?messageID=139680&#139680
Bill, at least look at the list archives first.

And again, "under conditions that I've described quite clearly." -
that's exactly the problem. You've just described something while
others do have actual and real problems which should be addressed
first.


cyg> If ZFS made no
cyg> efforts whatsoever in this respect the potential for unacceptable
cyg> performance would probably already have been obvious even to its
cyg> blindest supporters,

Well, is it really so hard to understand that a lot of people use ZFS
because it actually solves their problems? No matter what scenarios
you find to theoretically show some of ZFS's weaker points, in the end
what matters is whether it solves customer problems. And for many
users it does - definitely not for all of them.
I would argue that no matter what file system you test or even design,
one can always find corner cases where it behaves less than optimally.
For a general-purpose file system what matters is that in the most
common cases it's good enough.


cyg> willy-nilly in its batch disk writes) - and a lot of the time
cyg> there's probably not enough other system write activity to make
cyg> this infeasible, so that people haven't found sequential
cyg> streaming performance to be all that bad most of the time
cyg> (especially on the read end if their systems are lightly loaded
cyg> and the fact that their disks may be working a lot harder
cyg> than they ought to have to is not a problem).

Now you're closer to the point. If the problem you are describing does
not hit most people, we should put more effort into solving the
problems people are actually experiencing.


cyg> Then there's RAID-Z, which smears individual blocks across
cyg> multiple disks in a manner that makes small-to-medium random
cyg> access throughput suck.  Again, this is simple logic and physics:
cyg> if you understand the layout and the disk characteristics, you
cyg> can predict the effects on a heavily parallel workload with
cyg> fairly decent accuracy (I think that Roch mentioned this casually
cyg> at one point, so it's hardly controversial, and I remember
cyg> reading a comment by Jeff Bonwick that he was pleased with the
cyg> result of one benchmark - which made no effort to demonstrate the
cyg> worst case - because the throughput penalty was 'only' a factor
cyg> of 2 rather than the full factor of N).

Yeah, nothing really new here. If you need a guy from Sun to say it,
read Roch's post on RAID-Z performance - it's nothing you've
discovered. Nevertheless, RAID-Z[2] is good enough for many people.
I know that simple logic and physics say the relativity equations
provide better accuracy than Newton's - nevertheless, in most scenarios
I'm dealing with, it doesn't really matter from a practical point of
view.
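To make the quoted "factor of N" concrete, here's a rough
back-of-envelope (my own illustrative numbers, not Roch's): in a
6-disk RAID-Z group every block is spread across the 5 data disks, so
a small random read has to touch all of them. If each disk can do
~100 random IOPS, the whole group delivers roughly 100 small reads/s,
while a RAID-5 layout that keeps each block on a single disk could
serve ~6 readers in parallel, i.e. up to ~600 reads/s. Nothing
controversial - just geometry.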

Then, in some environments RAID-Z2 (on JBOD) actually provides better
performance than RAID-5 (and HW RAID-5 for that matter). And, unlike
you, I'm not speculating - I've been working with such an environment
(lots of concurrent writes, which are more critical than the much less
frequent reads later).
So when you say that RAID-Z is brain-damaged - well, it's the mostly
positive experience of a lot of people with RAID-Z vs. your statement
without any real-world backing.
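And that's not magic either, by the same simple logic and physics (my
sketch of the mechanism, as I understand it): a classic RAID-5 small
write pays a read-modify-write penalty - read old data, read old
parity, write new data, write new parity, so roughly 4 I/Os per small
write - while RAID-Z always writes full, variable-width stripes, so a
burst of small writes plus parity streams out more or less
sequentially without any prior reads.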

Then, of course, one can produce a test or even a real-world
environment with lots of small, simultaneous reads, without any
writes, from a dataset much bigger than available memory, where RAID-Z
would be much slower than RAID-5. Not that it's novel - it has been
well discussed ever since RAID-Z was introduced. That's one of the
reasons people use ZFS on top of HW RAID-5 LUNs from time to time.


cyg> And the way ZFS aparently dropped the ball on its alleged
cyg> elimination of any kind of 'volume management' by requiring that
cyg> users create explicit (and matched) aggregations of disks to
cyg> support mirroring and RAID-Z.

# mkfile 128m f1 ; mkfile 128m f2 ; mkfile 256m f3 ; mkfile 256m f4
# zpool create bill mirror /var/tmp/f1 /var/tmp/f2 mirror /var/tmp/f3 /var/tmp/f4
# zpool list
NAME                    SIZE    USED   AVAIL    CAP  HEALTH     ALTROOT
bill                    373M     90K    373M     0%  ONLINE     -
#
# mkfile 128m f11 ; mkfile 256m f44
# zpool destroy bill
# zpool create bill raidz /var/tmp/f11 /var/tmp/f1 /var/tmp/f2 raidz /var/tmp/f3 /var/tmp/f4 /var/tmp/f44
# zfs list
NAME   USED  AVAIL  REFER  MOUNTPOINT
bill   101K   715M  32.6K  /bill
#
(2*128 + 2*256 = 768M expected from the two 3-device raidz groups, so
715M available after overhead looks fine.)
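And to show the corner case Bill may be after (a sketch from memory -
the exact warning text and reported sizes may differ between
releases): if you put differently-sized devices into a single mirror,
zpool warns you, and with -f it simply limits the vdev to the smallest
device:

# zpool create bill mirror /var/tmp/f1 /var/tmp/f3
invalid vdev specification
use '-f' to override the following errors:
mirror contains devices of different sizes
# zpool create -f bill mirror /var/tmp/f1 /var/tmp/f3
# zpool list
NAME                    SIZE    USED   AVAIL    CAP  HEALTH     ALTROOT
bill                    123M     90K    123M     0%  ONLINE     -

So you don't get 100% utilization of the bigger device, but you always
get the protection you asked for.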

If you are talking about a solution which lets the user mix different
disk sizes in the same mirror or RAID-5 group and, while providing the
given protection the whole time, allows you to utilize 100% of all the
disk capacities... well, what is that solution? Is it free?
Open source? Available on a general purpose OS? On commodity HW?
Available at all? :P


cyg> Now, if someone came up with any kind of credible rebuttal to
cyg> these assertions we could at least discuss it on technical
cyg> grounds.  But (and again you should consider this significant) no
cyg> one has:  all we have is well-reasoned analysis on the one hand
cyg> and some (often fairly obnoxious) fanboy babble on the other.  If
cyg> you step back, make the effort required to *understand* that
cyg> analysis, and try to look at the situation objectively, which do you find
cyg> more credible?

More credible to me is actual user experience than some theoretical
burbling. While I appreciate the analysis, and to some extent it's
valid, for me the most important thing remains actual experience.
Going endlessly over why ZFS is bad because you think fragmentation is
a big issue, while most actual users don't agree, is pointless imho.

Instead, try to do something positive and practical - in all the time
you've spent bashing ZFS, you could probably have already come up with
a basic proof-of-concept of a flawless RAID-5 in ZFS or a
fragmentation-free improvement, and once you'd proved it actually is
promising, everyone would love you and help you polish the code.

:)))))))


cyg> ZFS has other deficiencies, but they're more fundamental choices
cyg> involving poor trade-offs and lack of vision than outright (and
cyg> easily rectifiable) flaws, so they could more justifiably be
cyg> termed 'judgment calls' and I haven't delved as deeply into them.

And what are they? What are the alternatives on the market?
While ZFS is not perfect, and for some people the lack of user quotas
is a no-go with ZFS, for many others it just doesn't make sense to go
with NetApp, if only for economic reasons.

Whatever theoretical deficiencies you have in mind, I myself, and many
others, when confronted with ZFS in real-world environments, find it
most of the time much more flexible for managing storage than LVM,
XFS, UFS, VxVM/VxFS, or NetApp. Also more secure, etc. And quite often
with similar or even better performance. Then, thanks to zfs
send|recv, I get a really interesting backup option.
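A minimal sketch of what I mean (pool and host names are made up, and
I'm assuming a build with incremental zfs send):

# zfs snapshot tank/data@monday
# zfs send tank/data@monday | ssh backuphost zfs recv backup/data
# zfs snapshot tank/data@tuesday
# zfs send -i tank/data@monday tank/data@tuesday | ssh backuphost zfs recv backup/data

The incremental send only ships the blocks changed between the two
snapshots, which makes regular off-host backups cheap.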

What some people are also looking for, I guess, is a black-box
approach - an easy-to-use GUI on top of Solaris/ZFS/iSCSI/etc., so
they don't even have to know it's ZFS or Solaris. Well...


cyg> But they're the main reason I have no interest in 'working on'

Well, you're not using ZFS, you are not interested in working on it;
all you are interested in is finding some potential corner cases bad
for ZFS and bashing it. If you put even 10% of the energy you're
putting into your 'holy war' into providing some benchmarks
(filebench?) showing these corner cases in comparison to the other
mind-blowing solutions on the market which are so much better than
ZFS, we could all reproduce them and try to address ZFS's problems.

:))))


[...]
>>And I seldom disappoint myself in that respect.

Honestly, I believe you - no doubt about it.


cyg> You really haven't bothered to read much at all, have you.  I've
cyg> said, multiple times, that I came here initially in the hope of
cyg> learning something interesting.  More recently, I came here
cyg> because I offered a more balanced assessment of ZFS's strengths
cyg> and weaknesses in responding to the Yager article and wanted to
cyg> be sure that I had not treated ZFS unfairly in some way - which
cyg> started this extended interchange.  After that, I explained that
cyg> while the likelihood of learning anything technical here was
cyg> looking pretty poor, I didn't particularly like some of the
cyg> personal attacks that I'd been subject to and had decided to confront them.

Well, every time, from what I saw, it was you 'attacking' other people
first. And it's just your opinion that you offered a more balanced
assessment - an opinion not shared by many.

If you are not contributing here, and you are not learning here - why
are you here? I'm serious - why?
Wouldn't it serve you better to actually contribute to that other
project, where the developers actually get it - where no one is
personally attacking you, where no fundamentally bad choices were made
in the design, where RAID-5 is flawless, where the fragmentation
problem doesn't exist, nor any of the other corner cases; where
performance is the best on the market all the time, and I can run it
on commodity HW or so-called big iron, on a well-known general-purpose
OS. Well, I assume that project is open source too - maybe you'll
share that secret with all of us so we can join it too and forget
about ZFS? I'll be the first to "convert" and forget about ZFS. Of
course, as that secret project is so perfect, it probably doesn't make
sense to contribute as a developer, since there is nothing left to
contribute - but, hey, I'm not a developer - I'm a user, and I'm
definitely interested in that project.

cyg> No, my attitude is that people too stupid and/or too lazy to
cyg> understand what I *have* been delivering don't deserve much respect if
cyg> they complain.

Maybe you should think about that "stupid" part...
Maybe, just maybe, it's possible that all the people around you don't
understand you, that the world is wrong and we're all so stupid. Well,
maybe. Even if so, then perhaps it's time to stop being Don Quixote
and move on?






-- 
Best regards,
 Robert                            mailto:[EMAIL PROTECTED]
                                       http://milek.blogspot.com

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
