Hello,

I believe the following is true; please correct me if it is not:

If more than one object references a block (e.g. two files have the same
block open), there must be multiple clones of the arc_buf_t (and associated
dmu_impl_t) records present, one for each of the objects. This is always so,
even if the block is not modified, "just in case the block should end up
being modified".
So: if 100 files access the same block in the same txg, there will be 100
clones of the data, even if none of the files ultimately modifies the block.
That seems a bit wasteful.

This does not feel like COW to me but rather "copy always, just in case", at
least in the arc/dmu realm.
I fail to see why the above scenario could not get by with a single, shared,
reference-counted record. A clone of a block would only have to be made if a
given file decides to modify that block. As it is, reference counting is
significantly complicated by mixing it with this pre-cloning.
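
To make the proposal concrete, here is a minimal sketch of what I mean by a
shared, reference-counted record that is cloned only on modification. All of
the names (shared_blk_t, blk_hold, blk_make_private, and so on) are made up
for illustration; this is not the arc_buf_t API, and it ignores locking
entirely.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type for illustration only; not the ZFS arc_buf_t. */
typedef struct shared_blk {
	unsigned	refcnt;		/* number of holders sharing this copy */
	size_t		size;		/* payload size in bytes */
	unsigned char	data[];		/* block contents */
} shared_blk_t;

/* Allocate a block with a single holder. Returns NULL if malloc fails. */
shared_blk_t *
blk_create(const void *src, size_t size)
{
	shared_blk_t *b = malloc(sizeof (*b) + size);

	if (b == NULL)
		return (NULL);
	b->refcnt = 1;
	b->size = size;
	memcpy(b->data, src, size);
	return (b);
}

/* A reader takes another reference; no data is copied. */
shared_blk_t *
blk_hold(shared_blk_t *b)
{
	b->refcnt++;
	return (b);
}

/* Drop a reference; the copy is freed when the last holder lets go. */
void
blk_rele(shared_blk_t *b)
{
	if (--b->refcnt == 0)
		free(b);
}

/*
 * Called by a holder that is about to write. A sole holder keeps writing
 * in place. Otherwise the caller gets a private clone and its reference
 * on the shared copy is dropped. On allocation failure NULL is returned
 * and the caller still holds the shared copy.
 */
shared_blk_t *
blk_make_private(shared_blk_t *b)
{
	shared_blk_t *c;

	if (b->refcnt == 1)
		return (b);
	c = blk_create(b->data, b->size);
	if (c == NULL)
		return (NULL);
	b->refcnt--;
	return (c);
}
```

With this scheme, 100 readers of the same block would cost 100 calls to
blk_hold() and one copy of the data; only a file that actually writes would
pay for a clone via blk_make_private().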

On to some code comprehension questions:

It seems to me that the conceptual model of a file in the dmu layer is this:
a number of dmu buffers hanging off a dnode (i.e. the per-dnode list formed
via the db_link "list enabler"). Not all blocks of the file are in this list,
only the "active" ones. I take "active" to mean "recently accessed".
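
In other words, my mental picture is roughly the following sketch. The toy_*
names are invented for illustration, and a plain singly linked list stands in
for whatever the db_link linkage really is; the real structures surely carry
far more state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a dmu buffer; not the real definition. */
typedef struct toy_dbuf {
	uint64_t	blkid;		/* which block of the file this is */
	struct toy_dbuf	*link_next;	/* stand-in for the db_link linkage */
} toy_dbuf_t;

/* Hypothetical stand-in for a dnode. */
typedef struct toy_dnode {
	toy_dbuf_t	*active;	/* head of the list of active buffers */
} toy_dnode_t;

/* Put a buffer on the dnode's active list. */
void
toy_dnode_insert(toy_dnode_t *dn, toy_dbuf_t *db)
{
	db->link_next = dn->active;
	dn->active = db;
}

/*
 * Look up a block among the active buffers. NULL means the block is not
 * on the list, i.e. not "active", even though it may exist in the file.
 */
toy_dbuf_t *
toy_dnode_find(const toy_dnode_t *dn, uint64_t blkid)
{
	toy_dbuf_t *db;

	for (db = dn->active; db != NULL; db = db->link_next) {
		if (db->blkid == blkid)
			return (db);
	}
	return (NULL);
}
```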

There is a somewhat opaque aspect of the dmu that is missing from the
otherwise excellent data structure chart: dirty buffer management.

db_data_pending?  db_last_dirty?  db_dirtycnt?  Could someone provide the
10,000-foot overview of dirty buffers?


The dbuf_states are a bit of a mystery: 

What is the difference between "DB_READ" and "DB_FILL"? 

My guess is that the data comes into the cache from a different direction:
From below: read from disk (maybe).
From above: nascent data coming from an application (newly created data?).

I am guessing DB_NOFILL is a short-circuit path to throw obsoleted data away.
It would be nice to comment the states (beyond an unexplained state
transition diagram).

ZFS would be more approachable to newcomers if the code were a bit more
commented.
I am not talking about copious comments, just a comment on every field in the
major data structures and, at minimum, a one-liner per function saying what
the function does.

Yes, given enough perseverance and a lot of time, one can figure everything
out from studying the usage patterns, but the pain of this could be lessened.

The more people understand ZFS, the stronger it will become.
-- 
This message posted from opensolaris.org
