I have two questions about the ZFS code.

In dbuf_hold_impl():

        /*                     
         * If this buffer is currently syncing out, and we are
         * still referencing it from db_data, we need to make a copy
         * of it in case we decide we want to dirty it again in this txg.
         */                    
        if (db->db_level == 0 && db->db_blkid != DB_BONUS_BLKID &&
            dn->dn_object != DMU_META_DNODE_OBJECT &&
            db->db_state == DB_CACHED && db->db_data_pending) {
                dbuf_dirty_record_t *dr = db->db_data_pending;
               
                if (dr->dt.dl.dr_data == db->db_buf) {
                        arc_buf_contents_t type = DBUF_GET_BUFC_TYPE(db);
       
                        dbuf_set_data(db,
                            arc_buf_alloc(db->db_dnode->dn_objset->os_spa,
                            db->db.db_size, db, type));
                        bcopy(dr->dt.dl.dr_data->b_data, db->db.db_data,
                            db->db.db_size);
                }              
        }

This code copies the data into a new buffer if the current one is being
synced to disk. My question is: why not defer this copy to dbuf_dirty(),
which already checks db_data_pending and makes the copy?
There is a similar case in dbuf_sync_leaf():

        if (dn->dn_object != DMU_META_DNODE_OBJECT) {
                /*
                 * If this buffer is currently "in use" (i.e., there are
                 * active holds and db_data still references it), then make
                 * a copy before we start the write so that any modifications
                 * from the open txg will not leak into this write.
                 *
                 * NOTE: this copy does not need to be made for objects only
                 * modified in the syncing context (e.g. DNONE_DNODE blocks).
                 */
                if (refcount_count(&db->db_holds) > 1 &&
                    *datap == db->db_buf) {
                        arc_buf_contents_t type = DBUF_GET_BUFC_TYPE(db);
                        *datap = arc_buf_alloc(os->os_spa, blksz, db, type);
                        bcopy(db->db.db_data, (*datap)->b_data, blksz);
                }
        } else {

Here the data is also copied if there are extra holds on the syncing
dbuf. What I don't understand is why this copy is needed, since
dbuf_dirty() already seems to handle this scenario.
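
To make the alternative I have in mind concrete, here is a toy sketch of deferring the copy until a writer actually modifies the buffer. This is not ZFS code: toy_buf_t, toy_dbuf_t and the toy_* functions are names invented for illustration, and the sketch ignores locking and any holder that has already cached the data pointer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-ins, NOT the real ZFS structures. */
typedef struct toy_buf {
	size_t size;
	char *data;
} toy_buf_t;

typedef struct toy_dbuf {
	toy_buf_t *current;	/* plays the role of db_buf / db_data */
	toy_buf_t *pending;	/* plays the role of the data being synced */
} toy_dbuf_t;

static toy_buf_t *
toy_buf_alloc(size_t size)
{
	toy_buf_t *b = malloc(sizeof (*b));
	assert(b != NULL);
	b->size = size;
	b->data = calloc(1, size);
	assert(b->data != NULL);
	return (b);
}

/*
 * If the buffer being written out is still the one a writer would
 * modify, give the writer a private copy so the in-flight data
 * stays stable.
 */
static void
toy_split_if_syncing(toy_dbuf_t *db)
{
	if (db->pending != NULL && db->pending == db->current) {
		toy_buf_t *copy = toy_buf_alloc(db->current->size);
		memcpy(copy->data, db->pending->data, copy->size);
		db->current = copy;
	}
}

/* Modify the open view, making the copy only at this point. */
static void
toy_dirty(toy_dbuf_t *db, const char *src, size_t len)
{
	assert(len <= db->current->size);
	toy_split_if_syncing(db);
	memcpy(db->current->data, src, len);
}
```

In this model the syncing data is never touched by a later write, which is why I expected the copy at dirty time to be enough.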

A further question, perhaps related to the one above: why does ZFS need
to release the ARC buffer in dbuf_dirty()? The comment says this is
needed so that the buffer can be modified without affecting other users
of the cached data block. But I don't know of any call path that reaches
the ARC buffer other than through the DMU interface. Perhaps this is
needed to protect snapshots from reading the new data?

                } else if (db->db.db_object != DMU_META_DNODE_OBJECT) {
                        /*
                         * Release the data buffer from the cache so that we
                         * can modify it without impacting possible other users
                         * of this cached data block.  Note that indirect
                         * blocks and private objects are not released until the
                         * syncing state (since they are only modified then).
                         */
                        arc_release(db->db_buf, db);
                        dbuf_fix_old_data(db, tx->tx_txg);
                        data_old = db->db_buf;
                }
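
For what it's worth, this is how I currently read that comment, as a toy sketch. It is only my interpretation and not a description of what arc_release() really does: toy_cached_t and the toy_cached_* functions are invented names, and a plain reference count stands in for whatever the ARC uses to track sharing.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy model of a cached block that several users may share. */
typedef struct toy_cached {
	int refs;	/* number of users sharing this block */
	size_t size;
	char *data;
} toy_cached_t;

static toy_cached_t *
toy_cached_alloc(size_t size)
{
	toy_cached_t *c = malloc(sizeof (*c));
	assert(c != NULL);
	c->refs = 1;
	c->size = size;
	c->data = calloc(1, size);
	assert(c->data != NULL);
	return (c);
}

/* A second user starts sharing the same cached block. */
static toy_cached_t *
toy_cached_share(toy_cached_t *c)
{
	c->refs++;
	return (c);
}

/*
 * Detach the caller from a shared block before modifying it: if
 * anyone else still uses the block, hand back a private copy;
 * otherwise the caller already owns it exclusively.
 */
static toy_cached_t *
toy_cached_release(toy_cached_t *c)
{
	if (c->refs == 1)
		return (c);
	toy_cached_t *priv = toy_cached_alloc(c->size);
	memcpy(priv->data, c->data, c->size);
	c->refs--;
	return (priv);
}
```

If the only sharers were other DMU callers going through the same dbuf, I would not expect this step to be necessary, hence the question.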

Thanks,
Jay
--
This message posted from opensolaris.org
