So ... The way things presently are, ideally you would know in advance which
data you were planning to write that has duplicate copies. You could enable
dedup, write all the highly duplicated data, then turn dedup off and write all
the non-duplicate data. Obviously, though, that's a fairly implausible scenario
in practice.
In reality, while you're writing, duplicate blocks are going to be mixed in
with non-duplicate blocks, which fundamentally means the system needs to
calculate the checksums and enter them into the DDT even for the unique
blocks... simply because the first time the system sees each duplicate block,
it doesn't yet know that the block will be duplicated later.
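To make that concrete, here's a minimal sketch of a dedup-aware write path in Python. The names (`ddt`, `write_block`) and the dict-based table are illustrative assumptions, not actual ZFS internals; the point is just that every write, unique or not, pays for a checksum and a DDT entry:

```python
import hashlib
import time

# Hypothetical dedup table: checksum -> {"refcount": ..., "birth": ...}.
# In ZFS this is the on-disk/in-core DDT; here it's just a dict.
ddt = {}

def write_block(data: bytes) -> str:
    """Sketch of a dedup-enabled write: checksum every block and
    consult the DDT, because we can't know at write time whether an
    identical block will show up later."""
    cksum = hashlib.sha256(data).hexdigest()
    entry = ddt.get(cksum)
    if entry:
        # Duplicate: bump the refcount instead of writing new data.
        entry["refcount"] += 1
    else:
        # Unique (so far): still costs a full DDT entry.
        ddt[cksum] = {"refcount": 1, "birth": time.time()}
    return cksum
```

Even a pool full of one-of-a-kind blocks ends up with one DDT entry per block, which is exactly the memory overhead being discussed.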
But as you said, after data is written and sits around for a while, the
probability that a unique block will ever be duplicated diminishes over time.
So I would think the ideal situation would be to take your idea of un-dedup
for unique blocks and take it a step further: un-dedup unique blocks that are
older than some configurable threshold. Maybe you could have a command for a
sysadmin to run that scans the whole pool performing this operation, but it's
the kind of maintenance that really should be done upon access, too. Somebody
goes back and reads a jpg from last year; the system reads it, loads the DDT
entry as a consequence, discovers that the block is unique and has been for a
long time, and throws out the DDT info.
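A sketch of that "un-dedup on access" idea, again with entirely hypothetical names and structures (the dict-based `ddt` and the `max_age` tunable are my assumptions, not anything ZFS provides):

```python
import time

# Hypothetical dedup table: checksum -> {"refcount": ..., "birth": ...}.
ddt = {}

def maybe_undedup_on_read(cksum: str, now: float,
                          max_age: float = 365 * 86400) -> bool:
    """On read, drop the DDT entry for a block that is still unique
    (refcount == 1) and older than a configurable threshold.
    Returns True if the entry was evicted."""
    entry = ddt.get(cksum)
    if entry and entry["refcount"] == 1 and now - entry["birth"] > max_age:
        del ddt[cksum]
        return True
    return False
```

The check piggybacks on a read that's happening anyway, so old unique blocks gradually stop costing DDT memory without a separate pool-wide scan.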
But by talking about it, we're just chasing pipe dreams, because we all know
ZFS development is in a rough spot these days. Still, one can dream...
zfs-discuss mailing list