I've got a bit of a strange problem with snapshot sizes. First, some
background:

For ages our DBA backed up all the company databases to a directory NFS
mounted from a NetApp filer. That directory would then get dumped to tape.

About a year ago, I built an OpenSolaris (technically Nexenta) machine with 24
x 1.5TB drives, for about 24TB of usable space. I am using this to back up OS
images using BackupPC.

I was also backing up the DBA's backup volume from the NetApp to the (ZFS)
backup server. This is a combination of rsync + snapshots. The snapshots were
using about 50GB/day. The backup volume is about 600GB total, so this
wasn't bad, especially on a box with 24TB of space available.

I decided to cut out the middleman, and save some of that expensive NetApp
disk space, by having the DBA back up directly to the backup server. I
repointed the NFS mounts on our DB servers to point to the backup server
instead of the NetApp. Then I ran a simple cron job to snapshot that ZFS
filesystem daily.
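
For reference, the snapshot cron job is nothing fancy. A sketch (the 02:00 schedule here is assumed for the example; the dataset name matches my listings):

```
# Hypothetical crontab entry: daily snapshot, named by timestamp.
# Note that '%' is special in crontab and must be escaped as '\%'.
0 2 * * * /usr/sbin/zfs snapshot bpool/backups/oracle_backup@$(date +\%Y\%m\%d-\%H\%M\%S)
```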

My problem is that the snapshots started taking around 500GB instead of 50GB.
After a bit of thinking, I realized that the backup system my DBA was using
must have been writing new files and moving them into place, or possibly rewriting a whole file even when only part of it changed. I think this is the problem because ZFS never overwrites blocks in place; being copy-on-write, it allocates new blocks for every write. rsync, on the other hand, does a byte-by-byte comparison and only updates the blocks that have changed.
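
To illustrate what I mean (a toy demo on scratch files, not the actual backup data — the function name and files are made up): counting how many fixed-size blocks differ between two versions of a file shows how little an in-place, block-level update actually has to rewrite, compared to a tool that rewrites the whole file and forces ZFS to allocate every block anew.

```shell
#!/usr/bin/env bash
# Toy demo: count how many fixed-size blocks differ between two files.
# An in-place updater only rewrites these blocks; rewriting the whole
# file makes ZFS allocate fresh blocks for all of them.
count_changed_blocks() {
    f1=$1 f2=$2 bs=$3
    size=$(wc -c < "$f1")
    nblocks=$(( (size + bs - 1) / bs ))
    changed=0
    for (( i = 0; i < nblocks; i++ )); do
        # Compare block i of each file; cmp -s is silent, exit status only.
        cmp -s <(dd if="$f1" bs="$bs" skip="$i" count=1 2>/dev/null) \
               <(dd if="$f2" bs="$bs" skip="$i" count=1 2>/dev/null) \
            || changed=$((changed + 1))
    done
    echo "$changed"
}

# Four one-byte "blocks", only the third differs:
printf 'aaaa' > /tmp/old.dat
printf 'aaba' > /tmp/new.dat
count_changed_blocks /tmp/old.dat /tmp/new.dat 1   # prints 1
```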

Because it's easier to change what I'm doing than what my DBA does, I decided to put rsync back in place, but locally. I changed things so that the backups go to a staging FS and are then rsync'ed over to another FS that I take snapshots on. The only problem is that the snapshots are still in the 500GB range.

So, I need to figure out why these snapshots are taking so much more room than they were before.
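
One quick check I can think of (paths here are made up for illustration): compare a file's inode number across backup runs. An in-place update keeps the same inode, while a write-to-temp-then-rename gives the file a new one — and on ZFS the renamed-over copy means every block of the old file stays pinned by the snapshot.

```shell
# Demo with a scratch file: truncate-and-write keeps the inode,
# write-to-temp-then-rename changes it.
f=$(mktemp)
ino_before=$(ls -i "$f" | awk '{print $1}')

printf 'updated in place' > "$f"            # overwrite: same inode
ino_inplace=$(ls -i "$f" | awk '{print $1}')

tmp=$(mktemp)
printf 'whole new file' > "$tmp"
mv "$tmp" "$f"                              # rename over: new inode
ino_renamed=$(ls -i "$f" | awk '{print $1}')

echo "$ino_before $ino_inplace $ino_renamed"
```

Running `ls -i` on one of the Oracle dump files before and after a backup run should show the same thing on the real data.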

This, BTW, is the rsync command I'm using (and essentially the same command I was using when I was rsync'ing from the NetApp):

rsync -aPH --inplace --delete /staging/oracle_backup/ /backups/oracle_backup/
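
One thing I'd double-check here (my own guess, not something I've verified on this box): when both the source and destination are local paths, rsync enables --whole-file by default, which disables the delta-transfer algorithm entirely. In that mode, --inplace still rewrites every byte of any file that changed at all, and on a copy-on-write filesystem like ZFS that means all-new blocks. Forcing the delta algorithm back on would look like:

```
# Same command with the delta algorithm forced on for a local copy.
# (--no-whole-file as the fix is a guess, not tested here.)
rsync -aPH --inplace --no-whole-file --delete \
    /staging/oracle_backup/ /backups/oracle_backup/
```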



This is the old system (rsync'ing from a NetApp and taking snapshots):
zfs list -t snapshot -r bpool/snapback
NAME                                       USED  AVAIL  REFER  MOUNTPOINT
...
bpool/snapb...@20100310-182713            53.7G      -   868G  -
bpool/snapb...@20100312-000318            59.8G      -   860G  -
bpool/snapb...@20100312-182552            54.0G      -   840G  -
bpool/snapb...@20100313-184834            71.7G      -   884G  -
bpool/snapb...@20100314-123024            17.5G      -   832G  -
bpool/snapb...@20100315-173609            72.6G      -   891G  -
bpool/snapb...@20100316-165527            24.3G      -   851G  -
bpool/snapb...@20100317-171304            56.2G      -   884G  -
bpool/snapb...@20100318-170250            50.9G      -   865G  -
bpool/snapb...@20100319-181131            53.9G      -   874G  -
bpool/snapb...@20100320-183617            80.8G      -   902G  -
...



This is from the new system (backing up directly to one volume, rsync'ing to and snapshotting another one):

r...@backup02:~# zfs list -t snapshot -r bpool/backups/oracle_backup
NAME                                          USED  AVAIL  REFER  MOUNTPOINT
bpool/backups/oracle_bac...@20100411-023130   479G      -   681G  -
bpool/backups/oracle_bac...@20100411-104428   515G      -   721G  -
bpool/backups/oracle_bac...@20100412-144700      0      -   734G  -


Thanks for any help,

Paul
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
