On 07/09/12 04:36 PM, Ian Collins wrote:
On 07/10/12 05:26 AM, Brian Wilson wrote:
Yep, thanks, and to answer Ian with more detail on what TrueCopy does. TrueCopy mirrors between the two storage arrays, with software running on the arrays, and keeps a list of dirty/changed 'tracks' while the mirror is split. (I think HDS calls them something other than 'tracks', but whatever.) When it resyncs the mirrors it sets the target luns read-only (which is why I export the zpools first), and the source array reads the changed tracks and writes them across dedicated mirror ports and fibre links to the target array's dedicated mirror ports, which brings the target luns back up to synchronized. So yes, as Richard says, there is IO, but it's isolated to the arrays, and it's scheduled at a lower priority on the source array than production traffic. For example, it can take an hour or more to re-synchronize a particularly busy 250 GB lun. (You can do more than one at a time without it taking longer or impacting production any further, unless you choke the mirror links, which we do our best not to do.) That lower priority, the dedicated ports on the arrays, and so on all make the impact on the production luns as unnoticeable as I can make it in my environment.
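
(For anyone following along, the export/resync/split cycle described above could be scripted roughly like the sketch below. It is only a sketch, not the actual production script: the zpool and copy-group names are made up, and it assumes Hitachi CCI (RAID Manager) is installed so that pairresync, pairdisplay, and pairsplit are on the PATH with a HORCM instance already configured.)

```python
#!/usr/bin/env python3
"""Minimal sketch of the TrueCopy resync cycle described above.
Assumes Hitachi CCI is installed and a HORCM copy group exists;
the pool and group names here are hypothetical."""

import subprocess
import time

POOL = "backuppool"        # hypothetical zpool living on the target luns
COPY_GROUP = "voyager_bk"  # hypothetical TrueCopy copy group

def run(*cmd):
    """Run a command and raise if it fails."""
    subprocess.run(cmd, check=True)

def group_status():
    """Return the raw pairdisplay output for the copy group."""
    result = subprocess.run(["pairdisplay", "-g", COPY_GROUP],
                            capture_output=True, text=True, check=True)
    return result.stdout

# 1. Export the zpool first: the target luns go read-only during resync.
run("zpool", "export", POOL)

# 2. Start the resync; the source array ships only the dirty tracks
#    across its dedicated mirror ports, at lower priority than production.
run("pairresync", "-g", COPY_GROUP)

# 3. Poll until no lun in the group still reports COPY (i.e. all are PAIR).
while "COPY" in group_status():
    time.sleep(60)

# 4. Split the mirror again so the target luns become writable, then
#    bring the zpool back up on the backup side.
run("pairsplit", "-g", COPY_GROUP)
run("zpool", "import", POOL)
```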

Thank you for the background on TrueCopy. Reading the above, it looks like you can have a pretty long time without a true copy! I guess my view on replication is that you are always going to have X number of I/O operations, and how dense they are depends on how up to date you want your copy to be.

What I still don't understand is why a service interruption is preferable to a wee bit more I/O?


Sorry for the delayed answer. In this case it's less a matter of how much IO than of where the IO is. One thing I should mention is that during normal operation of TrueCopy, the mirroring is synchronous - meaning the remote mirror array acknowledges every write before it's acknowledged to the host (battery-backed cache keeps this from hurting performance).
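
(To make that ordering concrete, here is a toy model of the synchronous write path - purely illustrative, nothing like actual array firmware. The point is only the sequencing: the host's ack waits on the remote array.)

```python
class Array:
    """Toy stand-in for a storage array's battery-backed write cache."""
    def __init__(self, name):
        self.name = name
        self.cache = []

    def cache_write(self, block):
        self.cache.append(block)  # lands in cache; destaged to disk later

def sync_write(block, local, remote):
    """Synchronous mirroring: the host only gets its ack after BOTH
    arrays have the write in cache. The latency cost is the round trip
    to the remote array, hidden by the cache rather than by disk."""
    local.cache_write(block)
    remote.cache_write(block)  # remote must acknowledge first...
    return "ack to host"       # ...then the host sees the write complete

primary = Array("source")
mirror = Array("target")
print(sync_write(b"redo-log-record", primary, mirror))
```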

First, in this case the amount of nightly IO unfortunately isn't a 'wee bit': the large database files that have to get backed up every night via TSM tie up a network connection for several hours. Second, the application doesn't support hot backup. The Oracle database does, sure, but the application itself extensively uses and maintains 'keyword index' files external to the database, and those require a full application shutdown for a consistent backup.

So this is where taking a snapshot (in my case using array-to-array mirroring to do so) shrinks the nightly backup outage from the hours it takes the backup to complete over the network down to a matter of minutes. While it is still an outage, it's a very short one compared to the options the application gives me. (FYI - the application is Ex Libris Group's Voyager software for libraries - I'm the primary admin for it for almost all campus libraries in Wisconsin.)
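
(The short-outage sequence that implies might look like the sketch below - assumed sequencing, not the actual production job. The SMF service name is made up, pairsplit is the Hitachi CCI split command as before, and the TSM backup itself is only hinted at.)

```python
#!/usr/bin/env python3
"""Sketch of the short nightly outage described above. The SMF FMRI
and copy-group name are hypothetical."""

import subprocess

COPY_GROUP = "voyager_bk"  # same hypothetical copy group as before

def run(*cmd):
    subprocess.run(cmd, check=True)

# --- Outage window: minutes, not hours --------------------------------
run("svcadm", "disable", "-s", "site/voyager")  # hypothetical FMRI; a
                                                # full stop flushes the
                                                # keyword index files
run("pairsplit", "-g", COPY_GROUP)  # freeze a consistent point-in-time
                                    # copy on the target array
run("svcadm", "enable", "site/voyager")
# --- Production is back up; the outage is over ------------------------

# The hours-long TSM backup now runs against the split copy from a
# backup host, touching neither the production luns nor their network.
```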

Cheers,
Brian

--
-----------------------------------------------------------------------------------
Brian Wilson, Solaris SE, UW-Madison DoIT
Room 3114 CS&S            608-263-8047
brian.wilson(a)doit.wisc.edu
'I try to save a life a day. Usually it's my own.' - John Crichton
-----------------------------------------------------------------------------------
