And therein lies the issue. The excessive load that causes the IO issues is
almost always generated locally, either by a scrub or by a recursive "ls"
used to warm up the SSD-based zpool cache with metadata. The regular network
IO to the box is minimal and very read-centric; once we load the box up with
archived data (which generally happens in a short amount of time), we simply
serve it out as needed.
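
For what it's worth, the cache warm-up is nothing more elaborate than a
recursive walk of the pool so that the metadata ends up in the ARC/L2ARC,
something roughly like the following (the dataset path is purely
illustrative):

  ls -lR /tank/archive > /dev/null 2>&1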

As far as queueing goes, I would expect the system to queue bursts of IO in
memory with appropriate timeouts, as required. These timeouts could be
adjusted either manually or automagically to deal with the slower storage
hardware. Obviously, sustained intense IO would eventually blow out the
queue, so the goal here is to avoid creating those situations in the first
place. We can throttle the network IO if needed; what I need is for the OS to
know its own local IO limits and not attempt to overwork itself during
scrubs and the like.
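
Incidentally, some OpenSolaris/illumos builds already expose a crude form of
this as scrub throttling tunables. Whether they exist depends on the build
you're running, so treat the names and values below as assumptions rather
than a recommendation. In /etc/system (takes effect at the next boot):

  set zfs:zfs_scrub_delay = 4
  set zfs:zfs_scan_idle = 50

or poked live with mdb:

  echo "zfs_scrub_delay/W0t4" | mdb -kw
  echo "zfs_scan_idle/W0t50" | mdb -kw

zfs_scrub_delay injects a few clock ticks of delay per scrub IO when the pool
is seeing other activity, and zfs_scan_idle sets the idle window that decides
whether that delay kicks in. That is exactly the sort of self-limiting I'd
like the OS to handle on its own.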