On Wed, Sep 14, 2011 at 4:31 PM, Richard Elling wrote:
>> From watching the system try to import this pool, it looks like it is
>> still building a kernel structure in RAM when the system runs out of
>> RAM. It has not committed anything to disk.
> Did you experience a severe memory shortfall?
> (Do you know how to determine that condition?)
- T2000 with 32 GB RAM
- zpool that hangs the machine by running it out of kernel memory when
  trying to import it
- zpool has an "incomplete" snapshot from a zfs recv that it is trying
  to destroy on import
- I *can* import the zpool read-only
So the answer is yes to the severe memory shortfall. One of the
many things I did to instrument this system was as simple as running
vmstat 10 on the console :-) The last sample before the system hung
showed a scan rate of 900,000! In one case I watched as it hung (it
has done this many times as I have troubleshot with Oracle Support)
and did not see *any* user level processes that would account for the
memory shortfall. I have logs of system freemem showing the memory
exhaustion. Oracle Support has confirmed (from a core dump) that it is
some combination of the two bugs you mentioned (plus they created a
new Bug ID for this specific problem).
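
For anyone who wants to do the same kind of instrumentation from a saved
vmstat log rather than by watching the console, here is a minimal sketch.
It assumes Solaris-style vmstat output, where the column header row
contains "free" (free memory in KB) and "sr" (page scanner scan rate).
The function names and the default threshold of 1000 are hypothetical,
picked only for illustration; they are not a tuning recommendation.

```python
def parse_vmstat(lines):
    """Yield (free_kb, scan_rate) tuples from Solaris-style vmstat output.

    Columns are located by name from the header row, so the parser does
    not depend on how many disk columns a given machine prints.
    """
    free_idx = sr_idx = None
    for line in lines:
        fields = line.split()
        if "free" in fields and "sr" in fields:
            # Column header row: remember where the two columns live.
            free_idx = fields.index("free")
            sr_idx = fields.index("sr")
            continue
        if free_idx is None or len(fields) <= max(free_idx, sr_idx):
            # Group header ("kthr memory page ...") or a truncated line.
            continue
        try:
            yield int(fields[free_idx]), int(fields[sr_idx])
        except ValueError:
            # Non-numeric junk in a data position: skip the line.
            continue


def flag_pressure(samples, sr_threshold=1000):
    """Return the samples whose scan rate is at or above sr_threshold.

    sr_threshold is an assumed, illustrative cutoff.
    """
    return [(free, sr) for free, sr in samples if sr >= sr_threshold]
```

Feeding it the lines of a captured "vmstat 10" log and printing whatever
flag_pressure() returns gives a quick list of the intervals where the
page scanner was working hard, along with the free memory at the time.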
I have asked multiple times if the incomplete snapshot could be
corrupt in a way that would cause this (early on they led us to
believe the incomplete snapshot was 7 TB when it should be about 2.5
TB), but have not gotten anything substantive back (just a one-line
"The snapshot is not corrupt.").
What I am looking for is a way to estimate the kernel memory
necessary to destroy a given snapshot so that I can see if any of the
snapshots on my production server (M4000 with 16 GB) will run the
machine out of memory.
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Designer: Frankenstein, A New Musical
-> Sound Coordinator, Schenectady Light Opera Company (
-> Technical Advisor, RPI Players
zfs-discuss mailing list