Re: [Xen-devel] [PATCH v2] mm/page_alloc: make bootscrub happen in idle-loop

Sergey Dyasli Thu, 08 Nov 2018 06:49:36 -0800

(CCing Roger)

On 08/11/2018 11:07, Andrew Cooper wrote:
> On 08/11/18 10:31, Jan Beulich wrote:
>>>>> On 07.11.18 at 19:20, <andrew.coop...@citrix.com> wrote:
>>> On 09/10/18 16:21, Sergey Dyasli wrote:
>>>> Scrubbing RAM during boot may take a long time on machines with lots
>>>> of RAM. Add 'idle' option to bootscrub which marks all pages dirty
>>>> initially so they will eventually be scrubbed in idle-loop on every
>>>> online CPU.
>>>>
>>>> It's guaranteed that the allocator will return scrubbed pages by doing
>>>> eager scrubbing during allocation (unless MEMF_no_scrub was provided).
>>>>
>>>> Use the new 'idle' option as the default one.
>>>>
>>>> Signed-off-by: Sergey Dyasli <sergey.dya...@citrix.com>
>>> This patch reliably breaks boot, although it's not immediately obvious how:
>>>
>>> (d9) (XEN) mcheck_poll: Machine check polling timer started.
>>> (d9) (XEN) xenoprof: Initialization failed. Intel processor family 6 model 60 is not supported
>>> (d9) (XEN) Dom0 has maximum 400 PIRQs
>>> (d9) (XEN) ----[ Xen-4.12-unstable  x86_64  debug=y   Not tainted ]----
>>> (d9) (XEN) CPU:    0
>>> (d9) (XEN) RIP:    e008:[<ffff82d080440ddb>] setup.c#cmdline_cook+0x1d/0x77
>>> (d9) (XEN) RFLAGS: 0000000000010282   CONTEXT: hypervisor
>>> (d9) (XEN) rax: ffff82d080406bdc   rbx: ffff8300c2c2c2c2   rcx: 0000000000000000
>>> (d9) (XEN) rdx: 00000007c7ffffff   rsi: ffff83000045c24b   rdi: ffff83000045c24b
>>> (d9) (XEN) rbp: ffff82d0804b7da8   rsp: ffff82d0804b7d98   r8:  ffff83003f057000
>>> (d9) (XEN) r9:  7fffffffffffffff   r10: 0000000000000000   r11: 0000000000000001
>>> (d9) (XEN) r12: ffff83003f0d8100   r13: 0000000000000000   r14: ffff82d0805f33d0
>>> (d9) (XEN) r15: 0000000000000002   cr0: 000000008005003b   cr4: 00000000001526e0
>>> (d9) (XEN) cr3: 000000003fea7000   cr2: ffff8300c2c2c2c2
>>> (d9) (XEN) fsb: 0000000000000000   gsb: 0000000000000000   gss: 0000000000000000
>>> (d9) (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
>>> (d9) (XEN) Xen code around <ffff82d080440ddb> (setup.c#cmdline_cook+0x1d/0x77):
>>> (d9) (XEN)  05 5e fc ff 48 0f 44 d8 <80> 3b 20 75 09 48 83 c3 01 80 3b 20 74 f7 80 3d
>>> (d9) (XEN) Xen stack trace from rsp=ffff82d0804b7d98:
>>> [...]
>>> (d9) (XEN) Xen call trace:
>>> (d9) (XEN)    [<ffff82d080440ddb>] setup.c#cmdline_cook+0x1d/0x77
>>> (d9) (XEN)    [<ffff82d080443b7f>] __start_xen+0x259c/0x292d
>>> (d9) (XEN)    [<ffff82d0802000f3>] __high_start+0x53/0x55
>> That's apparently the 2nd cmdline_cook() invocation, when producing
>> the Dom0 command line. I would suppose what "loader" points to has
>> been scrubbed by the time we get there (with synchronous scrubbing
>> APs wouldn't be able to get going with this before reaching
>> heap_init_late()).
> 
> This is via a PVH boot (like a lot of my development work), and does
> look to be a latent use-after-free.  Dropping the VM down to a single
> vcpu causes the problem to go away.
> 
> Sergey is kindly investigating.


Yes, this seems to be a bug in the Xen PVH boot path. From the serial log:

(XEN) == mbi->mods_addr 0x46dce0

which is marked as usable in e820:

(XEN) PVH-e820 RAM map:
(XEN)  0000000000000000 - 00000000000a0000 (usable)
(XEN)  0000000000100000 - 0000000040000400 (usable)
(XEN)  00000000fc000000 - 00000000fc009040 (ACPI data)
(XEN)  00000000feff8000 - 00000000feffc000 (reserved)
(XEN)  00000000feffc000 - 00000000feffd000 (usable)
(XEN)  00000000feffd000 - 00000000ff000000 (reserved)

This memory is then handed to the allocator and scrubbed by the secondary
CPUs while the boot path is still using it, hence the use-after-free. Even
with the cmdline issue fixed, another FATAL PAGE FAULT occurs further down
the boot path:

(d16) [183465.829440] (XEN) Xen call trace:
(d16) [183465.829467] (XEN)    [<ffff82d08023d6c5>] memcmp+0x9/0x3a
(d16) [183465.829494] (XEN)    [<ffff82d080436702>] bzimage.c#bzimage_check+0x32/0x71
(d16) [183465.829511] (XEN)    [<ffff82d080436806>] bzimage_parse+0x22/0xba
(d16) [183465.829528] (XEN)    [<ffff82d080431086>] dom0_build.c#pvh_load_kernel+0x82/0x3c0
(d16) [183465.829612] (XEN)    [<ffff82d0804316e0>] dom0_construct_pvh+0x1c9/0x11bf
(d16) [183465.829638] (XEN)    [<ffff82d0804387a6>] construct_dom0+0xd4/0xb0e
(d16) [183465.829655] (XEN)    [<ffff82d0804280cc>] __start_xen+0x2631/0x28b6
(d16) [183465.829682] (XEN)    [<ffff82d0802000f3>] __high_start+0x53/0x55
...
(XEN) Faulting linear address: ffff8f2c2d301202

Looking at mod[0].pa in the PVH start info, I suspect that the module it
points to also gets overwritten:

(XEN) PVH start info: (pa 0000ffc0)
(XEN)   version:    1
(XEN)   flags:      0
(XEN)   nr_modules: 1
(XEN)   modlist_pa: 000000000000ff70
(XEN)   cmdline_pa: 000000000000ff90
(XEN)   cmdline:    'console=xen,pv dom0=pvh xsm=flask'
(XEN)   rsdp_pa:    00000000fc009000
(XEN)     mod[0].pa:         00000000005b1000
(XEN)     mod[0].size:       0000000004784128
(XEN)     mod[0].cmdline_pa: 0000000000000000
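
As a rough sanity check of the two observations above, here is a small
standalone Python sketch (not Xen code) that tests whether an address range
lies inside a "usable" entry of the PVH-e820 map quoted earlier. The table is
copied from that map; the helper name in_usable_ram() is made up for
illustration, and reading mod[0].size as a decimal byte count is an
assumption about how the value is printed.

```python
# (start, end, usable) triples copied from the PVH-e820 RAM map in this mail.
PVH_E820 = [
    (0x0000000000000000, 0x00000000000a0000, True),   # usable
    (0x0000000000100000, 0x0000000040000400, True),   # usable
    (0x00000000fc000000, 0x00000000fc009040, False),  # ACPI data
    (0x00000000feff8000, 0x00000000feffc000, False),  # reserved
    (0x00000000feffc000, 0x00000000feffd000, True),   # usable
    (0x00000000feffd000, 0x00000000ff000000, False),  # reserved
]


def in_usable_ram(pa, size=1):
    """Return True if [pa, pa + size) lies entirely inside one usable range.

    Hypothetical helper for illustration only; it is not a Xen function.
    """
    return any(
        usable and start <= pa and pa + size <= end
        for start, end, usable in PVH_E820
    )


MODS_ADDR = 0x46DCE0   # mbi->mods_addr from the serial log
MOD0_PA = 0x5B1000     # mod[0].pa from the PVH start info
MOD0_SIZE = 4784128    # mod[0].size, ASSUMED to be decimal bytes
RSDP_PA = 0xFC009000   # rsdp_pa, expected to sit in ACPI data instead

if __name__ == "__main__":
    print("mods_addr in usable RAM:", in_usable_ram(MODS_ADDR))
    print("mod[0] in usable RAM:   ", in_usable_ram(MOD0_PA, MOD0_SIZE))
    print("rsdp in usable RAM:     ", in_usable_ram(RSDP_PA))
```

If both mods_addr and the mod[0] range come back as usable, nothing in the
memory map itself stops that memory from being handed to the allocator,
which would be consistent with the suspicion above.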

The issue is easily reproduced by running Xen as a PVH guest with the
following config:

type="pvh"

vcpus=2
memory=1024
nestedhvm=1

kernel="/root/xen-syms"
ramdisk="/boot/vmlinuz-4.4.0+10"
cmdline="console=xen,pv dom0=pvh xsm=flask"
