On 14/02/18 07:54, Juergen Gross wrote:
> On 13/02/18 20:45, Andrew Cooper wrote:
>> The current XPTI implementation isolates the directmap (and therefore a lot
>> guest data), but a large quantity of CPU0's state (including its stack)
>> remains visible.
>> Furthermore, an attacker able to read .text is in a vastly superior position
>> to normal when it comes to fingerprinting Xen for known vulnerabilities, or
>> scanning for ROP/Spectre gadgets.
>> Collect together the entrypoints in .text.entry (currently 3x4k frames, but
>> can almost certainly be slimmed down), and create a common mapping which is
>> inserted into each per-cpu shadow. The stubs are also inserted into this
>> mapping by pointing at the in-use L2. This allows stubs allocated later (SMP
>> boot, or CPU hotplug) to work without further changes to the common mappings.
>> Signed-off-by: Andrew Cooper <andrew.coop...@citrix.com>
>> CC: Jan Beulich <jbeul...@suse.com>
>> CC: Wei Liu <wei.l...@citrix.com>
>> CC: Juergen Gross <jgr...@suse.com>
>> RFC, because I don't think the stubs handling is particularly sensible.
>> We allocate 4k of virtual address space per CPU, but squash loads of CPUs
>> together onto a single MFN. The stubs ought to be isolated as well (as they
>> leak the virtual addresses of each stack), which can be done by allocating an
>> MFN per CPU (and simplifies cpu_smpboot_alloc() somewhat). At this point, we
>> can't use a common set of mappings, and will have to clone the single stub
>> .entry.text into each PCPUs copy of the pagetables.
>> Also, my plan to cause .text.entry to straddle a 512TB boundary (and
>> avoid any further pagetable allocations) has come a little unstuck because of
>> CONFIG_BIGMEM. I'm still working out whether there is a sensible way to
>> rearrange the virtual layout for this plan to work.
>> xen/arch/x86/smpboot.c | 37
>> xen/arch/x86/x86_64/compat/entry.S | 2 ++
>> xen/arch/x86/x86_64/entry.S | 4 +++-
>> xen/arch/x86/xen.lds.S | 7 +++++++
>> 4 files changed, 44 insertions(+), 6 deletions(-)
>> diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
>> index 2ebef03..2519141 100644
>> --- a/xen/arch/x86/smpboot.c
>> +++ b/xen/arch/x86/smpboot.c
>> @@ -622,6 +622,9 @@ unsigned long alloc_stub_page(unsigned int cpu, unsigned
>> long *mfn)
>> unmap_domain_page(memset(__map_domain_page(pg), 0xcc, PAGE_SIZE));
>> + /* Confirm that all stubs fit in a single L2 pagetable. */
>> + BUILD_BUG_ON(NR_CPUS * PAGE_SIZE > (1u << L2_PAGETABLE_SHIFT));
> So we limit NR_CPUS to be max 512 now?
Not intentionally. The PAGE_SIZE should be dropped. (One full L2
pagetable allows us to map 512*512 pages).
> Maybe you should use STUB_BUF_SIZE instead of PAGE_SIZE?
No - that would be incorrect because of the physical packing of stubs
> BTW: Is there any reason we don't use a common stub page mapped to each
> per-cpu stack area? The stack address can easily be obtained via %rip
> relative addressing then (see my patch for the per-vcpu stacks:
> https://lists.xen.org/archives/html/xen-devel/2018-02/msg00773.html )
I don't understand what you are asking here. We cannot access the
per-cpu area with plain rip-retaliative addressing without using gs base
(and we really don't want to go down that route), or without per-cpu
pagetables (which would have to be a compile time choice).
As for why the per-cpu areas aren't mapped, that's because they aren't
needed at the moment. Any decision to change this needs to weigh the
utility of mapping the areas vs the additional data leakage, which is
Xen-devel mailing list