Hi Wei,
Thank you very much for your comments. Please see my replies below.
On 7/17/2017 9:16 PM, Wei Liu wrote:
Hi Kai
Thanks for this nice write-up.
Some comments and questions below.
On Sun, Jul 09, 2017 at 08:03:10PM +1200, Kai Huang wrote:
Hi all,
[...]
2. SGX Virtualization Design
2.1 High Level Toolstack Changes:
2.1.1 New 'epc' parameter
EPC is a limited resource. In order to use EPC efficiently among all domains,
when creating a guest, the administrator should be able to specify the domain's
virtual EPC size. The admin should also be able to get all domains' virtual EPC
sizes.
For this purpose, a new 'epc = <size>' parameter is added to the XL
configuration file. This parameter specifies the guest's virtual EPC size. The
EPC base address will be calculated by the toolstack internally, according to
the guest's memory size, MMIO size, etc. 'epc' is in units of MB, and any
1MB-aligned value will be accepted.
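As a rough illustration of the size rule above, a toolstack-side check could look like the sketch below. The helper name parse_epc_mb and the constants are hypothetical and not taken from the RFC patches; the only facts used are that 'epc' is given in MB and that EPC is managed in 4 KiB pages.

```python
EPC_PAGE_SIZE = 4096      # SGX manages EPC in 4 KiB pages
MB = 1024 * 1024


def parse_epc_mb(value):
    """Hypothetical helper: validate an 'epc = <size>' value given in MB.

    Returns (size_in_bytes, number_of_epc_pages).
    """
    size_mb = int(value)  # raises ValueError for input such as "1.5" or "abc"
    if size_mb <= 0:
        raise ValueError("epc size must be a positive number of MB")
    size_bytes = size_mb * MB
    # A whole number of MB is always a whole number of 4 KiB pages.
    return size_bytes, size_bytes // EPC_PAGE_SIZE


if __name__ == "__main__":
    print(parse_epc_mb("32"))
```

Because the unit is MB, any accepted value is 1MB-aligned by construction, so no separate alignment check is needed in this sketch.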
2.1.2 New XL commands (?)
The administrator should be able to get the physical EPC size and all domains'
virtual EPC sizes. For this purpose, we can introduce two additional commands:
# xl sgxinfo
Which will print out physical EPC size, and other SGX info (such as SGX1, SGX2,
etc) if necessary.
# xl sgxlist <did>
Which will print out a particular domain's virtual EPC size, or list the
virtual EPC sizes of all supported domains.
Alternatively, we can also extend the existing XL commands by adding a new
option:
# xl info -sgx
Which will print out physical EPC size along with other physinfo. And
# xl list <did> -sgx
Which will print out domain's virtual EPC size.
Comments?
Can a guest have multiple EPC? If so, the proposed parameter is not good
enough.
According to the SDM a machine may have multiple EPC, but "may have"
doesn't mean "must have". EPC is typically reserved by the BIOS as
Processor Reserved Memory (PRM), and in my understanding, a client machine
doesn't need to have multiple EPC. Currently, I don't see why we need
to expose multiple EPC to a guest. Even if the physical machine reports
multiple EPC, exposing one EPC to the guest is enough. For now, SGX should not
be supported together with virtual NUMA for a single domain.
Can a guest with EPC enabled be migrated? The answer to this question
can lead to multiple other questions.
See the last section of my design. I saw you've already seen it. :)
Another question, is EPC going to be backed by normal memory? This is
related to memory accounting of the guest.
Although the SDM only says that EPC is typically allocated by the BIOS as PRM,
I think we can just treat EPC as PRM, so I believe yes, physically EPC is
backed by normal memory. However, EPC is reported as reserved memory in the
e820 table.
Is EPC going to be modeled as a device or another type of memory? This
is related to how we manage it in the toolstack.
I think we'd better treat EPC as another type of memory. I am not
sure whether it should be modeled as a device, as on a real machine EPC is
also exposed in the ACPI table via the "INT0E0C" device under \_SB (however, it
is certainly not modeled as a PCIe device).
Finally why do you not allow the users to specify the base address?
I don't see any reason why the user needs to specify the base address. If we
allowed it, what address would they specify? On a real machine, the BIOS sets
the base address, and for a VM, I think the toolstack/Xen should do this.
In my RFC patches I didn't implement the commands as I don't know which
is better. In the github repo I mentioned at the beginning, there's an old
branch in which I implemented 'xl sgxinfo' and 'xl sgxlist', but they are
implemented via a dedicated hypercall for SGX. I am not sure whether that is a
good option, so I didn't include them in my RFC patches.
2.1.3 Notify domain's virtual EPC base and size to Xen
Xen needs to know guest's EPC base and size in order to populate EPC pages for
it. Toolstack notifies EPC base and size to Xen via XEN_DOMCTL_set_cpuid.
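To make the CPUID-based notification a bit more concrete, here is a minimal sketch of how an EPC base and size could be packed into the four registers of CPUID leaf 0x12, sub-leaf 2, following my reading of the layout in the SDM (EAX/EBX carry the base, ECX/EDX carry the size, and the low 4 bits of EAX and ECX carry the sub-leaf type and section properties). The function names are hypothetical, and whether the RFC patches pass the values to XEN_DOMCTL_set_cpuid in exactly this form is an assumption.

```python
LOW_MASK = 0xFFFFF000    # bits 31:12 of an address or size
HIGH_MASK = 0xFFFFF      # bits 51:32, stored in register bits 19:0
EPC_SECTION_VALID = 0x1  # EAX[3:0] = 1: sub-leaf enumerates an EPC section
EPC_SECTION_PROPS = 0x1  # ECX[3:0] = 1: assumed property encoding


def encode_epc_cpuid(base, size):
    """Pack EPC base/size into (eax, ebx, ecx, edx) for CPUID.0x12, sub-leaf 2."""
    if base < 0 or size <= 0 or base % 4096 or size % 4096:
        raise ValueError("EPC base and size must be 4 KiB aligned, size > 0")
    if base >> 52 or size >> 52:
        raise ValueError("EPC base and size must fit in 52 bits")
    eax = (base & LOW_MASK) | EPC_SECTION_VALID
    ebx = (base >> 32) & HIGH_MASK
    ecx = (size & LOW_MASK) | EPC_SECTION_PROPS
    edx = (size >> 32) & HIGH_MASK
    return eax, ebx, ecx, edx


def decode_epc_cpuid(eax, ebx, ecx, edx):
    """Inverse of encode_epc_cpuid: recover (base, size) from the registers."""
    base = (eax & LOW_MASK) | ((ebx & HIGH_MASK) << 32)
    size = (ecx & LOW_MASK) | ((edx & HIGH_MASK) << 32)
    return base, size


if __name__ == "__main__":
    regs = encode_epc_cpuid(0x100000000, 32 * 1024 * 1024)
    print([hex(r) for r in regs], decode_epc_cpuid(*regs))
```

With this encoding the hypervisor side can recover the guest's EPC range from the CPUID policy alone, which is presumably why no separate interface for base and size is needed.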
2.1.4 Launch Control Support (?)
[...]
But maybe integrating EPC to MM framework is more reasonable. Comments?
2.2.2 EPC Virtualization (?)
This part is how to populate EPC for guests. We have 3 choices:
- Static Partitioning
- Oversubscription
- Ballooning
IMHO static partitioning is good enough as a starting point.
Ballooning is nice to have but please don't make it mandatory. Not all
guests have balloon driver -- imagine a unikernel style secure domain
running with EPC.
That's a good point. Thanks.
2.3 Additional Point: Live Migration, Snapshot Support (?)
Oh, here it is. Nice.
Actually from hardware's point of view, SGX is not migratable. There are two
reasons:
- The SGX key architecture cannot be virtualized.
  Some keys are bound to the CPU, for example the Sealing key, the EREPORT
  key, etc. If a VM is migrated to another machine, the same enclave will
  derive different keys. Taking the Sealing key as an example: it is typically
  used by an enclave (which obtains it via EGETKEY) to *seal* its secrets to
  the outside (e.g. persistent storage) for later use. If the Sealing key
  changes after VM migration, the enclave can never get the sealed secrets
  back, as the key has changed and the old Sealing key cannot be recovered.
- There is no ENCLS leaf that evicts an EPC page to normal memory while, at
  the same time, still keeping its content in EPC. Currently, once an EPC page
  is evicted, the EPC page becomes invalid. So technically, we are unable to
  implement live migration (or checkpointing, or snapshot) for an enclave.
However, with some workarounds, and given some properties of the existing SGX
drivers, we are technically able to support live migration (or even
checkpointing and snapshot). This is because:
- Changing keys (which are bound to the CPU) is not a problem in reality.
  Take the Sealing key as an example. Losing sealed data is not a problem,
  because the Sealing key is only supposed to encrypt secrets that can be
  provisioned again. The typical working model is that the enclave gets
  secrets provisioned from a remote party (the service provider) and uses the
  Sealing key to store them for later use. When the enclave tries to *unseal*
  with the Sealing key and the key has changed, the enclave will find the data
  corrupted (integrity check failure), so it will ask for the secrets to be
  provisioned again from the remote party.
  Another reason is that in a data center, VMs typically share lots of data,
  and as the Sealing key is bound to the CPU, data encrypted by one enclave on
  one machine cannot be shared with another enclave on another machine. So
  from the SGX app writer's point of view, the developer should treat the
  Sealing key as a changeable key and should handle the loss of sealed data
  anyway. The Sealing key should only be used to seal secrets that can easily
  be provisioned again.
  For other keys such as the EREPORT key and the provisioning key, which are
  used for local attestation and remote attestation, losing them is not a
  problem either, due to the second reason below.
- Sudden loss of EPC is not a problem.
  On hardware, EPC is lost if the system goes to S3-S5, is reset, or is shut
  down, and the SGX driver needs to handle loss of EPC due to power
  transitions. This is done through cooperation between the SGX driver and the
  userspace SGX SDK/apps.
  However, during live migration there may be no power transition in the
  guest, so there may be no EPC loss during live migration. And technically we
  cannot *really* live migrate an enclave (explained above), so it looks
  infeasible. But the fact is that both the Linux SGX driver and the Windows
  SGX driver already support *sudden* loss of EPC (not only EPC loss during a
  power transition), which means both drivers are able to recover if EPC is
  lost at any point at runtime. With this, we are technically able to support
  live migration by simply ignoring EPC. After the VM is migrated, the
  destination VM will only suffer a *sudden* loss of EPC, which both the
  Windows SGX driver and the Linux SGX driver are already able to handle.
  But we must point out that such *sudden* loss of EPC is not hardware
  behavior, and SGX drivers for other OSes (such as FreeBSD) may not implement
  this, so for those guests the destination VM will behave in an unexpected
  manner. But I am not sure we need to care about other OSes.
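The unseal-or-reprovision flow described in the first point above can be sketched as follows. This is only a toy model under loud assumptions: the machine-bound Sealing key is simulated by a byte string, sealing is simulated with an HMAC tag (integrity only, no encryption), and seal, unseal, and load_secret are hypothetical names. A real enclave would obtain the key via EGETKEY and use authenticated encryption inside the enclave.

```python
import hashlib
import hmac

TAG_LEN = 32  # length of a SHA-256 HMAC tag


def seal(sealing_key, secret):
    """Toy sealing: prepend an HMAC tag keyed by the machine-bound key."""
    tag = hmac.new(sealing_key, secret, hashlib.sha256).digest()
    return tag + secret


def unseal(sealing_key, blob):
    """Return the secret, or raise ValueError if the integrity check fails."""
    tag, secret = blob[:TAG_LEN], blob[TAG_LEN:]
    expected = hmac.new(sealing_key, secret, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("integrity check failed: sealing key changed?")
    return secret


def load_secret(sealing_key, blob, provision):
    """Try to unseal; if the key has changed, provision again and re-seal.

    'provision' stands in for fetching the secret from the remote service
    provider. Returns (secret, blob_to_store).
    """
    try:
        return unseal(sealing_key, blob), blob
    except ValueError:
        secret = provision()
        return secret, seal(sealing_key, secret)


if __name__ == "__main__":
    blob = seal(b"key-on-host-A", b"api-token")
    # After migration the enclave derives a different key on host B.
    secret, blob = load_secret(b"key-on-host-B", blob, lambda: b"api-token")
    print(secret)
```

The point of the sketch is that an application written this way does not care whether the key changed because of a platform change or because of a VM migration; both end up in the same re-provisioning path.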
Presumably it wouldn't be too hard for FreeBSD to replicate the
behaviour of Linux and Windows.
The problem is that this is not hardware behavior. If the FreeBSD guys just
look at the SDM then they may not expect such sudden loss of EPC. But I guess
maybe they will just port the existing driver. :)
For the same reason, we are able to support checkpointing for SGX guests (only
Linux and Windows).
For snapshot, we can support snapshotting an SGX guest by either:
- Suspending the guest before the snapshot (S3-S5). This works for all guests
  but requires the user to manually suspend the guest.
- Issuing a hypercall to destroy the guest's EPC in save_vm. This only works
  for Linux and Windows but doesn't require user intervention.
What are your comments?
IMHO it is of course good to have migration and snapshot support for
such guests.
Thanks. I have no problem supporting migration and snapshot if no one
opposes.
Thanks,
-Kai
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel