Hi Stefano,

On 30/11/16 21:53, Stefano Stabellini wrote:
On Mon, 28 Nov 2016, Julien Grall wrote:
If not, then it might be worth considering a 3rd solution where the TEE SMC
calls are forwarded to a specific domain handling the SMC on behalf of the
guests. This would allow upgrading the TEE layer without having to upgrade
the hypervisor.
Yes, this is a good idea. How could this look? I imagine the following flow:
the hypervisor traps the SMC and uses an event channel to pass the request
to Dom0. Some userspace daemon receives it, maps the pages with the request
data, alters it (e.g. by replacing IPAs with PAs), sends a request to the
hypervisor to issue the real SMC, then alters the response and only then
returns the data back to the guest.


The event channel is only a way to notify (similar to an interrupt); you would
need a shared memory page between the hypervisor and the client to communicate
the SMC information.
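
A minimal sketch of what such a shared page could carry, assuming a single
request slot per page. All names here (smc_shared_page, smc_post_request,
smc_post_response) are hypothetical and do not exist in Xen; the event channel
notification itself is left out, as are the memory barriers a real
implementation would need.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical layout of one page shared between the hypervisor and the
 * domain handling the SMC. None of these names exist in Xen. */
#define SMC_NR_REGS 8

enum smc_slot_state {
    SMC_SLOT_FREE = 0,     /* nothing pending */
    SMC_SLOT_REQUEST = 1,  /* hypervisor wrote a trapped SMC */
    SMC_SLOT_RESPONSE = 2, /* handler wrote the result */
};

struct smc_shared_page {
    uint32_t state;             /* one of enum smc_slot_state */
    uint16_t domid;             /* guest that issued the SMC */
    uint64_t regs[SMC_NR_REGS]; /* x0-x7 on entry, results on return */
};

/* Hypervisor side: copy the trapped registers into the page.
 * Returns 0 on success, -1 if the slot is still in use. The caller would
 * then send the event channel notification. */
int smc_post_request(struct smc_shared_page *p, uint16_t domid,
                     const uint64_t regs[SMC_NR_REGS])
{
    if (p->state != SMC_SLOT_FREE)
        return -1;
    p->domid = domid;
    memcpy(p->regs, regs, sizeof(p->regs));
    p->state = SMC_SLOT_REQUEST;
    return 0;
}

/* Handler side: write the result registers back into the page.
 * Returns 0 on success, -1 if there is no pending request. */
int smc_post_response(struct smc_shared_page *p,
                      const uint64_t regs[SMC_NR_REGS])
{
    if (p->state != SMC_SLOT_REQUEST)
        return -1;
    memcpy(p->regs, regs, sizeof(p->regs));
    p->state = SMC_SLOT_RESPONSE;
    return 0;
}
```

The state field is what lets the event channel stay a pure notification: the
receiver wakes up, looks at the slot state and finds the data in the page.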

I was thinking of taking advantage of the VM event API for trapping the SMC.
But I am not sure it is the best solution here. Stefano, do you have any
opinion on this?

I can see only one benefit there: this code will not be in the
hypervisor. And there are a number of drawbacks:

Stability: if this userspace daemon crashes or gets killed by, say,
the OOM killer, we will lose information about all opened sessions, mapped
shared buffers, etc. That would be a complete disaster.

I disagree with your statement: you would gain in isolation. If your userspace
crashes (because of an emulation bug), you will only lose access to the TEE for
a bit. If the hypervisor crashes (because of an emulation bug), then you take
down the platform. I agree that you lose information when the userspace app
crashes, but your platform is still up. Isn't that the most important thing?

Note that I think it would be "fairly easy" to implement code to reset
everything or to keep a backup on the side.

Performance: how big will the latency introduced by switching between
hypervisor, dom0 SVC and USR modes be? I have seen a use case where the TEE
is part of a video playback pipeline (it decodes DRM media).
There can also be questions about security, but Dom0 can in any case
access any memory from any guest.

But those concerns would be the same in the hypervisor, right? If your
emulation is buggy then a guest would get access to all the memory.

But I really like the idea, because I don't want to mess with the
hypervisor when I don't need to. So, how do you think it will
affect performance?

I can't tell here. I would recommend you try a quick prototype (e.g.
receiving and sending an SMC) and see what the overhead would be.

When I wrote my previous e-mail, I mentioned "specific domain", because I
don't think it is strictly necessary to forward the SMC to DOM0. If you are
concerned about overloading DOM0, you could have a separate service domain that
would handle TEE for you. You could have your "custom OS" handling TEE requests
directly in kernel space (i.e. SVC).

This would be up to the developer of this TEE-layer to decide what to do.

Thanks Julien for bringing me into the discussion. These are my
thoughts on the matter.


Running emulators in Dom0 (AKA QEMU on x86) has always meant giving them
full Dom0 privileges so far. I don't think that is acceptable. There is
work ongoing on the x86 side of things to fix the situation, see:

http://marc.info/?i=1479489244-2201-1-git-send-email-paul.durrant%40citrix.com

But if the past is any indication of future development speed, we are
still a couple of Xen releases away at least from having unprivileged
emulators in Dom0 on x86. By unprivileged, I mean that they are not able
to map any random page in memory, but just the ones belonging to the
virtual machine that they are serving. Until then, having an emulator in
userspace Dom0 is just as bad as having it in the hypervisor from a
security standpoint.

I would only consider this option, if we mandate from the start, in the
design doc and implementations, that the emulators need to be
unprivileged on ARM. This would likely require a new set of hypercalls
and possibly Linux privcmds. And even then, this solution would still
present a series of problems:

- latency
- scalability
- validation against the root of trust
- certifications (because they are part of Dom0 and nobody can certify
  that)


The other option that is traditionally proposed is using stubdoms:
specialized little VMs that run emulators, each VM running one emulator
instance. They are far better from a security standpoint, and could be
certifiable. They might still pose problems from a root of trust point
of view. However, the real issue with stubdoms is that, being
treated as VMs, they show up in "xl list", they introduce latency, they
consume a lot of memory, etc. Also, dealing with Mini-OS can be unfunny.
I think that this option is only a little better than the previous
option, but it is still not great.


This brings us to the third and last option: introducing the emulators
in the hypervisor. This is acceptable only if they are run in a lower
privilege mode, so that a guest breaking into an emulator doesn't
compromise the system and an emulator crashing doesn't crash the host. In
other words, we need a way to run emulators in EL1 in Xen. Something
similar has been prototyped on x86:

http://marc.info/?l=xen-devel&m=146771783917640

It would allow us to run emulators securely from within the hypervisor,
offering the best performance and security trade-off. One day we could
even support loading new emulators from userspace, a la insmod. I think
this is the best option. It might be a bit more work. We have a chance
to learn from past mistakes and design the best possible solution from
the start here and we should take it.

In general, I agree that running the emulators in Xen in a lower privilege mode (EL1 or EL0) would be the best solution.

From what Volodymyr said, the SMC call issued by the guest would need to be analyzed in order to translate IPA -> PA. We also need to forbid some SMC calls from the guest where they would impact the whole TEE (e.g. reset...).
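
To make the IPA -> PA step concrete, here is a rough sketch under the assumption that the guest memory is described by a small table of contiguous ranges. The ipa_range structure and ipa_to_pa() are hypothetical; in Xen the real lookup would go through the stage-2 page tables (p2m) of the guest, not through a table like this.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one contiguous guest mapping. Real Xen
 * would walk the guest's stage-2 page tables instead. */
struct ipa_range {
    uint64_t ipa_base; /* start of the range in guest (intermediate) space */
    uint64_t pa_base;  /* start of the backing physical memory */
    uint64_t size;     /* length of the range in bytes */
};

/* Translate the buffer [ipa, ipa + len) to a physical address.
 * Returns 0 and stores the result in *pa if the whole buffer lies inside
 * one mapped range, -1 otherwise. */
int ipa_to_pa(const struct ipa_range *map, size_t nr,
              uint64_t ipa, uint64_t len, uint64_t *pa)
{
    for (size_t i = 0; i < nr; i++) {
        const struct ipa_range *r = &map[i];
        uint64_t off;

        if (ipa < r->ipa_base)
            continue;
        off = ipa - r->ipa_base;
        /* Written this way so that ipa + len cannot overflow. */
        if (off < r->size && len <= r->size - off) {
            *pa = r->pa_base + off;
            return 0;
        }
    }
    return -1;
}
```

Rejecting a buffer that is not fully covered matters as much as the translation itself, otherwise a guest could pass a length that runs past its own memory.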

In summary, we need a white-list in Xen to prevent the guest from using an SMC that has been added in a newer version of the TEE but is not supported by the hypervisor. So the hypervisor will be tied to a specific version of the TEE.
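
A minimal sketch of such a white-list, assuming the SMC is identified by a 32-bit function ID. The function IDs and names below are made up for illustration and are not taken from any real TEE.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical function IDs, not taken from any real TEE. */
#define TEE_FN_OPEN_SESSION   0x32000001u
#define TEE_FN_INVOKE_COMMAND 0x32000002u
#define TEE_FN_CLOSE_SESSION  0x32000003u
#define TEE_FN_RESET          0x32000010u /* known, but affects all guests */

/* Only calls listed here are forwarded. TEE_FN_RESET is left out on
 * purpose, and anything added by a newer TEE is unknown by construction. */
static const uint32_t smc_whitelist[] = {
    TEE_FN_OPEN_SESSION,
    TEE_FN_INVOKE_COMMAND,
    TEE_FN_CLOSE_SESSION,
};

/* Return true if the guest may issue the SMC with this function ID. */
bool smc_is_allowed(uint32_t fn_id)
{
    for (size_t i = 0; i < sizeof(smc_whitelist) / sizeof(smc_whitelist[0]); i++) {
        if (smc_whitelist[i] == fn_id)
            return true;
    }
    return false;
}
```

The table is exactly the part that ties the hypervisor to one TEE version: every new call needs a new entry and a new build.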

This does not scale for longer-term support within the hypervisor.

That brings up one question. In the case of OP-TEE, is there a convention for the parameters? I.e. some kind of metadata structure to give a hint on the type of each parameter (e.g. IPA...)?
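
To illustrate the kind of convention I have in mind, here is a sketch where each parameter carries a type tag, so that only the parameters tagged as memory references get translated. Everything here (tee_param, the tag values, toy_translate) is hypothetical and only meant to show the idea; it is not the OP-TEE message format.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type tags for a parameter. */
enum tee_param_type {
    TEE_PARAM_NONE = 0,       /* unused slot */
    TEE_PARAM_VALUE = 1,      /* plain value, passed through untouched */
    TEE_PARAM_MEMREF_IPA = 2, /* 'a' is a guest address, 'b' its length */
};

struct tee_param {
    uint32_t type; /* one of enum tee_param_type */
    uint64_t a;    /* value, or address for a memory reference */
    uint64_t b;    /* second value, or length for a memory reference */
};

typedef int (*translate_fn)(uint64_t ipa, uint64_t len, uint64_t *pa);

/* Toy translator used for demonstration only: adds a fixed offset. */
#define TOY_PA_OFFSET 0x40000000ull
int toy_translate(uint64_t ipa, uint64_t len, uint64_t *pa)
{
    (void)len;
    *pa = ipa + TOY_PA_OFFSET;
    return 0;
}

/* Rewrite every memory reference in place using 'xlate'.
 * Returns 0 on success, -1 on an unknown tag or a failed translation.
 * On failure earlier parameters may already have been rewritten, so the
 * caller has to drop the whole request. */
int tee_translate_params(struct tee_param *p, size_t nr, translate_fn xlate)
{
    for (size_t i = 0; i < nr; i++) {
        uint64_t pa;

        switch (p[i].type) {
        case TEE_PARAM_NONE:
        case TEE_PARAM_VALUE:
            break;
        case TEE_PARAM_MEMREF_IPA:
            if (xlate(p[i].a, p[i].b, &pa) != 0)
                return -1;
            p[i].a = pa;
            break;
        default:
            return -1;
        }
    }
    return 0;
}
```

With tags like these, the translation layer would not need to know the meaning of each individual call, only the layout of the parameters.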

Regards,

--
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
