George Dunlap writes ("[PATCH v2 1/6] docs/qemu-deprivilege: Revise and update 
with status and future plans"):
> +## Xen library / file-descriptor restrictions
> +
> +'''Description''': Close and restrict Xen-related file descriptors.
> +Specifically:
> + * Close all xenstore-related file descriptors

This is correct.

> + * Make sure that extraneous `privcmd` and `evtchn` instances are
> +closed

No, *all* privcmd and evtchn instances are restricted, even
`extraneous' ones which have been leaked by qemu.  None are closed.

> +'''How to test''':
> +
> +    tools/test/depriv/depriv-fd-checker.c

You also need the tool `fishdescriptor' from src:chiark-utils to get
the descriptors out of qemu.  It is in chiark-utils-bin in Debian
buster and Debian stretch-backports.

> +## Namespaces for unused functionality (Linux only)
> +
> +'''Description''': Enter QEMU into its own mount & IPC namespaces.
> +This means that even if other restrictions fail, the process won't be
> +able to even name system mount points or existing non-file-based IPC
> +descriptors to attempt to attack them.
> +
> +'''Implementation''':
> +
> +In theory this could be done in QEMU (similar to -sandbox, -runas,
> +-chroot, and so on), but a patch doing this in QEMU was NAKed
> +upstream. They preferred that this was done as a setup step by
> +whatever executes QEMU; i.e., have the process which exec's QEMU first
> +call:
> +
> +    unshare(CLONE_NEWNS | CLONE_NEWIPC)

If you are recording this kind of information here: this will of
course not work, because qemu binds and opens things at startup that
would be broken by this.  Maybe you want to give a URL to a mailing
list posting instead of this unreferenced hearsay.
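For concreteness, here is a minimal sketch of the launcher-side step the
quoted text describes: unshare the mount and IPC namespaces, then exec
the target.  The helper name `unshare_and_exec` and the macro
`DEPRIV_NS_FLAGS` are hypothetical, not taken from libxl or qemu.  The
sketch does nothing about the startup binds and opens mentioned above,
and on Linux the `unshare` call needs CAP_SYS_ADMIN for these flags.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

/* The two namespaces named in the quoted text: mount and IPC. */
#define DEPRIV_NS_FLAGS (CLONE_NEWNS | CLONE_NEWIPC)

/*
 * Hypothetical helper: leave the parent's mount and IPC namespaces,
 * then exec argv[0].  Returns -1 on failure; on success it does not
 * return, because the process image has been replaced.
 */
int unshare_and_exec(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    if (unshare(DEPRIV_NS_FLAGS) < 0) {
        perror("unshare");   /* typically EPERM without CAP_SYS_ADMIN */
        return -1;
    }

    execvp(argv[0], argv);
    perror("execvp");        /* only reached if the exec itself failed */
    return -1;
}
```

Anything the exec'd program needs from the host's mount or IPC
namespaces has to be arranged before the `unshare` call, or passed in
as an already-open file descriptor.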

> +### Network namespacing (Linux only)
> +
> +Enter QEMU into its own network namespace (in addition to mount & IPC
> +namespaces).  Basically change the 'unshare' call to be as follows:
> +
> +    unshare(CLONE_NEWNET | CLONE_NEWNS | CLONE_NEWIPC)
> +
> +QEMU does actually use the network namespace by default, so adding
> +this restriction requires additional changes, listed below.

The CLONE_NEWIPC overlaps with the IPC unshare discussed above.

> +## Setting up a userid range

There was some discussion on a Debian list recently about some
container systems that encode a 16-bit within-container uid and a
16-bit container number into the 32-bit uid.  I guess we don't need to
explicitly worry about clashes between our usage and those ?
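To make the arithmetic of such a scheme concrete, here is a small
sketch.  It assumes the container number sits in the high 16 bits and
the within-container uid in the low 16 bits; I have not checked which
way round those systems actually do it.  All the function names are
made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed layout (not confirmed against any real container system):
 *   bits 31..16  container number
 *   bits 15..0   uid within the container
 */
uint32_t encode_uid(uint32_t container, uint32_t inner_uid)
{
    assert(container <= UINT16_MAX && inner_uid <= UINT16_MAX);
    return (container << 16) | inner_uid;
}

uint32_t uid_container(uint32_t uid) { return uid >> 16; }
uint32_t uid_inner(uint32_t uid)     { return uid & 0xFFFFu; }

/*
 * Returns 1 if the uid range [base, base + count) contains any uid
 * that the assumed layout assigns to `container`, otherwise 0.
 * 64-bit arithmetic avoids wrap-around at the top of the uid space.
 */
int range_overlaps_container(uint32_t base, uint32_t count,
                             uint32_t container)
{
    uint64_t lo, hi, first, last;

    if (count == 0)
        return 0;

    lo = (uint64_t)container << 16;   /* first uid of the container */
    hi = lo + 0xFFFFu;                /* last uid of the container */
    first = base;
    last = (uint64_t)base + count - 1;

    return first <= hi && last >= lo;
}
```

Under that assumed layout, a range of per-domain uids that stays
entirely below 65536 falls within container 0, i.e. ordinary host uids,
and a clash can only arise if the range is placed higher up.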

> +# Limitations
> +
> +The following features still need to be implemented:
> + * Inserting a new cdrom while the guest is running (xl cdrom-insert)
> + * Migration / save / restore
> +
> +Additionally, getting PCI passthrough to work securely would require a
> +significant rework of how passthrough works at the moment.  It may be
> +implemented at some point but is not a near-term priority.

The limitations section should also say something like this:

 The currently implemented restrictions are thought to be a useful
 security improvement.  However, the design and implementation are
 preliminary and there is work left to do.  Accordingly we do not
 promise that they are sufficient to stop a rogue domain which takes
 control of its qemu from escaping into the host, let alone stop it
 from denying service to the host.

 Therefore, bugs which affect the effectiveness of the qemu depriv
 mechanisms will be treated as plain bugs, not security bugs; they
 would not result in a Xen Project Security Advisory.  However, bugs
 where the security of a system with dm_restrict=1 is worse than
 before, will be treated as security bugs.

Ian.