On Thu, Jul 20, 2023 at 12:32:33AM +0000, Volodymyr Babchuk wrote:
> From: Oleksandr Andrushchenko <[email protected]>
> 
> Xen and/or Dom0 may have put values in PCI_COMMAND which they expect
> to remain unaltered. PCI_COMMAND_SERR bit is a good example: while the
> guest's view of this will want to be zero initially, the host having set
> it to 1 may not easily be overwritten with 0, or else we'd effectively
> imply giving the guest control of the bit. Thus, PCI_COMMAND register needs
> proper emulation in order to honor host's settings.

You speak about SERR here, yet in the code all bits are togglable by
domUs.

> There are examples of emulators [1], [2] which already deal with PCI_COMMAND
> register emulation and it seems that at most they care about is the only INTx
                                                                      ^ stray?
> bit (besides IO/memory enable and bus master which are write through).
> It could be because in order to properly emulate the PCI_COMMAND register
> we need to know about the whole PCI topology, e.g. if any setting in device's
> command register is aligned with the upstream port etc.
> 
> This makes me think that because of this complexity others just ignore that.
> Neither I think this can easily be done in Xen case.
> 
> According to "PCI LOCAL BUS SPECIFICATION, REV. 3.0", section "6.2.2
> Device Control" the reset state of the command register is typically 0,
> so when assigning a PCI device use 0 as the initial state for the guest's view
> of the command register.
> 
> For now our emulation only makes sure INTx is set according to the host
> requirements, i.e. depending on MSI/MSI-X enabled state.
> 
> This implementation and the decision to only emulate INTx bit for now
> is based on the previous discussion at [3].
> 
> [1] https://github.com/qemu/qemu/blob/master/hw/xen/xen_pt_config_init.c#L310
> [2] 
> https://github.com/projectacrn/acrn-hypervisor/blob/master/hypervisor/hw/pci.c#L336
> [3] 
> https://patchwork.kernel.org/project/xen-devel/patch/[email protected]/
> 
> Signed-off-by: Oleksandr Andrushchenko <[email protected]>
> ---
> 
> Since v6:
> - fold guest's logic into cmd_write
> - implement cmd_read, so we can report emulated INTx state to guests
> - introduce header->guest_cmd to hold the emulated state of the
>   PCI_COMMAND register for guests
> Since v5:
> - add additional check for MSI-X enabled while altering INTX bit
> - make sure INTx disabled while guests enable MSI/MSI-X
> Since v3:
> - gate more code on CONFIG_HAS_MSI
> - removed logic for the case when MSI/MSI-X not enabled
> ---
>  xen/drivers/vpci/header.c | 38 +++++++++++++++++++++++++++++++++++++-
>  xen/drivers/vpci/msi.c    |  4 ++++
>  xen/drivers/vpci/msix.c   |  4 ++++
>  xen/include/xen/vpci.h    |  3 +++
>  4 files changed, 48 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
> index e1a448b674..ae05d242a5 100644
> --- a/xen/drivers/vpci/header.c
> +++ b/xen/drivers/vpci/header.c
> @@ -486,11 +486,27 @@ static int modify_bars(const struct pci_dev *pdev, 
> uint16_t cmd, bool rom_only)
>      return 0;
>  }
>  
> +/* TODO: Add proper emulation for all bits of the command register. */
>  static void cf_check cmd_write(
>      const struct pci_dev *pdev, unsigned int reg, uint32_t cmd, void *data)
>  {
>      struct vpci_header *header = data;
>  
> +    if ( !is_hardware_domain(pdev->domain) )
> +    {
> +        struct vpci_header *header = data;

Why do you need this variable?  You already have 'header' in the outer
scope you can use here.

> +
> +        header->guest_cmd = cmd;
> +#ifdef CONFIG_HAS_PCI_MSI
> +        if ( pdev->vpci->msi->enabled || pdev->vpci->msix->enabled )
> +            /*
> +             * Guest wants to enable INTx, but it can't be enabled
> +             * if MSI/MSI-X enabled.
> +             */
> +            cmd |= PCI_COMMAND_INTX_DISABLE;
> +#endif
> +    }
> +
>      /*
>       * Let Dom0 play with all the bits directly except for the memory
>       * decoding one.

This comments likely needs updating, to reflect that bits not allowed
to domU are already masked.

> @@ -507,6 +523,19 @@ static void cf_check cmd_write(
>          pci_conf_write16(pdev->sbdf, reg, cmd);
>  }
>  
> +static uint32_t cmd_read(const struct pci_dev *pdev, unsigned int reg,
> +                         void *data)
> +{
> +    if ( !is_hardware_domain(pdev->domain) )
> +    {
> +        struct vpci_header *header = data;
> +
> +        return header->guest_cmd;
> +    }
> +
> +    return pci_conf_read16(pdev->sbdf, reg);

Would IMO be simpler as:

const struct vpci_header *header = data;

return is_hardware_domain(pdev->domain) ? pci_conf_read16(pdev->sbdf, reg)
                                        : header->guest_cmd;

In fact I wonder why not make this handler domU specific so that the
hardware domain can continue to use vpci_hw_read16.

> +}
> +
>  static void cf_check bar_write(
>      const struct pci_dev *pdev, unsigned int reg, uint32_t val, void *data)
>  {
> @@ -713,8 +742,15 @@ static int cf_check init_bars(struct pci_dev *pdev)
>          return -EOPNOTSUPP;
>      }
>  
> +    /*
> +     * According to "PCI LOCAL BUS SPECIFICATION, REV. 3.0", section "6.2.2
> +     * Device Control" the reset state of the command register is
> +     * typically all 0's, so this is used as initial value for the guests.
> +     */
> +    ASSERT(header->guest_cmd == 0);

Hm, while that would be the expectation, shouldn't the command register
reflect the current state of the hardware?

I think you want to check 'cmd' so it's sane, and complain otherwise but
propagate the value to the guest view.

> +
>      /* Setup a handler for the command register. */
> -    rc = vpci_add_register(pdev->vpci, vpci_hw_read16, cmd_write, 
> PCI_COMMAND,
> +    rc = vpci_add_register(pdev->vpci, cmd_read, cmd_write, PCI_COMMAND,
>                             2, header);

See comment above about keeping the hw domain using vpci_hw_read16.

>      if ( rc )
>          return rc;
> diff --git a/xen/drivers/vpci/msi.c b/xen/drivers/vpci/msi.c
> index e63152c224..c37845a949 100644
> --- a/xen/drivers/vpci/msi.c
> +++ b/xen/drivers/vpci/msi.c
> @@ -70,6 +70,10 @@ static void cf_check control_write(
>  
>          if ( vpci_msi_arch_enable(msi, pdev, vectors) )
>              return;
> +
> +        /* Make sure guest doesn't enable INTx while enabling MSI. */
> +        if ( !is_hardware_domain(pdev->domain) )
> +            pci_intx(pdev, false);
>      }
>      else
>          vpci_msi_arch_disable(msi, pdev);
> diff --git a/xen/drivers/vpci/msix.c b/xen/drivers/vpci/msix.c
> index 9481274579..eab1661b87 100644
> --- a/xen/drivers/vpci/msix.c
> +++ b/xen/drivers/vpci/msix.c
> @@ -97,6 +97,10 @@ static void cf_check control_write(
>          for ( i = 0; i < msix->max_entries; i++ )
>              if ( !msix->entries[i].masked && msix->entries[i].updated )
>                  update_entry(&msix->entries[i], pdev, i);
> +
> +        /* Make sure guest doesn't enable INTx while enabling MSI-X. */
> +        if ( !is_hardware_domain(pdev->domain) )
> +            pci_intx(pdev, false);

I think here and in the MSI case you want to update the guest view of
the command register if you unconditionally disable INTx.

Maybe just use cmd_write() and let the logic there cache the new
value?

Thanks, Roger.

Reply via email to