Re: [PATCHv2] vgaarb: Add module param to allow for choosing the boot VGA device

From: Alex Williamson
Date: Tue Jul 05 2022 - 12:15:50 EST


On Mon, 4 Jul 2022 16:38:29 -0500
Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:

> [+cc Alex, Cornelia, kvm]
>
> On Mon, Jul 04, 2022 at 05:12:44PM -0400, Cal Peake wrote:
> > Add module parameter 'bootdev' to the VGA arbiter to allow the user
> > to choose which PCI device should be selected over any others as the
> > boot VGA device.
> >
> > When using a multi-GPU system with one or more GPUs being used in
> > conjunction with VFIO for passthrough to a virtual machine, if the
> > VGA arbiter settles on a passthrough GPU as the boot VGA device,
> > once the VFIO PCI driver claims that GPU, all display output is lost
> > and the result is blank screens and no VT access.
>
> I cc'd KVM folks in case they have anything to add here because I'm
> not a VFIO passthrough expert.
>
> It sounds like the problem occurs when the VFIO driver claims the GPU.
> I assume that happens after boot, when setting up for the virtual
> machine? If so, is there a way to avoid the problem at run-time so
> the admin doesn't have to decide at boot-time which GPU will be passed
> through to a VM? Is it possible or desirable to pass through GPU A to
> VM A, then after VM A exits, pass through GPU B to VM B?
>
> > Signed-off-by: Cal Peake <cp@xxxxxxxxxxxxxxxxxxx>
> > ---
> > .../admin-guide/kernel-parameters.txt | 7 ++++
> > drivers/pci/vgaarb.c | 40 +++++++++++++++++++
> > 2 files changed, 47 insertions(+)
> >
> > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> > index 2522b11e593f..21ac87f4a8a9 100644
> > --- a/Documentation/admin-guide/kernel-parameters.txt
> > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > @@ -6518,6 +6518,13 @@
> > This is actually a boot loader parameter; the value is
> > passed to the kernel using a special protocol.
> >
> > + vgaarb.bootdev= [PCI] Specify the PCI ID (e.g. 0e:00.0) of the
> > + device to use as the boot VGA device, overriding
> > + the heuristic used to normally determine which
> > + of the eligible VGA devices to use. If the device
> > + specified is not valid or not eligible, then we
> > + fallback to the heuristic.
> > +
> > vm_debug[=options] [KNL] Available with CONFIG_DEBUG_VM=y.
> > May slow down system boot speed, especially when
> > enabled on systems with a large amount of memory.
> > diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c
> > index f80b6ec88dc3..d3689b7dc63d 100644
> > --- a/drivers/pci/vgaarb.c
> > +++ b/drivers/pci/vgaarb.c
> > @@ -35,6 +35,34 @@
> >
> > #include <linux/vgaarb.h>
> >
> > +static char *bootdev __initdata;
> > +module_param(bootdev, charp, 0);
> > +MODULE_PARM_DESC(bootdev, "Force boot device to the specified PCI ID");
> > +
> > +/*
> > + * Initialize to the last possible ID to have things work as normal
> > + * when no 'bootdev' option is supplied. We especially do not want
> > + * this to be zero (0) since that is a valid PCI ID (00:00.0).
> > + */
> > +static u16 bootdev_id = 0xffff;
> > +
> > +static void __init parse_bootdev(char *input)
> > +{
> > + unsigned int bus, dev, func;
> > + int ret;
> > +
> > + if (input == NULL)
> > + return;
> > +
> > + ret = sscanf(input, "%x:%x.%x", &bus, &dev, &func);
> > + if (ret != 3) {
> > + pr_warn("Improperly formatted PCI ID: %s\n", input);
> > + return;
> > + }

See pci_dev_str_match()
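
To make the parsing concern concrete, here is a minimal userspace
sketch of a stricter parser. It does not reproduce
pci_dev_str_match(); struct bdf, parse_bdf() and bdf_to_devid() are
hypothetical names used only for illustration. The assumptions are
that the string has the form [domain:]bus:dev.func in hex, that
trailing characters are an error, and that out-of-range fields are
rejected rather than masked or truncated.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical container for a parsed address; not a kernel type. */
struct bdf {
	unsigned int domain;	/* PCI segment, 0 when omitted */
	unsigned int bus;	/* 0x00-0xff */
	unsigned int dev;	/* 0x00-0x1f */
	unsigned int func;	/* 0x0-0x7 */
};

/*
 * Parse "[domain:]bus:dev.func" (hex fields).  Returns true only when
 * the whole string is consumed and every field is in range.
 */
static bool parse_bdf(const char *s, struct bdf *out)
{
	unsigned int a, b, c, d;
	int n = 0;

	if (!s || !out)
		return false;

	/* Four-field form first; %n records how many chars were consumed. */
	if (sscanf(s, "%x:%x:%x.%x%n", &a, &b, &c, &d, &n) == 4 &&
	    s[n] == '\0') {
		out->domain = a;
		out->bus = b;
		out->dev = c;
		out->func = d;
	} else {
		n = 0;
		if (sscanf(s, "%x:%x.%x%n", &a, &b, &c, &n) != 3 ||
		    s[n] != '\0')
			return false;
		out->domain = 0;
		out->bus = a;
		out->dev = b;
		out->func = c;
	}

	/* Reject out-of-range fields instead of silently masking them. */
	if (out->domain > 0xffff || out->bus > 0xff ||
	    out->dev > 0x1f || out->func > 0x7)
		return false;

	return true;
}

/* 16-bit bus/devfn ID for an already validated address. */
static uint16_t bdf_to_devid(const struct bdf *b)
{
	return (uint16_t)((b->bus << 8) | (b->dev << 3) | b->func);
}
```

With this sketch, "0e:00.0" parses to ID 0x0e00, while "100:00.0" and
"0e:00.0junk" are rejected. Note that a 16-bit bus/devfn ID still
cannot tell two PCI domains apart, so a real implementation would
need to compare the domain as well.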

> > +
> > + bootdev_id = PCI_DEVID(bus, PCI_DEVFN(dev, func));
> > +}
> > +
> > static void vga_arbiter_notify_clients(void);
> > /*
> > * We keep a list of all vga devices in the system to speed
> > @@ -53,6 +81,7 @@ struct vga_device {
> > bool bridge_has_one_vga;
> > bool is_firmware_default; /* device selected by firmware */
> > unsigned int (*set_decode)(struct pci_dev *pdev, bool decode);
> > + bool is_chosen_one; /* device specified on command line */
> > };
> >
> > static LIST_HEAD(vga_list);
> > @@ -605,6 +634,7 @@ static bool vga_is_boot_device(struct vga_device *vgadev)
> >
> > /*
> > * We select the default VGA device in this order:
> > + * User specified device (see module param bootdev=)
> > * Firmware framebuffer (see vga_arb_select_default_device())
> > * Legacy VGA device (owns VGA_RSRC_LEGACY_MASK)
> > * Non-legacy integrated device (see vga_arb_select_default_device())
> > @@ -612,6 +642,14 @@ static bool vga_is_boot_device(struct vga_device *vgadev)
> > * Other device (see vga_arb_select_default_device())
> > */
> >
> > + if (boot_vga && boot_vga->is_chosen_one)
> > + return false;
> > +
> > + if (bootdev_id == PCI_DEVID(pdev->bus->number, pdev->devfn)) {
> > + vgadev->is_chosen_one = true;
> > + return true;
> > + }

This seems too simplistic.  For example, PCI code decides whether a
device's ROM is a shadow ROM at 0xc0000 based on whether that device is
the vga_default_device(), and the default device is set in
vga_arbiter_add_pci_device() based on the value returned by this
vga_is_boot_device() function.  A user specifying the boot VGA device
doesn't magically make that device's ROM shadowed into this location.
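
The mismatch can be shown with a toy userspace model. This is not
kernel code: struct toy_gpu, toy_default and both functions are
hypothetical stand-ins, and the only behavior modeled is the one
described above, namely that the ROM is treated as shadowed at 0xc0000
exactly when the device is the default VGA device.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a VGA device; not a kernel structure. */
struct toy_gpu {
	const char *name;
	bool firmware_shadowed_rom;	/* firmware copied this ROM to 0xc0000 */
	bool marked_rom_shadow;		/* what the OS concluded */
};

/* Stand-in for the device returned by vga_default_device(). */
static struct toy_gpu *toy_default;

/* Model of the decision: only the default device gets a shadow ROM. */
static void toy_fixup_video(struct toy_gpu *g)
{
	g->marked_rom_shadow = (g == toy_default);
}

/* True when the OS view disagrees with what firmware actually did. */
static bool toy_rom_view_is_wrong(const struct toy_gpu *g)
{
	return g->marked_rom_shadow != g->firmware_shadowed_rom;
}
```

If the firmware initialized GPU A but the user forces GPU B to be the
default, this model ends up wrong for both devices: B is marked as
having a shadow ROM that was never copied, and A loses the marking for
the ROM that firmware did shadow.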

I also don't see how this actually enables VGA routing to the
user-selected device; we generally expect the boot device to already
have this enabled.

Furthermore, what's the initialization state of the selected device?
If it has not had its option ROM executed, is it necessarily in a state
to accept VGA commands?  If we're changing the default VGA device, are
we fully uncoupling from any firmware notions of the console device?
Thanks,

Alex


> > +
> > /*
> > * We always prefer a firmware default device, so if we've already
> > * found one, there's no need to consider vgadev.
> > @@ -1544,6 +1582,8 @@ static int __init vga_arb_device_init(void)
> > int rc;
> > struct pci_dev *pdev;
> >
> > + parse_bootdev(bootdev);
> > +
> > rc = misc_register(&vga_arb_device);
> > if (rc < 0)
> > pr_err("error %d registering device\n", rc);
> > --
> > 2.35.3
> >
>