Re: [PATCH v2 6/6] vgaarb: Look at all PCI display devices in VGA arbiter

From: Thomas Zimmermann
Date: Thu Jun 19 2025 - 03:00:20 EST


Hi

Am 18.06.25 um 20:45 schrieb Mario Limonciello:
On 6/18/2025 9:12 AM, Simona Vetter wrote:
On Wed, Jun 18, 2025 at 11:11:26AM +0200, Thomas Zimmermann wrote:
Hi

Am 17.06.25 um 22:22 schrieb Mario Limonciello:


On 6/17/25 2:22 PM, Alex Williamson wrote:
On Tue, 17 Jun 2025 12:59:10 -0500
Mario Limonciello <superm1@xxxxxxxxxx> wrote:

From: Mario Limonciello <mario.limonciello@xxxxxxx>

On a mobile system with an AMD integrated GPU + NVIDIA discrete GPU the
AMD GPU is not being selected by some desktop environments for any
rendering tasks. This is because neither GPU is being treated as
"boot_vga" but that is what some environments use to select a GPU [1].

The VGA arbiter driver only looks at devices that report as PCI display
VGA class devices. Neither GPU on the system is a PCI display VGA class
device:

c5:00.0 3D controller: NVIDIA Corporation Device 2db9 (rev a1)
c6:00.0 Display controller: Advanced Micro Devices, Inc.
[AMD/ATI] Device 150e (rev d1)

If the GPUs were looked at the vga_is_firmware_default()
function actually
does do a good job at recognizing the case from the device used for the
firmware framebuffer.

Modify the VGA arbiter code and matching sysfs file entries to
examine all
PCI display class devices. The existing logic stays the same.

This will cause all GPUs to gain a `boot_vga` file, but the
correct device
(AMD GPU in this case) will now show `1` and the incorrect
device shows `0`.
Userspace then picks the right device as well.

Link: https://github.com/robherring/libpciaccess/commit/b2838fb61c3542f107014b285cbda097acae1e12
[1]
Suggested-by: Daniel Dadap <ddadap@xxxxxxxxxx>
Acked-by: Thomas Zimmermann <tzimmermann@xxxxxxx>
Signed-off-by: Mario Limonciello <mario.limonciello@xxxxxxx>
---
   drivers/pci/pci-sysfs.c | 2 +-
   drivers/pci/vgaarb.c    | 8 ++++----
   2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 268c69daa4d57..c314ee1b3f9ac 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -1707,7 +1707,7 @@ static umode_t
pci_dev_attrs_are_visible(struct kobject *kobj,
       struct device *dev = kobj_to_dev(kobj);
       struct pci_dev *pdev = to_pci_dev(dev);
   -    if (a == &dev_attr_boot_vga.attr && pci_is_vga(pdev))
+    if (a == &dev_attr_boot_vga.attr && pci_is_display(pdev))
           return a->mode;
         return 0;
diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c
index 78748e8d2dbae..63216e5787d73 100644
--- a/drivers/pci/vgaarb.c
+++ b/drivers/pci/vgaarb.c
@@ -1499,8 +1499,8 @@ static int pci_notify(struct
notifier_block *nb, unsigned long action,
         vgaarb_dbg(dev, "%s\n", __func__);
   -    /* Only deal with VGA class devices */
-    if (!pci_is_vga(pdev))
+    /* Only deal with PCI display class devices */
+    if (!pci_is_display(pdev))
           return 0;
         /*
@@ -1546,12 +1546,12 @@ static int __init vga_arb_device_init(void)
         bus_register_notifier(&pci_bus_type, &pci_notifier);
   -    /* Add all VGA class PCI devices by default */
+    /* Add all PCI display class devices by default */
       pdev = NULL;
       while ((pdev =
           pci_get_subsys(PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID,
                      PCI_ANY_ID, pdev)) != NULL) {
-        if (pci_is_vga(pdev))
+        if (pci_is_display(pdev))
               vga_arbiter_add_pci_device(pdev);
       }

At the very least a non-VGA device should not mark that it decodes
legacy resources, marking the boot VGA device is only a part of what
the VGA arbiter does.  It seems none of the actual VGA arbitration
interfaces have been considered here though.

I still think this is a bad idea and I'm not sure Thomas didn't
withdraw his ack in the previous round[1].  Thanks,

Ah; I didn't realize that was intended to be a withdrawl.
If there's another version of this I'll remove it.

Then let me formally withdraw the A-b.

I think this updated patch doesn't address the concerns raised in the
previous reviews. AFAIU vgaarb is really only about VGA devices.

I missed the earlier version, but wanted to chime in that I concur. vgaarb
is about vga decoding, and modern gpu drivers are trying pretty hard to
disable that since it can cause pain. If we mix in the meaning of "default
display device" into this, we have a mess.

I guess what does make sense is if the kernel exposes its notion of
"default display device", since we do have that in some sense with
simpledrm. At least on systems where simpledrm is a thing, but I think you
need some really old machines for that to not be the case.

Cheers, Sima

Thanks guys.  Let's discard patch 6.  Here's a spin of an approach for userspace that does something similar to what the compositors are doing.
We can iterate on that.

https://gitlab.freedesktop.org/xorg/lib/libpciaccess/-/merge_requests/38

Did you look at my suggestion to extend the kernel's video_is_primary_device()? This would also benefit fbcon. It is architecture specific and currently uses vgaarb on x86. I think it could be extended with the current patch's logic.

Best regards
Thomas


I think patches 1-5 still are valuable though.  So please add reviews to those and we can take those without patch 6 if there is agreement.

--
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)