Re: [Bug 206175] Fedora >= 5.4 kernels instantly freeze on boot without producing any display output

From: Robin Murphy
Date: Wed Mar 11 2020 - 12:15:37 EST


On 11/03/2020 4:02 pm, Artem S. Tashkinov wrote:
On 3/11/20 3:47 PM, Christoph Hellwig wrote:
And actually one more idea after looking at what slab interactions
could exist. platform_device_register_full frees the dma_mask
unconditionally, even if it didn't allocate it, which might lead
to weird memory corruption if we hit the failure path. So let's try
something like this, replacing the earlier patch in that file.

diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index b230beb6ccb4..04080a8d94e2 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -632,19 +632,6 @@ struct platform_device *platform_device_register_full(
 	pdev->dev.of_node_reused = pdevinfo->of_node_reused;

 	if (pdevinfo->dma_mask) {
-		/*
-		 * This memory isn't freed when the device is put,
-		 * I don't have a nice idea for that though. Conceptually
-		 * dma_mask in struct device should not be a pointer.
-		 * See http://thread.gmane.org/gmane.linux.kernel.pci/9081
-		 */
-		pdev->dev.dma_mask =
-			kmalloc(sizeof(*pdev->dev.dma_mask), GFP_KERNEL);
-		if (!pdev->dev.dma_mask)
-			goto err;
-
-		kmemleak_ignore(pdev->dev.dma_mask);
-
 		*pdev->dev.dma_mask = pdevinfo->dma_mask;
 		pdev->dev.coherent_dma_mask = pdevinfo->dma_mask;
 	}
@@ -670,7 +657,6 @@ struct platform_device *platform_device_register_full(
 	if (ret) {
 err:
 		ACPI_COMPANION_SET(&pdev->dev, NULL);
-		kfree(pdev->dev.dma_mask);
 		platform_device_put(pdev);
 		return ERR_PTR(ret);
 	}


With this patch the system works (I haven't created an initrd, so it
doesn't completely boot and panics on not being able to mount root fs
but that's expected).

Yup, a few lines earlier in the log you can see the wdat_wdt driver
failing in platform_device_add(). Since it called into
platform_device_register_full() with pdevinfo.dma_mask = 0, nothing was
kmalloc()ed for the mask, so the error path will have unwound into that
kfree() of pdev->dev.dma_mask, corrupting the heap.

Robin.
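
For readers following along, the failure mode discussed above can be sketched in plain userspace C. This is only an illustration, not the kernel code: struct fake_pdev, embedded_mask, mask_on_heap and the fake_pdev_*() helpers are all hypothetical names, malloc()/free() stand in for kmalloc()/kfree(), and the sketch assumes that the mask pointer defaults to storage embedded in the device when the caller passes a mask of 0. Under that assumption an unconditional free of the mask pointer hands the allocator an address it never returned, so the cleanup below frees the mask only when it was allocated separately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct platform_device. */
struct fake_pdev {
	uint64_t embedded_mask;	/* storage living inside the device itself */
	uint64_t *dma_mask;	/* points at embedded_mask or at a heap block */
	bool mask_on_heap;	/* illustration only: records who owns *dma_mask */
};

/* Allocate a device whose mask pointer defaults to the embedded storage. */
static struct fake_pdev *fake_pdev_alloc(void)
{
	struct fake_pdev *pdev = calloc(1, sizeof(*pdev));

	if (!pdev)
		return NULL;
	pdev->embedded_mask = UINT32_MAX;
	pdev->dma_mask = &pdev->embedded_mask;
	return pdev;
}

/*
 * Mirror of the "if (pdevinfo->dma_mask)" branch: a separate mask is
 * allocated only for a non-zero request. Returns 0 on success, -1 if
 * the allocation fails.
 */
static int fake_pdev_set_mask(struct fake_pdev *pdev, uint64_t mask)
{
	uint64_t *p;

	if (!mask)
		return 0;	/* mask == 0: nothing is allocated */

	p = malloc(sizeof(*p));
	if (!p)
		return -1;
	*p = mask;
	pdev->dma_mask = p;
	pdev->mask_on_heap = true;
	return 0;
}

/*
 * Error-path cleanup. An unconditional free(pdev->dma_mask) here would,
 * in the mask == 0 case, pass a pointer into the middle of *pdev to the
 * allocator. Returns 1 if a separate heap mask was freed, 0 otherwise.
 */
static int fake_pdev_unwind(struct fake_pdev *pdev)
{
	int freed = 0;

	assert(pdev != NULL);
	if (pdev->mask_on_heap) {
		free(pdev->dma_mask);
		freed = 1;
	}
	free(pdev);
	return freed;
}
```

Calling fake_pdev_set_mask(pdev, 0) and then fake_pdev_unwind(pdev) models the wdat_wdt case: nothing was allocated for the mask, so nothing separate is freed. The patch quoted above avoids the question entirely by dropping both the kmalloc() and the kfree(), rather than tracking ownership with a flag as this sketch does.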