Re: [PATCH 5.8 298/464] PCI: hv: Fix a timing issue which causes kdump to fail occasionally

From: Sasha Levin
Date: Mon Aug 17 2020 - 20:54:58 EST


On Mon, Aug 17, 2020 at 11:27:49PM +0000, Michael Kelley wrote:
From: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> Sent: Monday, August 17, 2020 8:14 AM

From: Wei Hu <weh@xxxxxxxxxxxxx>

[ Upstream commit d6af2ed29c7c1c311b96dac989dcb991e90ee195 ]

Kdump could sometimes fail on a Hyper-V guest because the retry in
hv_pci_enter_d0() releases child device structures in hv_pci_bus_exit().

Although the host sends a second, asynchronous device relations message,
the retry would fail if this message arrives at the guest after
hv_send_resource_allocated() is called.

Fix the problem by moving the retry to hv_pci_probe() and starting the
retry from the hv_pci_query_relations() call. This causes a device
relations message to arrive at the guest synchronously; the guest is then
able to rebuild the child device structures before calling
hv_send_resource_allocated().
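
For readers following along, the retry ordering described above can be
sketched as a small user-space simulation. This is not the driver code:
every sim_* name, the struct fields, and the use of -EPROTO and -ENODEV
as error codes are assumptions made for illustration. It only shows why
restarting from the query step leaves the child structures rebuilt
before the resource-allocated step runs.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-bus state; not the real driver struct. */
struct sim_bus {
	bool host_holds_stale_state;	/* host still holds resources (kdump case) */
	bool children_present;		/* child device structures exist */
	int query_calls;		/* how many times relations were queried */
};

/* Simulates a synchronous device relations exchange: children get built. */
static int sim_query_relations(struct sim_bus *bus)
{
	bus->query_calls++;
	bus->children_present = true;
	return 0;
}

/* Simulates D0 entry; fails while the host holds stale state (assumed -EPROTO). */
static int sim_enter_d0(struct sim_bus *bus)
{
	return bus->host_holds_stale_state ? -EPROTO : 0;
}

/* Simulates bus exit: the host releases its state, the children are freed. */
static void sim_bus_exit(struct sim_bus *bus)
{
	bus->host_holds_stale_state = false;
	bus->children_present = false;
}

/* Simulates the step that needs the child structures to be present. */
static int sim_send_resource_allocated(struct sim_bus *bus)
{
	return bus->children_present ? 0 : -ENODEV;
}

/* Probe with a single retry that restarts from the relations query. */
int sim_probe(struct sim_bus *bus)
{
	bool retried = false;
	int ret;

retry:
	ret = sim_query_relations(bus);
	if (ret)
		return ret;

	ret = sim_enter_d0(bus);
	if (ret == -EPROTO && !retried) {
		retried = true;
		sim_bus_exit(bus);	/* frees children, so query again */
		goto retry;
	}
	if (ret)
		return ret;

	return sim_send_resource_allocated(bus);
}
```

In this sketch, a bus that starts with host_holds_stale_state set goes
through the query step twice and still reaches the final step with its
children present. If the retry jumped straight back to sim_enter_d0()
instead, the final step would find no children.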

Link: https://lore.kernel.org/linux-hyperv/20200727071731.18516-1-weh@xxxxxxxxxxxxx/
Fixes: c81992e7f4aa ("PCI: hv: Retry PCI bus D0 entry on invalid device state")
Signed-off-by: Wei Hu <weh@xxxxxxxxxxxxx>
[lorenzo.pieralisi@xxxxxxx: fixed a comment and commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@xxxxxxx>
Reviewed-by: Michael Kelley <mikelley@xxxxxxxxxxxxx>
Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
---
drivers/pci/controller/pci-hyperv.c | 71 +++++++++++++++--------------
1 file changed, 37 insertions(+), 34 deletions(-)


Greg --

Don't backport this patch to 5.8 and earlier. It doesn't break anything,
but it doesn't fully accomplish what was intended either. As such it will
probably need a revision in 5.9. Wei Hu is unavailable for a few days
for personal reasons, so I'm jumping in here on his behalf.

I've dropped it, will wait for the fix.

--
Thanks,
Sasha