Re: [RFC][PATCH] Reset PCIe devices to address DMA problem on kdump with iommu

From: Takao Indoh
Date: Tue Aug 07 2012 - 05:06:57 EST


(2012/08/07 5:39), Vivek Goyal wrote:
> On Mon, Aug 06, 2012 at 01:30:47PM +0900, Takao Indoh wrote:
>> Hi Vivek,
>>
>> (2012/08/03 20:46), Vivek Goyal wrote:
>>> On Fri, Aug 03, 2012 at 08:24:31PM +0900, Takao Indoh wrote:
>>>> Hi all,
>>>>
>>>> This patch adds kernel parameter "reset_pcie_devices" which resets PCIe
>>>> devices at boot time to address DMA problem on kdump with iommu. When
>>>> this parameter is specified, a hot reset is triggered on each PCIe root
>>>> port and downstream port to reset its downstream endpoint.
>>>
>>> Hi Takao,
>>>
>>> Why not use existing "reset_devices" parameter instead of introducing
>>> a new one?
>>
>> "reset_devices" is used for each driver to reset their own device, and
>> this patch resets all devices forcibly, so I thought they were different
>> things.
>
> Yes, reset_devices is currently used by a driver to reset its own device.
> I thought one could very well extend its reach to reset PCI Express
> devices at the bus level.
>
> Having them separate is not going to be very useful from a kdump
> perspective. We will end up passing both reset_devices and
> reset_pcie_devices to the second kernel, which will lead to a bus-level
> reset as well as a device-level reset.
>
> The ideal situation would be to somehow detect that a bus-level reset has
> been done and skip the device-level reset (assuming a bus-level reset
> obviates the need for a device-level reset; please correct me if that's
> not the case).
>
> After the PCIe reset, can we store the state in a variable that drivers
> can check to see whether a PCIe-level reset was done? If yes, they skip
> the device-level reset (assuming the driver knows that the device is in a
> PCIe slot).
>
> In that case we will not have to introduce a new kernel command line
> parameter, and we also avoid a double reset?

Actually, I'm not sure whether drivers can skip their own reset after a
bus-level reset, but I agree with you. I now think that using reset_devices
is better than adding a narrow parameter limited to PCI Express. Otherwise
we may have to add a new parameter every time we add a new reset method,
such as reset_pcie_devices, reset_pci_legacy_devices, etc.
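
To make the idea concrete, here is a rough userspace sketch of the
"store the state in a variable" approach. It is not code from the patch.
The reset_devices variable here is only a stand-in for the kernel flag of
the same name, and pcie_bus_reset_done, struct fake_dev, early_bus_reset()
and driver_probe() are hypothetical names made up for illustration. It also
assumes that a bus-level reset makes the driver's own reset unnecessary,
which is exactly the point I am not sure about.

```c
#include <stdbool.h>

/* Userspace stand-in for the kernel's reset_devices flag, which is set
 * by the "reset_devices" boot parameter. */
static bool reset_devices;

/* Hypothetical: set once a bus-level PCIe reset has been performed. */
static bool pcie_bus_reset_done;

/* Hypothetical device description holding only what the sketch needs. */
struct fake_dev {
	const char *name;
	bool is_pcie;
	int device_resets;	/* times the driver reset the device itself */
};

/* Hypothetical early-boot hook: bus-level reset only when requested. */
static void early_bus_reset(void)
{
	if (!reset_devices)
		return;
	/* A hot reset on root ports and downstream ports would go here. */
	pcie_bus_reset_done = true;
}

/* Returns true if the driver should still reset the device itself. */
static bool driver_needs_device_reset(const struct fake_dev *dev)
{
	if (!reset_devices)
		return false;
	/* Assumption: a bus-level reset already covered PCIe devices. */
	if (dev->is_pcie && pcie_bus_reset_done)
		return false;
	return true;
}

/* Hypothetical probe path: count device-level resets for illustration. */
static void driver_probe(struct fake_dev *dev)
{
	if (driver_needs_device_reset(dev))
		dev->device_resets++;
}
```

With reset_devices set, a PCIe device probed after early_bus_reset() would
skip its own reset, while a non-PCIe device would still reset itself, so
one parameter would cover both cases without a double reset.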

Thanks,
Takao Indoh
