RE: [EXT] Re: [PATCH v2 1/1] PCI: layerscape: Add power management support

From: Frank Li
Date: Tue Mar 21 2023 - 17:39:53 EST

> -----Original Message-----
> From: Bjorn Helgaas <helgaas@xxxxxxxxxx>
> Sent: Tuesday, March 21, 2023 3:59 PM
> To: Frank Li <frank.li@xxxxxxx>
> Cc: bhelgaas@xxxxxxxxxx; Leo Li <leoyang.li@xxxxxxx>; dl-linux-imx <linux-
> imx@xxxxxxx>; devicetree@xxxxxxxxxxxxxxx;
> gustavo.pimentel@xxxxxxxxxxxx; kw@xxxxxxxxx; linux-arm-
> kernel@xxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; linux-
> pci@xxxxxxxxxxxxxxx; lorenzo.pieralisi@xxxxxxx; M.H. Lian
> <minghuan.lian@xxxxxxx>; Mingkai Hu <mingkai.hu@xxxxxxx>;
> robh+dt@xxxxxxxxxx; Roy Zang <roy.zang@xxxxxxx>;
> shawnguo@xxxxxxxxxx; Z.Q. Hou <zhiqiang.hou@xxxxxxx>
> Subject: [EXT] Re: [PATCH v2 1/1] PCI: layerscape: Add power management
> support
>
> Caution: EXT Email
>
> On Tue, Mar 21, 2023 at 12:02:20PM -0400, Frank Li wrote:
> > From: Hou Zhiqiang <Zhiqiang.Hou@xxxxxxx>
> >
> > Add PME_Turn_Off/PME_TO_Ack handshake sequence to PCIe devices,
> such as
> > NVME or wifi module, and finally put the PCIe controller into D3 state
> > after the L2/L3 ready state transition process completion.
> >
> > However, it's important to note that not all devices may be able to
> > tolerate the PME_Turn_Off command. In general, fixed PCIe devices
> > connected to Layerscape, such as NXP wifi devices, are able to handle
> > this command.
>
> I know this paragraph is here because I asked whether all PCIe devices
> could tolerate PME_Turn_Off. I don't know much about that level of
> the protocol, but it does look to me like PME_Turn_Off is required,
> e.g., PCIe r6.0, sec 5.3.3.2.1, 5.3.3.4.
>
> So I'm not sure this paragraph adds anything useful. If the spec
> requires it, this paragraph is like saying "it's important to note
> that some PCIe devices may not follow the spec," which is pointless.
>
> This functionality results in any downstream devices being put in
> D3cold, right? I think that *would* be worth mentioning. There are a
> few cases where we try to avoid putting devices in D3cold, e.g.,
> no_d3cold, and I suspect this functionality would put them in D3cold
> regardless of no_d3cold. Those are corner cases that you would
> probably never see on your platform, so a brief mention here is
> probably enough.
>
> > +static void ls_pcie_set_dstate(struct ls_pcie *pcie, u32 dstate)
> > +{
> > +	struct dw_pcie *pci = pcie->pci;
> > +	u8 offset = dw_pcie_find_capability(pci, PCI_CAP_ID_PM);
> > +	u32 val;
> > +
> > +	val = dw_pcie_readw_dbi(pci, offset + PCI_PM_CTRL);
> > +	val &= ~PCI_PM_CTRL_STATE_MASK;
> > +	val |= dstate;
> > +	dw_pcie_writew_dbi(pci, offset + PCI_PM_CTRL, val);
>
> Is this a power management register for the *Root Port*, i.e., as
> defined by PCIe r6.0 sec 7.5.2?
>
> Or is it a similar register for the *Root Complex* as a whole that is
> defined by a Layerscape or DesignWare spec and coincidentally uses the
> same Capability ID and control register layout as the PCIe one?

I think it is the Root Port. Can the Linux PCI framework handle that automatically,
or should this call pci_set_power_state() here instead of writing the register directly?

>
> The Root Port programming model is defined by the PCIe spec. Things
> like .send_turn_off_message() and .exit_from_l2() are clearly part of
> the Root *Complex* programming model that is device-specific and not
> defined by the PCIe spec.
>
> I'm asking about ls_pcie_set_dstate() because it's written using the
> PCIe constants (PCI_CAP_ID_PM, PCI_PM_CTRL, etc) but it's mixed in
> with these Root Complex things that are *not* part of the PCIe spec.
>
> > +static bool ls_pcie_pm_supported(struct ls_pcie *pcie)
> > +{
> > +	if (!dw_pcie_link_up(pcie->pci)) {
> > +		dev_dbg(pcie->pci->dev, "Endpoint isn't present\n");
> > +		return false;
> > +	}
> > +
> > +	return pcie->pm_support;
> > +}
> > +
> > +#ifdef CONFIG_PM_SLEEP
> > +static int ls_pcie_suspend_noirq(struct device *dev)
> > +{
> > +	struct ls_pcie *pcie = dev_get_drvdata(dev);
> > +	struct dw_pcie *pci = pcie->pci;
> > +	u32 val;
> > +	int ret;
> > +
> > +	if (!ls_pcie_pm_supported(pcie))
> > +		return 0;
> > +
> > +	pcie->drvdata->pm_ops->send_turn_off_message(pcie);
> > +
> > +	/* 10ms timeout to check L2 ready */
> > +	ret = readl_poll_timeout(pci->dbi_base + PCIE_PORT_DEBUG0,
> > +				 val, LS_PCIE_IS_L2(val), 100, 10000);
> > +	if (ret) {
> > +		dev_err(dev, "PCIe link enter L2 timeout! ltssm = 0x%x\n", val);
> > +		return ret;
> > +	}
> > +
> > +	ls_pcie_set_dstate(pcie, 0x3);
> > +
> > +	return 0;
> > +}
> > +
> > +static int ls_pcie_resume_noirq(struct device *dev)
> > +{
> > +	struct ls_pcie *pcie = dev_get_drvdata(dev);
> > +	struct dw_pcie *pci = pcie->pci;
> > +	int ret;
> > +
> > +	if (!ls_pcie_pm_supported(pcie))
> > +		return 0;
>
> How does this work? You're checking whether the link is up *here*,
> and if it's already up, you go on below to (I guess) set the PCIe
> controller to D0, call dw_pcie_setup_rc() and dw_pcie_wait_for_link().
> Most drivers call dw_pcie_setup_rc() *before* starting the link.
>
> It looks like when you call ls_pcie_pm_supported() here,
> dw_pcie_link_up() should always return false because the link isn't up
> so it looks like there's no downstream device. But I must be missing
> something because if that were the case you would never wake anything
> up below.
>
> > +	ls_pcie_set_dstate(pcie, 0x0);
> > +
> > +	pcie->drvdata->pm_ops->exit_from_l2(pcie);
> > +
> > +	ret = ls_pcie_host_init(&pci->pp);
> > +	if (ret) {
> > +		dev_err(dev, "PCIe host init failed! ret = 0x%x\n", ret);
> > +		return ret;
> > +	}
> > +
> > +	dw_pcie_setup_rc(&pci->pp);
> > +
> > +	ret = dw_pcie_wait_for_link(pci);
> > +	if (ret) {
> > +		dev_err(dev, "Wait link up timeout! ret = 0x%x\n", ret);
> > +		return ret;
> > +	}
> > +
> > +	return 0;
> > +}