Re: [PATCH] irqchip/gic-v3-its: Reset each ITS's BASERn register before probe

From: Marc Zyngier
Date: Sat Jan 22 2022 - 10:55:46 EST


On Tue, 18 Jan 2022 12:21:13 +0000,
Lorenzo Pieralisi <lorenzo.pieralisi@xxxxxxx> wrote:
>
> On Mon, Jan 17, 2022 at 04:19:10PM +0000, Marc Zyngier wrote:
> > A recent bug report outlined that the way GICv4.1 is handled across
> > kexec is pretty bad. We can end up in a situation where ITSs share
> > memory (this is the case when SVPEPT==1) and reprogram the base
>
> SVPET

Too many acronyms... ;-)

>
> > registers, creating a situation where ITSs that are part of a given
> > affinity group see different pointers. Which is illegal. Boo.
> >
> > In order to restore some sanity, reset the BASERn registers to 0
> > *before* probing any ITS. Although this isn't optimised at all,
> > this is only a once-per-boot cost, which shouldn't show up on
> > anyone's radar.
> >
> > Cc: Lorenzo Pieralisi <lorenzo.pieralisi@xxxxxxx>
> > Cc: Jay Chen <jkchen@xxxxxxxxxxxxxxxxx>
> > Signed-off-by: Marc Zyngier <maz@xxxxxxxxxx>
> > Link: https://lore.kernel.org/r/20211216190315.GA14220@lpieralisi
> > ---
> > drivers/irqchip/irq-gic-v3-its.c | 106 +++++++++++++++++++++++++------
> > 1 file changed, 87 insertions(+), 19 deletions(-)
>
> Thank you for the patch Marc.
>
> > diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> > index ee83eb377d7e..9d68afe9fa86 100644
> > --- a/drivers/irqchip/irq-gic-v3-its.c
> > +++ b/drivers/irqchip/irq-gic-v3-its.c
> > @@ -4856,6 +4856,38 @@ static struct syscore_ops its_syscore_ops = {
> >  	.resume = its_restore_enable,
> >  };
> >
> > +static void __init __iomem *its_map_one(struct resource *res, int *err)
> > +{
> > +	void __iomem *its_base;
> > +	u32 val;
> > +
> > +	its_base = ioremap(res->start, SZ_64K);
> > +	if (!its_base) {
> > +		pr_warn("ITS@%pa: Unable to map ITS registers\n", &res->start);
> > +		*err = -ENOMEM;
> > +		return NULL;
> > +	}
> > +
> > +	val = readl_relaxed(its_base + GITS_PIDR2) & GIC_PIDR2_ARCH_MASK;
> > +	if (val != 0x30 && val != 0x40) {
> > +		pr_warn("ITS@%pa: No ITS detected, giving up\n", &res->start);
> > +		*err = -ENODEV;
> > +		goto out_unmap;
> > +	}
> > +
> > +	*err = its_force_quiescent(its_base);
> > +	if (*err) {
> > +		pr_warn("ITS@%pa: Failed to quiesce, giving up\n", &res->start);
> > +		goto out_unmap;
> > +	}
> > +
> > +	return its_base;
> > +
> > +out_unmap:
> > +	iounmap(its_base);
> > +	return NULL;
> > +}
> > +
> >  static int its_init_domain(struct fwnode_handle *handle, struct its_node *its)
> >  {
> >  	struct irq_domain *inner_domain;
> > @@ -4963,29 +4995,14 @@ static int __init its_probe_one(struct resource *res,
> >  {
> >  	struct its_node *its;
> >  	void __iomem *its_base;
> > -	u32 val, ctlr;
> >  	u64 baser, tmp, typer;
> >  	struct page *page;
> > +	u32 ctlr;
> >  	int err;
> >
> > -	its_base = ioremap(res->start, SZ_64K);
> > -	if (!its_base) {
> > -		pr_warn("ITS@%pa: Unable to map ITS registers\n", &res->start);
> > -		return -ENOMEM;
> > -	}
> > -
> > -	val = readl_relaxed(its_base + GITS_PIDR2) & GIC_PIDR2_ARCH_MASK;
> > -	if (val != 0x30 && val != 0x40) {
> > -		pr_warn("ITS@%pa: No ITS detected, giving up\n", &res->start);
> > -		err = -ENODEV;
> > -		goto out_unmap;
> > -	}
> > -
> > -	err = its_force_quiescent(its_base);
> > -	if (err) {
> > -		pr_warn("ITS@%pa: Failed to quiesce, giving up\n", &res->start);
> > -		goto out_unmap;
> > -	}
> > +	its_base = its_map_one(res, &err);
> > +	if (!its_base)
> > +		return err;
> >
> >  	pr_info("ITS %pR\n", res);
> >
> > @@ -5248,6 +5265,22 @@ static int its_cpu_memreserve_lpi(unsigned int cpu)
> >  	return ret;
> >  }
> >
> > +/* Mark all the BASER registers as invalid before they get reprogrammed */
> > +static void __init its_reset_one(struct resource *res)
> > +{
> > +	void __iomem *its_base;
> > +	int err, i;
> > +
> > +	its_base = its_map_one(res, &err);
> > +	if (!its_base)
> > +		return;
> > +
> > +	for (i = 0; i < GITS_BASER_NR_REGS; i++)
> > +		gits_write_baser(0, its_base + GITS_BASER + (i << 3));
> > +
> > +	iounmap(its_base);
> > +}
> > +
> >  static const struct of_device_id its_device_id[] = {
> >  	{ .compatible = "arm,gic-v3-its", },
> >  	{},
> > @@ -5258,6 +5291,22 @@ static int __init its_of_probe(struct device_node *node)
> >  	struct device_node *np;
> >  	struct resource res;
> >
> > +	/*
> > +	 * Make sure *all* the ITS are reset before we probe any, as
> > +	 * they may be sharing memory. Don't bother warning on the
> > +	 * first iteration, as any error will be caught on the second
> > +	 * one.
> > +	 */
> > +	for (np = of_find_matching_node(node, its_device_id); np;
> > +	     np = of_find_matching_node(np, its_device_id)) {
> > +		if (!of_device_is_available(np) ||
> > +		    !of_property_read_bool(np, "msi-controller") ||
> > +		    of_address_to_resource(np, 0, &res))
> > +			continue;
> > +
> > +		its_reset_one(&res);
>
> This all makes sense to me, only one question (commenting on the DT
> path but that's obviously valid for ACPI below as well).
>
> We are not checking whether the reset was successful. Is it remotely
> possible that we fail to quiesce an ITS (and reset its GITS_BASER)
> while being able to probe other ITSes within the same affinity?

I guess it is always possible if an ITS is somewhat wedged.

> Overthinking, agreed, just asking if we should not just bail
> out whenever *any* ITS reset fails (though that looks a tad harsh,
> we don't know at this stage whether some ITSes are siblings).

I'm all for that. If an ITS doesn't reply anymore, who knows what it
is capable of. I'll respin the patch with that in mind.
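
For illustration only, here is a userspace mock of the bail-out
behaviour discussed above. It is not the respun patch and touches no
hardware. The names mock_its, mock_reset_one and mock_probe_all are
hypothetical, NR_BASER merely stands in for GITS_BASER_NR_REGS, and
returning -EBUSY for a wedged ITS is an assumption made for the sketch:

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_BASER 8	/* stand-in for GITS_BASER_NR_REGS (assumption) */

/* Hypothetical model of one ITS; not a kernel structure. */
struct mock_its {
	bool wedged;			/* simulates a failed quiesce */
	uint64_t baser[NR_BASER];	/* simulated GITS_BASERn values */
	bool probed;			/* set once the probe pass ran */
};

/* Pass 1 helper: report failure instead of silently returning. */
static int mock_reset_one(struct mock_its *its)
{
	if (its->wedged)
		return -EBUSY;	/* assumed error code for this sketch */

	for (size_t i = 0; i < NR_BASER; i++)
		its->baser[i] = 0;

	return 0;
}

/*
 * Reset every ITS first, and give up on all of them as soon as one
 * reset fails, since we cannot tell yet which ITSs are siblings.
 * Only then run the (here trivial) probe pass.
 */
static int mock_probe_all(struct mock_its *its, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		int err = mock_reset_one(&its[i]);

		if (err)
			return err;
	}

	for (size_t i = 0; i < n; i++)
		its[i].probed = true;

	return 0;
}
```

With one wedged entry in the array, mock_probe_all() returns the error
and no ITS is marked as probed, which is the "harsh" option being
agreed on here.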

Thanks,

M.

--
Without deviation from the norm, progress is not possible.