RE: [PATCH] soundwire: bus: conditionally recheck UNATTACHED status

From: Liao, Bard
Date: Mon Sep 12 2022 - 12:18:20 EST


> -----Original Message-----
> From: Pierre-Louis Bossart <pierre-louis.bossart@xxxxxxxxxxxxxxx>
> Sent: Monday, September 12, 2022 4:58 AM
> To: Bard Liao <yung-chuan.liao@xxxxxxxxxxxxxxx>; alsa-devel@xxxxxxxxxxxxxxxx;
> vkoul@xxxxxxxxxx; broonie@xxxxxxxxxx
> Cc: vinod.koul@xxxxxxxxxx; Liao, Bard <bard.liao@xxxxxxxxx>; linux-
> kernel@xxxxxxxxxxxxxxx; Richard Fitzgerald <rf@xxxxxxxxxxxxxxxxxxxxx>
> Subject: Re: [PATCH] soundwire: bus: conditionally recheck UNATTACHED status
>
> [top posting]
> I reverted this patch in the SOF tree to use Richard Fitzgerald's
> series, see
>
> https://github.com/thesofproject/linux/pull/3836
>
> I don't think we want this patch upstream, do we?

Agree, let's don't merge this upstream.

>
>
> On 8/30/22 09:42, Bard Liao wrote:
> > From: Pierre-Louis Bossart <pierre-louis.bossart@xxxxxxxxxxxxxxx>
> >
> > In configurations with two amplifiers on the same link, the Intel CI
> > reports occasional enumeration/initialization timeouts during
> > suspend-resume stress tests, where one of the two amplifiers becomes
> > UNATTACHED immediately after being enumerated. This problem was
> > reported both with Maxim and Realtek codecs, which pointed initially
> > to a problem with status handling on the Intel side.
> >
> > The Cadence IP integrated on Intel platforms throws an interrupt when
> > the status changes, and the information is kept with sticky bits until
> > cleared. We initially added more checks to make sure the edge
> > detection did not miss any transition, but that did not improve the
> > results significantly.
> >
> > With the recent addition of the read_ping_status() callback, we were
> > able to show that the status in sticky bits is shown as UNATTACHED
> > even though the PING frames show the problematic device as
> > ATTACHED. That completely breaks the entire logic where we assumed
> > that a peripheral would always re-attach as device0. The resume
> > timeouts make sense in that in those cases, the
> > enumeration/initialization never happens a second time.
> >
> > One possible explanation is that this problem typically happens when a
> > bus clash is reported, so it could very well be that the detection is
> > fooled by a transient electrical issue or conflict between two
> > peripherals.
> >
> > This patch conditionally double-checks the status reported in the
> > sticky bits with the actual PING frame status. If the peripheral
> > reports as attached in PING frames, the early detection based on
> > sticky bits is discarded.
> >
> > Note that this patch only corrects issues of false positives on the
> > manager side.
> >
> > If the peripheral lost and regain sync, then it would report as
> > attached on Device0. A peripheral that would not reset its dev_num
> > would not be compliant with the MIPI specification.
> >
> > BugLink: https://github.com/thesofproject/linux/issues/3638
> > BugLink: https://github.com/thesofproject/linux/issues/3325
> > Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@xxxxxxxxxxxxxxx>
> > Reviewed-by: Rander Wang <rander.wang@xxxxxxxxx>
> > Signed-off-by: Bard Liao <yung-chuan.liao@xxxxxxxxxxxxxxx>
> > ---
> > Hi Vinod,
> >
> > You will need the "ASoC/soundwire: log actual PING status on resume issues"
> > series which is applied at ASoC tree before appling this patch.
> >
> > ---
> > drivers/soundwire/bus.c | 19 +++++++++++++++++++
> > drivers/soundwire/intel.c | 1 +
> > include/linux/soundwire/sdw.h | 3 +++
> > 3 files changed, 23 insertions(+)
> >
> > diff --git a/drivers/soundwire/bus.c b/drivers/soundwire/bus.c
> > index 2772973eebb1..d0d486f07673 100644
> > --- a/drivers/soundwire/bus.c
> > +++ b/drivers/soundwire/bus.c
> > @@ -1767,6 +1767,25 @@ int sdw_handle_slave_status(struct sdw_bus *bus,
> > slave->status != SDW_SLAVE_UNATTACHED) {
> > dev_warn(&slave->dev, "Slave %d state check1:
> UNATTACHED, status was %d\n",
> > i, slave->status);
> > +
> > + if (bus->recheck_unattached && bus->ops-
> >read_ping_status) {
> > + u32 ping_status;
> > +
> > + mutex_lock(&bus->msg_lock);
> > +
> > + ping_status = bus->ops->read_ping_status(bus);
> > +
> > + mutex_unlock(&bus->msg_lock);
> > +
> > + ping_status >>= (i * 2);
> > + ping_status &= 0x3;
> > +
> > + if (ping_status != 0) {
> > + dev_warn(&slave->dev, "Slave %d state
> in PING frame is %d, ignoring earlier detection\n",
> > + i, ping_status);
> > + continue;
> > + }
> > + }
> > sdw_modify_slave_status(slave,
> SDW_SLAVE_UNATTACHED);
> > }
> > }
> > diff --git a/drivers/soundwire/intel.c b/drivers/soundwire/intel.c
> > index 25ec9c272239..0c6e674dbf85 100644
> > --- a/drivers/soundwire/intel.c
> > +++ b/drivers/soundwire/intel.c
> > @@ -1311,6 +1311,7 @@ static int intel_link_probe(struct auxiliary_device
> *auxdev,
> >
> > bus->link_id = auxdev->id;
> > bus->dev_num_ida_min = INTEL_DEV_NUM_IDA_MIN;
> > + bus->recheck_unattached = true;
> >
> > sdw_cdns_probe(cdns);
> >
> > diff --git a/include/linux/soundwire/sdw.h b/include/linux/soundwire/sdw.h
> > index a2b31d25ea27..51ac71984260 100644
> > --- a/include/linux/soundwire/sdw.h
> > +++ b/include/linux/soundwire/sdw.h
> > @@ -892,6 +892,8 @@ struct sdw_master_ops {
> > * @dev_num_ida_min: if set, defines the minimum values for the IDA
> > * used to allocate system-unique device numbers. This value needs to be
> > * identical across all SoundWire bus in the system.
> > + * @recheck_unattached: if set, double-check UNATTACHED status changes
> > + * by reading PING frame status.
> > */
> > struct sdw_bus {
> > struct device *dev;
> > @@ -917,6 +919,7 @@ struct sdw_bus {
> > bool multi_link;
> > int hw_sync_min_links;
> > int dev_num_ida_min;
> > + bool recheck_unattached;
> > };
> >
> > int sdw_bus_master_add(struct sdw_bus *bus, struct device *parent,