Re: [PATCH v1 0/7] MPFS system controller/mailbox fixes

From: Conor Dooley
Date: Fri Feb 17 2023 - 02:34:57 EST


On Thu, Feb 16, 2023 at 10:04:17PM -0600, Jassi Brar wrote:
> On Thu, Feb 16, 2023 at 4:24 PM Conor Dooley <conor@xxxxxxxxxx> wrote:
> > >
> > > > > Secondly, I have a question about what to do if a service does fail, but
> > > > > not due to a timeout - eg the above example where the "new" image for
> > > > > the FPGA is actually older than the one that currently exists.
> > > > > Ideally, if a service fails due to something other than the transaction
> > > > > timing out, I would go and read the status registers to see what the
> > > > > cause of failure was.
> > > > > I could not find a function in the mailbox framework that allows the
> > > > > client to request that sort of information from the client. Trying to
> > > > > do something with the auxiliary bus, or exporting some function to a
> > > > > device specific header seemed like a circumvention of the mailbox
> > > > > framework.
> > > > > Do you think it would be a good idea to implement something like
> > > > > mbox_client_peek_status(struct mbox_chan *chan, void *data) to allow
> > > > > clients to request this type of information?
> > > > >
> > > > .last_tx_done() is supposed to make sure everything is ok.
> > >
> > > Hm, might've explained badly as I think you've misunderstood. Or (see
> > > below) I might have mistakenly thought that last_tx_done() was only meant
> > > to signify that tx was done.
> > >
> > > Anyways, I'll try to clarify.
> > > Some services don't set a status, but whether a status is, or isn't,
> > > set has nothing to do with whether the service has completed.
> > > One service that sets a status is "Authenticate Bitstream". This
> > > service sets a status of 0x0 if the bitstream in question is okay _and_
> > > something that the FPGA can be upgraded to. It returns a failure of 0x18
> > > if the bitstream is valid _but_ is the same as that currently programmed.
> > > (and of course a whole host of other possible errors in-between)
> > >
> > > These statuses, and whether they are a bad outcome or not, is dependant
> > > on the service and I don't think should be handled in the mailbox
> > > controller driver.
> > >
> > > > If the expected status bit is "sometimes not set", that means that bit
> > > > is not the complete status.
> > >
> > > If the "busy" bit goes low, then the transmission must be complete,
> > > there should be no need to check other bits for *completion*, but...
> > >
> > > > You have to check multiple registers to
> > > > detect if and what caused the failure.
> > >
> > > ...maybe I have just misunderstood the role of .last_tx_done(). The
> > > comment in mailbox-controller.h lead me to believe that it was used just
> > > to check if it had been completed.
> > >
> > > Am I allowed to use .last_tx_done() to pass information back to the
> > > mailbox client? If I could, that'd certainly be a nice way to get the
> > > information on whether the service failed etc.
> > >
> > > Hopefully that, plus when you have a chance to look at the code, will
> > > make what I am asking about a little clearer!
> >
> > Just wondering if you've had a chance to look at this again! I know it's
> > missed the merge window this time around but I would like to get this
> > behaviour fixed as other work depends on it.
> >
> My opinion about adding a new api just to accommodate remote f/w's
> behaviour change across updates is still no.
> last_tx_done() is more abstract than you think -- it has to play with
> dozens of behaviors of remotes. So may just wrap your whatever logic,
> of "tx is done", in that.

Okay, that's fine. I'd rather not add a new API, so adding that to
last_tx_done() is fine by me! I just wasn't sure from your earlier reply
if that was okay.

> This query within the patchset threw me off -- I thought you needed
> the new api for the patchset, so I didn't look further.

Ahh, sorry. The query was about improving things to report an accurate
status rather than only timeouts. What's here *works* but is
sub-optimal.

> Looking at it now, I am ok with applying Patches 1,2 and 3. If you want.

Ehh, the whole thing needs to go together to avoid breaking behaviour,
so, given the merge window is next week, I'd rather rework it to report
the actual status from last_tx_done().

Thanks!

Attachment: signature.asc
Description: PGP signature