Re: [PATCH v2 5/8] drm: zynqmp_dp: Don't retrain the link in our IRQ

From: Tomi Valkeinen
Date: Thu Mar 21 2024 - 13:26:12 EST


On 21/03/2024 17:52, Sean Anderson wrote:
On 3/20/24 02:53, Tomi Valkeinen wrote:
On 20/03/2024 00:51, Sean Anderson wrote:
Retraining the link can take a while, and might involve waiting for
DPCD reads/writes to complete. This is inappropriate for an IRQ handler.
Just schedule this work for later completion. This is racy, but will be
fixed in the next commit.

You should add the locks first, and use them here, rather than first
adding a buggy commit and fixing it in the next one.

I didn't think I could add the locks first since I only noticed the IRQ
was threaded right before sending out this series. So yeah, we could add
locking, add the workqueue, and then unthread the IRQ.

Signed-off-by: Sean Anderson <sean.anderson@xxxxxxxxx>
---
Actually, on second look this IRQ is threaded. So why do we have a
workqueue for HPD events? Maybe we should make it unthreaded?

Indeed, there's not much work being done in the IRQ handler. I don't know why it's threaded.

We could move the queued work to be inside the threaded irq handler,
but with a quick look, the HPD work has lines like "msleep(100)" (and
that's inside a for loop...), which is probably not a good thing to do
even in threaded irq handler.

Although I'm not sure if that code is good to have anywhere. Why do we
even have such code in the HPD work path... We already got the HPD
interrupt. What does "It takes some delay (ex, 100 ~ 500 msec) to get
the HPD signal with some monitors" even mean...

The documentation for this bit is

| HPD_STATE 0 ro 0x0 Contains the raw state of the HPD pin on the DisplayPort connector.

So I think the idea is to perform some debouncing.

Hmm, it just looks a bit odd to me. It can sleep for a second. And the wording "It takes some delay (ex, 100 ~ 500 msec) to get the HPD signal with some monitors" makes it sound like some kind of a hack...

The docs mention debounce once:

https://docs.amd.com/r/en-US/pg299-v-dp-txss1/Hot-Plug-Detection

But it's not immediately obvious what the SW must do and what's done by the HW. Debounce is not mentioned later, e.g. in the HPD Event Handling. But if debounce is needed, wouldn't it be perhaps in a few milliseconds, instead of hundreds of milliseconds...

zynqmp_dp_bridge_detect() is used for drm_bridge_funcs.detect(), and if the cable is not connected, it'll sleep for 1 second (probably more) until returning not connected. It just doesn't sound correct to me.

Well, it's not part of this patch as such, but related to the amount of time we spend in the interrupt handler (and also the detect()).

Would it be possible to clean up the work funcs a bit (I haven't
looked a the new work func yet), to remove the worst extra sleeps, and
just do all that inside the threaded irq handler?

Probably not, since a HPD IRQ results in link retraining, which can take a while.

But is it any different if you have a workqueue? Isn't a threaded interrupt handler basically the same thing?

Probably at least the zynqmp_dp_hpd_work_func() could be done in the threaded irq just fine, if the insane 1s sleep can be dropped.

Do we need to handle interrupts while either delayed work is being done?

Probably not.

If we do need a delayed work, would just one work be enough which
handles both HPD_EVENT and HPD_IRQ, instead of two?

Maybe, but then we need to determine which pending events we need to
handle. I think since we have only two events it will be easier to just
have separate workqueues.

The less concurrency, the better...Which is why it would be nice to do it all in the threaded irq.

Tomi