Re: [PATCH 05/15] drm/panfrost: use spinlock instead of atomic

From: Steven Price
Date: Fri May 29 2020 - 08:47:22 EST


On 29/05/2020 13:35, Clément Péron wrote:
Hi Robin,

On Fri, 29 May 2020 at 14:20, Robin Murphy <robin.murphy@xxxxxxx> wrote:

On 2020-05-10 17:55, Clément Péron wrote:
Convert busy_count to a simple int protected by spinlock.

A little more reasoning might be nice.

I have followed the modification requested for lima devfreq and don't
have any argument of my own for switching to a spinlock.

The Lima maintainer asked for the change, giving the following reason:
"Better make this count a normal int which is also protected by the spinlock,
because current implementation can't protect atomic ops for state change
and busy idle check and we are using spinlock already"


Signed-off-by: Clément Péron <peron.clem@xxxxxxxxx>
---
[...]
diff --git a/drivers/gpu/drm/panfrost/panfrost_devfreq.h b/drivers/gpu/drm/panfrost/panfrost_devfreq.h
index 0697f8d5aa34..e6629900a618 100644
--- a/drivers/gpu/drm/panfrost/panfrost_devfreq.h
+++ b/drivers/gpu/drm/panfrost/panfrost_devfreq.h
@@ -4,6 +4,7 @@
#ifndef __PANFROST_DEVFREQ_H__
#define __PANFROST_DEVFREQ_H__

+#include <linux/spinlock.h>
#include <linux/ktime.h>

struct devfreq;
@@ -14,10 +15,17 @@ struct panfrost_device;
struct panfrost_devfreq {
struct devfreq *devfreq;
struct thermal_cooling_device *cooling;
+
ktime_t busy_time;
ktime_t idle_time;
ktime_t time_last_update;
- atomic_t busy_count;
+ int busy_count;
+ /*
+ * Protect busy_time, idle_time, time_last_update and busy_count
+ * because these can be updated concurrently, for example by the GP
+ * and PP interrupts.
+ */

Nit: this comment is clearly wrong, since we only have Job, GPU and MMU
interrupts here. I guess if there is a race it would be between
submission/completion/timeout on different job slots.

It's copy/pasted from lima; I will update it.

Lima ('Utgard') has separate units for geometry and pixel processing (GP/PP). For Panfrost ('Midgard'/'Bifrost') we don't have that separation; however, there are multiple job slots, which are implemented as multiple DRM schedulers. So the same fix is appropriate, but clearly I missed this comment because it's referring to GP/PP, which don't exist for Midgard/Bifrost.


Given that, should this actually be considered a fix for 9e62b885f715
("drm/panfrost: Simplify devfreq utilisation tracking")?

I can't say whether it can be considered a fix; I didn't see any
improvement on my board with this patch applied.
I'm still facing some issues and haven't had time to fully investigate them.

Technically this is a fix - there's a small race which could cause the devfreq information to become corrupted. However it would resolve itself on the next devfreq interval when panfrost_devfreq_reset() is called. So the impact is very minor (devfreq gets some bogus figures). The important variable (busy_count) was already an atomic so won't be affected.
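
To illustrate the locking scheme being discussed, here is a minimal userspace sketch, not the panfrost code itself. It assumes a pthread mutex as a stand-in for the kernel spinlock, plain nanosecond counters in place of ktime_t, and hypothetical names (struct busy_tracker and the tracker_* helpers). The point is that the time accounting and the busy_count change happen inside one critical section, so an update from another job slot cannot interleave between them.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the devfreq bookkeeping fields. */
struct busy_tracker {
	pthread_mutex_t lock;      /* stands in for the kernel spinlock */
	int64_t busy_ns;           /* accumulated time with busy_count > 0 */
	int64_t idle_ns;           /* accumulated time with busy_count == 0 */
	int64_t last_update_ns;    /* timestamp of the last accounting step */
	int busy_count;            /* plain int, only touched under the lock */
};

void tracker_init(struct busy_tracker *t, int64_t now_ns)
{
	pthread_mutex_init(&t->lock, NULL);
	t->busy_ns = 0;
	t->idle_ns = 0;
	t->last_update_ns = now_ns;
	t->busy_count = 0;
}

/* Caller must hold t->lock: charge the elapsed time to busy or idle. */
static void tracker_update_locked(struct busy_tracker *t, int64_t now_ns)
{
	int64_t delta = now_ns - t->last_update_ns;

	if (t->busy_count > 0)
		t->busy_ns += delta;
	else
		t->idle_ns += delta;
	t->last_update_ns = now_ns;
}

/* A job starts on some slot: account elapsed time, then bump the count. */
void tracker_record_busy(struct busy_tracker *t, int64_t now_ns)
{
	pthread_mutex_lock(&t->lock);
	tracker_update_locked(t, now_ns);
	t->busy_count++;
	pthread_mutex_unlock(&t->lock);
}

/* A job completes or times out: account elapsed time, then drop the count. */
void tracker_record_idle(struct busy_tracker *t, int64_t now_ns)
{
	pthread_mutex_lock(&t->lock);
	tracker_update_locked(t, now_ns);
	assert(t->busy_count > 0); /* sanity check: idle without a matching busy */
	t->busy_count--;
	pthread_mutex_unlock(&t->lock);
}

/* Start a new sampling interval; busy_count is deliberately left alone. */
void tracker_reset(struct busy_tracker *t, int64_t now_ns)
{
	pthread_mutex_lock(&t->lock);
	t->busy_ns = 0;
	t->idle_ns = 0;
	t->last_update_ns = now_ns;
	pthread_mutex_unlock(&t->lock);
}
```

With only an atomic counter and no lock, two slots could both read last_update_ns before either writes it back, which is the kind of transient corruption of the busy/idle figures described above.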

Steve

Thanks for your review,



Robin.