Re: [PATCH] interconnect: avoid memory allocation when 'icc_bw_lock' is held
From: Johan Hovold
Date: Wed Jun 18 2025 - 08:51:01 EST
[ +CC: Rob ]
On Tue, Jun 03, 2025 at 10:01:31AM +0000, Bryan O'Donoghue wrote:
> On 03/06/2025 10:15, Gabor Juhos wrote:
> > 2025. 05. 30. 11:16 keltezéssel, Bryan O'Donoghue írta:
> >> On 29/05/2025 15:46, Gabor Juhos wrote:
> >>> The 'icc_bw_lock' mutex was introduced in commit af42269c3523
> >>> ("interconnect: Fix locking for runpm vs reclaim") in order to
> >>> decouple the serialization of bandwidth aggregation from codepaths
> >>> that require memory allocation.
> >>>
> >>> However, commit d30f83d278a9 ("interconnect: core: Add dynamic
> >>> id allocation support") added a devm_kasprintf() call into a path
> >>> protected by 'icc_bw_lock', which causes this lockdep warning (at
> >>> least on the IPQ9574 platform):
> >>> Move the memory allocation part of the code outside of the
> >>> protected path to eliminate the warning. Also add a comment
> >>> explaining why the allocation is done there.
> >>> @@ -1023,6 +1023,16 @@ void icc_node_add(struct icc_node *node, struct icc_provider *provider)
> >>> return;
> >>>
> >>> mutex_lock(&icc_lock);
> >>> +
> >>> + if (node->id >= ICC_DYN_ID_START) {
> >>> + /*
> >>> + * Memory allocation must be done outside of codepaths
> >>> + * protected by icc_bw_lock.
> >>> + */
> >>> + node->name = devm_kasprintf(provider->dev, GFP_KERNEL, "%s@%s",
> >>> + node->name, dev_name(provider->dev));
> >>> + }
> >>> +
> >>> mutex_lock(&icc_bw_lock);
> >>>
> >>> node->provider = provider;
> >>> @@ -1038,10 +1048,6 @@ void icc_node_add(struct icc_node *node, struct icc_provider *provider)
> >>> node->avg_bw = node->init_avg;
> >>> node->peak_bw = node->init_peak;
> >>>
> >>> - if (node->id >= ICC_DYN_ID_START)
> >>> - node->name = devm_kasprintf(provider->dev, GFP_KERNEL, "%s@%s",
> >>> - node->name, dev_name(provider->dev));
> >>> -
> >>> if (node->avg_bw || node->peak_bw) {
> >>> if (provider->pre_aggregate)
> >>> provider->pre_aggregate(node);
> >> The locking in this code is a mess.
> >>
> >> Which data structures does icc_lock protect - the node pointers, I
> >> think - and which data structures does icc_bw_lock protect - the
> >> "bw" fields?
> >>
> >> Hmm.
> >>
> >> Looking at this code I'm not sure at all what icc_lock was introduced to do.
> >
> > Initially, only the 'icc_lock' mutex existed, and it protected
> > 'everything'. The 'icc_bw_lock' was introduced later by commit
> > af42269c3523 ("interconnect: Fix locking for runpm vs reclaim") as
> > part of the "drm/msm+PM+icc: Make job_run() reclaim-safe" series [1].
> >
> > Here is the reason copied from the original commit message:
> >
> > "For cases where icc_bw_set() can be called in callbaths that could
> > deadlock against shrinker/reclaim, such as runpm resume, we need to
> > decouple the icc locking. Introduce a new icc_bw_lock for cases where
> > we need to serialize bw aggregation and update to decouple that from
> > paths that require memory allocation such as node/link creation/
> > destruction."
>
> Right, but reading this code:
>
> icc_set_bw():
> icc_bw_lock - locks, protects struct icc_node *
>
> icc_put():
> icc_lock - locks
> icc_bw_lock - locks directly after, protects struct icc_node *
>
> icc_node_add() currently:
> icc_lock - locks
> icc_bw_lock - locks
> node->name = devm_kasprintf();
>
> After your change:
>
> icc_node_add():
> icc_lock - locks
> node->name = devm_kasprintf();
> icc_bw_lock - locks
> owns node->provider - or whatever
>
> And this is what is prompting my question: which locks own which data
> here?
>
> I think we should sort that out, either by removing one of the locks
> or, at the very least, by documenting beside the mutex declarations
> which lock protects what.
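Such documentation could look like the sketch below. This is a kernel-style fragment, and the comment text is only an interpretation of the split described in this thread, not taken from the kernel sources:

```c
/*
 * icc_lock: protects the interconnect topology - node and link
 * creation, lookup and removal. Memory allocation is allowed while
 * holding it.
 *
 * icc_bw_lock: serializes bandwidth aggregation and updates only, so
 * that it can be taken in runpm/reclaim-sensitive paths. No memory
 * allocation may happen while it is held.
 */
static DEFINE_MUTEX(icc_lock);
static DEFINE_MUTEX(icc_bw_lock);
```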
Feel free to discuss that with Rob, who added icc_bw_lock, but it's
unrelated to the regression at hand (and should not block fixing it).

Allocations cannot be done while holding icc_bw_lock, and this fix is
correct in moving the allocation (also note that the node has not been
added yet at that point).
Johan