Re: [PATCH v2 02/12] device-core: Add dev->lock_class to enable device_lock() lockdep validation

From: Dan Williams
Date: Thu Apr 14 2022 - 15:43:52 EST


On Thu, Apr 14, 2022 at 12:34 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Thu, Apr 14, 2022 at 10:17:13AM -0700, Dan Williams wrote:
>
> > One more sanity check... So in driver subsystems there are cases where
> > a device on busA hosts a topology on busB. When that happens there's a
> > need to set the lock class late in a driver since busA knows nothing
> > about the locking rules of busB.
>
> I'll pretend I konw what you're talking about ;-)
>
> > Since the device has a longer lifetime than a driver when the driver
> > exits it must set dev->mutex back to the novalidate class, otherwise
> > it sets up a use after free of the static lock_class_key.
>
> I'm not following, static storage has infinite lifetime.

Not static storage in a driver module.

modprobe -r fancy_lockdep_using_driver.ko

Any use of device_lock() by the core on a device that a driver in this
module was driving will de-reference a now invalid pointer into
whatever memory was vmalloc'd for the module static data.

>
> > I came up with this and it seems to work, just want to make sure I'm
> > properly using the lock_set_class() API and it is ok to transition
> > back and forth from the novalidate case:
> >
> > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> > index 990b6670222e..32673e1a736d 100644
> > --- a/drivers/cxl/cxl.h
> > +++ b/drivers/cxl/cxl.h
> > @@ -405,6 +405,29 @@ struct cxl_nvdimm_bridge
> > *cxl_find_nvdimm_bridge(struct cxl_nvdimm *cxl_nvd);
> > #define __mock static
> > #endif
> >
> > +#ifdef CONFIG_PROVE_LOCKING
> > +static inline void cxl_lock_reset_class(void *_dev)
> > +{
> > + struct device *dev = _dev;
> > +
> > + lock_set_class(&dev->mutex.dep_map, "__lockdep_no_validate__",
> > + &__lockdep_no_validate__, 0, _THIS_IP_);
> > +}
> > +
> > +static inline int cxl_lock_set_class(struct device *dev, const char *name,
> > + struct lock_class_key *key)
> > +{
> > + lock_set_class(&dev->mutex.dep_map, name, key, 0, _THIS_IP_);
> > + return devm_add_action_or_reset(dev, cxl_lock_reset_class, dev);
> > +}
> > +#else
> > +static inline int cxl_lock_set_class(struct device *dev, const char *name,
> > + struct lock_class_key *key)
> > +{
> > + return 0;
> > +}
> > +#endif
>
> Under the assumption that the lock is held (lock_set_class() will
> actually barf if @lock isn't held) this should indeed work as expected

Nice.

> (although I think you got the @name part 'wrong', I think that's
> canonically something like "&dev->mutex" or something).

Ah, yes, I got it wrong for restoring the same name that results from
lockdep_set_novalidate_class(), will fix.