Re: [PATCH RFC] drivers/core: Replace lockdep_set_novalidate_class() with unique class keys

From: Alan Stern
Date: Sun Feb 12 2023 - 15:19:22 EST


On Sun, Feb 12, 2023 at 02:14:02PM -0500, Kent Overstreet wrote:
> On Sun, Feb 12, 2023 at 10:23:44AM -0500, Alan Stern wrote:
> > On Sat, Feb 11, 2023 at 10:10:23PM -0500, Kent Overstreet wrote:
> > > So IMO the more correct way to do this would be to change
> > > device_initialize() to __device_initialize(), have it take a
> > > lock_class_key as a parameter, and then use __mutex_init() instead of
> > > mutex_init().
> >
> > Yes, maybe. The increase in code size might outweigh the space saved.
>
> Where would the increase in code size come from?

Maybe I didn't understand your suggestion. Did you mean that all
callers of device_initialize() (or whatever) should be able to specify
their own lock_class_key? Or were you just trying to avoid the single
static allocation of a lock_class_key in device_initialize() done as a
side-effect of the mutex_init() call?

> And whatever effect
> would only be when lockdep is enabled, so not a concern.

Not if a new function is created (i.e., __device_initialize()).

> But consider that the name of a lock as registered with lockdep really
> should correspond to a source code location - i.e. it should be
> something you can grep for. (We should probably consider adding file and
> line number to that string, since current names are not unambiguous).
>
> Whereas in your pass, you're calling lockdep_set_class(), which
> generates a class name via stringifcation - with the same name every
> time.
>
> Definitely _don't_ do that. With your patch, the generated lockdep
> traces will be useless.

I'll revise the patch to use the device's name for the class. While it
may not be globally unique, it should be sufficiently specific.

(Device names are often set after the device is initialized. Does
lockdep mind if a lock_class_key's name is changed after it has been
registered?)

> > > But let's think about this more. Will there ever be situations where
> > > lock ordering is dependent on what hardware is plugged into what, or
> > > what hardware is plugged in first?
> >
> > Device lock ordering is always dependent on what hardware is plugged
> > into what. However, I'm not aware of any situations where, given two
> > different kinds of hardware, either one could be plugged into the other.
> > Such things may exist but I can't think of any examples.
>
> Different brands of hubs?

As far as the kernel is concerned they wouldn't be _different kinds_ of
hardware; they would both be hubs.

> Lots of things have hubs embedded into them these days. 15 years ago I
> had an Apple keyboard with an embedded hub.

Apple keyboards get treated as two logically separate pieces of
hardware: a hub and a keyboard. The fact that they are packaged as a
single unit is irrelevant.

> > On the other hand, there are obvious cases where two pieces of the
> > _same_ kind of hardware can be plugged together in either order. USB
> > hubs are a good example.
> >
> > Consider the possibility that a driver might want to lock all of a
> > device's children at once. (I don't know if this ever happens, but it
> > might.) Provided it acquires the parent device's lock first, this is
> > utterly safe no matter what order the children are locked in. Try
> > telling that to lockdep! In the end, we'd probably have to enforce a
> > single fixed ordering.
>
> The way you speak of hypotheticals and possibilities makes me think that
> the actual locking rules are not ironed out at all :)

You're right. There are no explicitly documented rules for device
locking as far as I'm aware. Each subsystem handles its own locking
independently (but self-consistently, we hope). That's one of the
reasons that creating lockdep rules for devices is so difficult.

The business about not locking a parent if you already own the child's
lock is perhaps the only universal -- and I don't know that it's written
down anywhere.

> The patch I posted would be an improvement over the current situation,
> because it'd get you checking w.r.t. other lock types - but with that
> you would still have to have your own deadlock avoidance strategy, and
> you'd have to be _really_ clear on what it is and how you know that
> you're getting it right - you're still opting out of checking.

Same with the patch I posted, except that it opts back into checking.

> I think you should really be investigating wait/wound mutexes here.

At this stage, converting would be most impractical. And I don't think
it's really needed.

Alan Stern