Re: [PATCH v3 11/14] perf/hw_breakpoint: Reduce contention with large number of tasks

From: Marco Elver
Date: Mon Aug 29 2022 - 05:39:05 EST


On Mon, 29 Aug 2022 at 10:38, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Wed, Aug 17, 2022 at 03:14:54PM +0200, Marco Elver wrote:
> > On Wed, 17 Aug 2022 at 15:03, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > >
> > > On Mon, Jul 04, 2022 at 05:05:11PM +0200, Marco Elver wrote:
> > > > +static bool bp_constraints_is_locked(struct perf_event *bp)
> > > > +{
> > > > +	struct mutex *tsk_mtx = get_task_bps_mutex(bp);
> > > > +
> > > > +	return percpu_is_write_locked(&bp_cpuinfo_sem) ||
> > > > +	       (tsk_mtx ? mutex_is_locked(tsk_mtx) :
> > > > +			  percpu_is_read_locked(&bp_cpuinfo_sem));
> > > > +}
> > >
> > > > @@ -426,18 +521,28 @@ static int modify_bp_slot(struct perf_event *bp, u64 old_type, u64 new_type)
> > > >   */
> > > >  int dbg_reserve_bp_slot(struct perf_event *bp)
> > > >  {
> > > > -	if (mutex_is_locked(&nr_bp_mutex))
> > > > +	int ret;
> > > > +
> > > > +	if (bp_constraints_is_locked(bp))
> > > >  		return -1;
> > > >
> > > > -	return __reserve_bp_slot(bp, bp->attr.bp_type);
> > > > +	/* Locks aren't held; disable lockdep assert checking. */
> > > > +	lockdep_off();
> > > > +	ret = __reserve_bp_slot(bp, bp->attr.bp_type);
> > > > +	lockdep_on();
> > > > +
> > > > +	return ret;
> > > >  }
> > > >
> > > >  int dbg_release_bp_slot(struct perf_event *bp)
> > > >  {
> > > > -	if (mutex_is_locked(&nr_bp_mutex))
> > > > +	if (bp_constraints_is_locked(bp))
> > > >  		return -1;
> > > >
> > > > +	/* Locks aren't held; disable lockdep assert checking. */
> > > > +	lockdep_off();
> > > >  	__release_bp_slot(bp, bp->attr.bp_type);
> > > > +	lockdep_on();
> > > >
> > > >  	return 0;
> > > >  }
> > >
> > > Urggghhhh... this is horrible crap. That is, the current code is that
> > > and this makes it worse :/
> >
> > Heh, yes and when I looked at it I really wanted to see if it can
> > change. But from what I can tell, when the kernel debugger is being
> > attached, the kernel does stop everything it does and we need the
> > horrible thing above to not deadlock. And these dbg_ functions are not
> > normally used, so I decided to leave it as-is. Suggestions?
>
> What context is this run in? NMI should already have lockdep disabled.

kgdb can enter via kgdb_nmicall*(), but also via
kgdb_handle_exception(), which is not an NMI entry point.
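
For anyone following along, here is a rough userspace sketch of the "bail out if the lock is held, otherwise proceed without taking it" pattern the dbg_ functions rely on. It is only an analogy and not the kernel code: struct slot_table, table_is_locked() and dbg_reserve_slot_sketch() are made-up names, and pthread_mutex_trylock() stands in for mutex_is_locked(), which has no direct pthread equivalent. The check-then-modify sequence is racy in general; it is only tolerable under the assumption that everything else is stopped while the debugger runs.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the breakpoint slot accounting. */
struct slot_table {
	pthread_mutex_t lock;	/* normally protects used/max */
	int used;
	int max;
};

/*
 * Emulates mutex_is_locked(): try to take the lock and drop it again
 * right away. A failed trylock means somebody currently holds it.
 */
static bool table_is_locked(struct slot_table *t)
{
	if (pthread_mutex_trylock(&t->lock) == 0) {
		pthread_mutex_unlock(&t->lock);
		return false;
	}
	return true;
}

/*
 * Debugger-style reservation: never block. If the lock is held, the
 * holder is assumed to be stopped, so waiting would deadlock; give up
 * with -1 instead. Otherwise update the table without holding the lock,
 * which assumes nothing else can run concurrently.
 */
static int dbg_reserve_slot_sketch(struct slot_table *t)
{
	assert(t != NULL);

	if (table_is_locked(t))
		return -1;

	if (t->used >= t->max)
		return -ENOSPC;

	t->used++;
	return 0;
}
```

With a table that has one free slot, the first call succeeds, the second returns -ENOSPC, and any call made while the mutex is held returns -1 rather than blocking.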