Re: [PATCH v2 20/20] jump_label: RFC - tolerate toggled state

From: Jason Baron
Date: Thu Feb 16 2023 - 12:21:35 EST




On 1/17/23 6:57 AM, Peter Zijlstra wrote:
On Fri, Jan 13, 2023 at 12:30:16PM -0700, Jim Cromie wrote:
__jump_label_patch currently will "crash the box" if it finds a
jump_entry not as expected. ISTM this overly harsh; it doesn't
distinguish between "alternate/opposite" state, and truly
"insane/corrupted".

The "opposite" (but well-formed) state is a milder mis-initialization
problem, and some less severe mitigation seems practical. ATM this
just warns about it; a range/enum of outcomes: warn, crash, silence,
ok, fixup-continue, etc, are possible on a case-by-case basis.

Ive managed to create this mis-initialization condition with
test_dynamic_debug.ko & _submod.ko. These replicate DRM's regression
on DRM_USE_DYNAMIC_DEBUG=y; drm.debug callsites in drivers/helpers
(dependent modules) are not enabled along with those in drm.ko itself.


Ive hit this case a few times, but havent been able to isolate the
when and why.

warn-only is something of a punt, and I'm still left with remaining
bugs which are likely related; I'm able to toggle the p-flag on
callsites in the submod, but their enablement still doesn't yield
logging activity.

Right; having been in this is state is bad since it will generate
inconsistent code-flow. Full on panic *might* not be warranted (as it
does for corrupted text) but it is still a fairly bad situation -- so
I'm not convinced we want to warn and carry on.

It would be really good to figure out why the site was skipped over and
got out of skew.

Given it's all module stuff, the 'obvious' case would be something like
a race between adding the new sites and flipping it, but I'm not seeing
how -- things are rather crudely serialized by jump_label_mutex.

Indeed, looks like dynamic debug introduces a new path in this series to that tries to toggle the jump label sites before they have been initialized, which ends up with the jump label key enabled but only some of the sites in the correct state. Then when the key is subsequently disabled it finds some in the wrong state. I just posted a patch for dynamic debug to use the module callback notifiers so it's ordered properly against jump label.

Note this isn't an issue in the current tree b/c there is no path to toggle the sites currently before they are initialized, but Jim's series here adds such a path.

Thanks,

-Jason



The only other option I can come up with is that somehow the update
condition in jump_label_add_module() is somehow wrong.