Re: [PATCH] kobject: Make sure the parent does not get released before its children

From: Greg Kroah-Hartman
Date: Sun May 24 2020 - 09:14:21 EST


On Sun, May 24, 2020 at 02:57:27PM +0200, Greg Kroah-Hartman wrote:
> On Sat, May 23, 2020 at 08:44:06AM -0700, Randy Dunlap wrote:
> > On 5/23/20 8:36 AM, Greg Kroah-Hartman wrote:
> > > On Wed, May 13, 2020 at 06:18:40PM +0300, Heikki Krogerus wrote:
> > >> In the function kobject_cleanup(), kobject_del(kobj) is
> > >> called before the kobj->release(). That makes it possible to
> > >> release the parent of the kobject before the kobject itself.
> > >>
> > >> To fix that, add a function __kobject_del() that does
> > >> everything that kobject_del() does except release the parent
> > >> reference. kobject_cleanup() then calls __kobject_del()
> > >> instead of kobject_del(), and separately decrements the
> > >> reference count of the parent kobject after kobj->release()
> > >> has been called.
> > >>
> > >> Reported-by: Naresh Kamboju <naresh.kamboju@xxxxxxxxxx>
> > >> Reported-by: kernel test robot <rong.a.chen@xxxxxxxxx>
> > >> Fixes: 7589238a8cf3 ("Revert "software node: Simplify software_node_release() function"")
> > >> Suggested-by: "Rafael J. Wysocki" <rafael@xxxxxxxxxx>
> > >> Signed-off-by: Heikki Krogerus <heikki.krogerus@xxxxxxxxxxxxxxx>
> > >> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> > >> Reviewed-by: Brendan Higgins <brendanhiggins@xxxxxxxxxx>
> > >> Tested-by: Brendan Higgins <brendanhiggins@xxxxxxxxxx>
> > >> Acked-by: Randy Dunlap <rdunlap@xxxxxxxxxxxxx>
> > >> ---
> > >> lib/kobject.c | 30 ++++++++++++++++++++----------
> > >> 1 file changed, 20 insertions(+), 10 deletions(-)
> > >
> > > Stepping back, now that it turns out this patch causes more problems
> > > than it fixes, how is everyone reproducing the original crash here?
> >
> > Just load lib/test_printf.ko and boom!
>
> Thanks, that helps.
>
> Ok, in messing around with the kobject core more, originally we thought
> this was an issue of the kobject uevent happening for the parent pointer
> (when the parent was invalid). So, moving things around some more, and
> now I'm crashing in software_node_release() when we are trying to access
> swnode->parent->child_ids as parent is invalid there.
>
> So I feel like this is a swnode bug, or a use of swnode in a way it
> shouldn't be that the testing framework is exposing somehow.
>
> Let me dig deeper...

Ah, ick, static software nodes trying to be cleaned up in the totally
wrong order. You can't just try to randomly clean up a kobject anywhere
in the middle of the hierarchy, that's flat out not going to work
properly. Let me unwind it...


greg k-h
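
For anyone following along, here is a rough userspace sketch of the
ordering problem discussed in this thread. It is not kernel code and not
the patch itself: struct node, node_put(), node_release() and
run_teardown() are all made-up names for illustration, and the
nr_children counter only loosely stands in for the
swnode->parent->child_ids access mentioned above. The nodes live on the
stack and are only flagged as released, never freed, so the "bad" order
can be observed safely instead of crashing.

```c
#include <assert.h>

/* Toy stand-in for a kobject (hypothetical, not the kernel structure). */
struct node {
	int refcount;
	struct node *parent;
	int nr_children;	/* updated by children from their release path */
	int released;		/* set once release has run */
};

/* Counts release calls that found their parent already released. */
static int ordering_violations;

static void node_put(struct node *n, int parent_first);

static void node_init(struct node *n, struct node *parent)
{
	n->refcount = 1;
	n->parent = parent;
	n->nr_children = 0;
	n->released = 0;
	if (parent) {
		parent->refcount++;	/* the child pins its parent */
		parent->nr_children++;
	}
}

/* Release touches the parent, so the parent must still be live here. */
static void node_release(struct node *n)
{
	if (n->parent) {
		if (n->parent->released)
			ordering_violations++;	/* would be a use-after-free */
		else
			n->parent->nr_children--;
	}
	n->released = 1;
}

/*
 * parent_first != 0: drop the parent reference before release (old order).
 * parent_first == 0: run release first, then drop the parent reference.
 */
static void node_cleanup(struct node *n, int parent_first)
{
	struct node *parent = n->parent;

	if (parent_first && parent)
		node_put(parent, parent_first);

	node_release(n);

	if (!parent_first && parent)
		node_put(parent, parent_first);
}

static void node_put(struct node *n, int parent_first)
{
	assert(n->refcount > 0);
	if (--n->refcount == 0)
		node_cleanup(n, parent_first);
}

/* Builds parent + child, tears them down, returns the violation count. */
int run_teardown(int parent_first)
{
	struct node parent, child;

	ordering_violations = 0;
	node_init(&parent, 0);
	node_init(&child, &parent);

	node_put(&parent, parent_first);	/* creator's ref; child still pins it */
	node_put(&child, parent_first);		/* last ref on child triggers cleanup */

	return ordering_violations;
}
```

In this toy model run_teardown(1) reports one violation, because the
parent is released before the child's release callback looks at it, and
run_teardown(0) reports none. It only models the refcount ordering; it
says nothing about uevents, sysfs removal, or static software nodes
being unregistered from the middle of a hierarchy.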