Re: [PATCH] kobject: Make sure the parent does not get released before its children

From: Randy Dunlap
Date: Sat May 23 2020 - 11:44:25 EST


On 5/23/20 8:36 AM, Greg Kroah-Hartman wrote:
> On Wed, May 13, 2020 at 06:18:40PM +0300, Heikki Krogerus wrote:
>> In the function kobject_cleanup(), kobject_del(kobj) is
>> called before the kobj->release(). That makes it possible to
>> release the parent of the kobject before the kobject itself.
>>
>> To fix that, add a function __kobject_del() that does
>> everything that kobject_del() does except release the parent
>> reference. kobject_cleanup() then calls __kobject_del()
>> instead of kobject_del(), and separately decrements the
>> reference count of the parent kobject after kobj->release()
>> has been called.
>>
>> Reported-by: Naresh Kamboju <naresh.kamboju@xxxxxxxxxx>
>> Reported-by: kernel test robot <rong.a.chen@xxxxxxxxx>
>> Fixes: 7589238a8cf3 ("Revert "software node: Simplify software_node_release() function"")
>> Suggested-by: "Rafael J. Wysocki" <rafael@xxxxxxxxxx>
>> Signed-off-by: Heikki Krogerus <heikki.krogerus@xxxxxxxxxxxxxxx>
>> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
>> Reviewed-by: Brendan Higgins <brendanhiggins@xxxxxxxxxx>
>> Tested-by: Brendan Higgins <brendanhiggins@xxxxxxxxxx>
>> Acked-by: Randy Dunlap <rdunlap@xxxxxxxxxxxxx>
>> ---
>> lib/kobject.c | 30 ++++++++++++++++++++----------
>> 1 file changed, 20 insertions(+), 10 deletions(-)
>
> Stepping back, now that it turns out this patch causes more problems
> than it fixes, how is everyone reproducing the original crash here?

Just load lib/test_printf.ko and boom!
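
For reference, the ordering problem described in the quoted changelog can
be modeled outside the kernel. The following is only a toy sketch, not the
code in lib/kobject.c: struct toy_kobj and the toy_*() helpers are
hypothetical names made up for illustration, and the objects live on the
stack so nothing is actually freed. It records whether the parent was still
unreleased at the moment the child's release step ran, once with the parent
reference dropped first (old order) and once with it dropped afterwards
(the order the patch aims for).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kobject; not the kernel structure. */
struct toy_kobj {
	int refcount;
	struct toy_kobj *parent;
	int released;                 /* set once the release step has run */
	int parent_alive_at_release;  /* 1 if the parent was not yet released */
};

static void toy_put(struct toy_kobj *k, int fixed_order);

/* Models kobj->release(): note whether the parent is still around. */
static void toy_release(struct toy_kobj *k)
{
	k->parent_alive_at_release = k->parent ? !k->parent->released : 1;
	k->released = 1;
}

/* Models the cleanup path that runs when the last reference is dropped. */
static void toy_cleanup(struct toy_kobj *k, int fixed_order)
{
	struct toy_kobj *parent = k->parent;

	if (!fixed_order) {
		/* Old order: the parent reference is dropped before release. */
		toy_put(parent, fixed_order);
		toy_release(k);
	} else {
		/* Patched order: release first, then drop the parent reference. */
		toy_release(k);
		toy_put(parent, fixed_order);
	}
}

static void toy_put(struct toy_kobj *k, int fixed_order)
{
	if (!k)
		return;
	assert(k->refcount > 0);
	if (--k->refcount == 0)
		toy_cleanup(k, fixed_order);
}

/* Returns 1 if the parent was still alive when the child was released. */
static int toy_run(int fixed_order)
{
	struct toy_kobj parent = { .refcount = 1 };
	struct toy_kobj child = { .refcount = 1, .parent = &parent };

	parent.refcount++;              /* the child pins its parent */
	toy_put(&parent, fixed_order);  /* creator drops its own reference */
	toy_put(&child, fixed_order);   /* last reference on the child */
	return child.parent_alive_at_release;
}
```

In this model toy_run(0) reports that the parent was already released when
the child's release step ran, while toy_run(1) reports that it was still
alive.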


> Is it just the KUNIT_DRIVER_PE_TEST that is causing the issue?
>
> In looking at 7589238a8cf3 ("Revert "software node: Simplify
> software_node_release() function""), the log messages there look
> correct. sysfs can't create a duplicate file, and so when your test is
> written to try to create software nodes, you always have to check the
> return value. If you run the test in parallel, or before another test
> has had a chance to clean up, the function will fail, correctly.
>
> So what real-world thing is this test "failure" trying to show?


--
~Randy