Re: [PATCH v0] kernfs: Skip kernfs_put of parent from child node

From: Gaurav Kohli
Date: Mon Apr 08 2019 - 06:16:59 EST



On 4/5/2019 6:01 PM, Mukesh Ojha wrote:

On 4/5/2019 5:40 PM, Greg KH wrote:
On Fri, Apr 05, 2019 at 05:13:00PM +0530, Gaurav Kohli wrote:
On 4/5/2019 5:03 PM, Greg KH wrote:
On Fri, Apr 05, 2019 at 04:51:07PM +0530, Gaurav Kohli wrote:
While adding kernfs node for child to the parent kernfs
node and when child node founds that parent kn count is
zero, then below comes like:

WARNING: fs/kernfs/dir.c:494 kernfs_get+0x64/0x88

This indicates that parent is in kernfs_put path/ or already
freed, and if the child node keeps continue to
make new kernfs node, then there is chance of
below race for parent node:

CPU0ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ CPU1
//Parent nodeÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ //child node
kernfs_put
atomic_dec_and_test(&kn->count)
//count is 0, so continue
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ kernfs_new_node(child)
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ kernfs_get(parent);
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ //increment parent count to 1
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ //warning come as parent count is 0
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ /* link in */
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ kernfs_add_one(kn);
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ // this should fail as parent is
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ //in free path.
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ kernfs_put(child)
kmem_cache_free(parent)
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ kmem_cache_free(child)
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ kn = parent
atomic_dec_and_test(&kn->count))
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ //this is 0 now, so release will
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ continue for parent.
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ kmem_cache_free(parent)

To prevent this race, child simply has to decrement count of parent
kernfs node and keep continue the free path for itself.

Signed-off-by: Gaurav Kohli <gkohli@xxxxxxxxxxxxxx>
Signed-off-by: Mukesh Ojha <mojha@xxxxxxxxxxxxxx>

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index b84d635..d5a36e8 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -515,7 +515,6 @@ void kernfs_put(struct kernfs_node *kn)
ÂÂÂÂÂÂ if (!kn || !atomic_dec_and_test(&kn->count))
ÂÂÂÂÂÂÂÂÂÂ return;
ÂÂÂÂÂÂ root = kernfs_root(kn);
- repeat:
ÂÂÂÂÂÂ /*
ÂÂÂÂÂÂÂ * Moving/renaming is always done while holding reference.
ÂÂÂÂÂÂÂ * kn->parent won't change beneath us.
@@ -545,8 +544,8 @@ void kernfs_put(struct kernfs_node *kn)
ÂÂÂÂÂÂ kn = parent;
ÂÂÂÂÂÂ if (kn) {
-ÂÂÂÂÂÂÂ if (atomic_dec_and_test(&kn->count))
-ÂÂÂÂÂÂÂÂÂÂÂ goto repeat;
+ÂÂÂ /* Parent may be on free path, so simply decrement the count */
That's the wrong indentation :(

And how are you hitting this issue? What user of kernfs is causing
this?
Hi Greg,

Thanks, will fix comment indentation, seen during sys-executor running:

We have only one instance , In logs below warning also came:

WARNING: CPU: 4 kernel/msm-4.14/fs/kernfs/dir.c:494 kernfs_get+0x64/0x88

which indicated parent is in put path.

[Â 160.125151] Disabling lock debugging due to kernel taint
[Â 160.130626] INFO: Allocated in __kernfs_new_node+0x8c/0x3c0 age=11 cpu=2
pid=7098
[Â 160.138314] ÂÂÂ kmem_cache_alloc+0x358/0x388
[Â 160.142445] ÂÂÂ __kernfs_new_node+0x8c/0x3c0
[Â 160.146590] ÂÂÂ kernfs_new_node+0x80/0xc8
[Â 160.150462] ÂÂÂ kernfs_create_dir_ns+0x44/0xfc
[Â 160.154777] ÂÂÂ sysfs_create_dir_ns+0xa8/0x130
[Â 160.158416] CPU5: update max cpu_capacity 1024
[Â 160.159085] ÂÂÂ kobject_add_internal+0x278/0x650
[Â 160.163567] ÂÂÂ kobject_add_varg+0xe0/0x130
[Â 160.167606] ÂÂÂ kobject_add+0x15c/0x1d0
[Â 160.168452] CPU5: update max cpu_capacity 780
[Â 160.171287] ÂÂÂ get_device_parent+0x2d0/0x34c
[Â 160.175510] ÂÂÂ device_add+0x240/0xde0
[Â 160.178371] CPU6: update max cpu_capacity 916
[Â 160.179108] ÂÂÂ input_register_device+0x5f4/0xa0c
[Â 160.183686] ÂÂÂ uinput_ioctl_handler+0x1184/0x2198
[Â 160.202436] INFO: Freed in kernfs_put+0x2c8/0x434 age=14 cpu=0 pid=7096
[Â 160.209230] ÂÂÂ kernfs_put+0x2c8/0x434
[Â 160.212825] ÂÂÂ kobject_del+0x50/0xcc
[Â 160.216332] ÂÂÂ cleanup_glue_dir+0x124/0x16c
[Â 160.220456] ÂÂÂ device_del+0x55c/0x5c8
[Â 160.224047] ÂÂÂ __input_unregister_device+0x274/0x2a8
[Â 160.228974] ÂÂÂ input_unregister_device+0x90/0xd0
[Â 160.233553] ÂÂÂ uinput_destroy_device+0x15c/0x1dc
[Â 160.238131] ÂÂÂ uinput_release+0x44/0x5c
[Â 160.241898] ÂÂÂ __fput+0x1f4/0x4e4
[Â 160.245127] ÂÂÂ ____fput+0x20/0x2c


during code review, I have found race between kernfs parent put call and
child get call.
So this is a sysfs usage of this?

yes

ÂÂ Using input devices or cpu devices
for the stress test?

input devices ..

[ 1714.090310] input: syz1 as /devices/virtual/input/input191
[ 1714.223037] input: syz1 as /devices/virtual/input/input192

..

[ 1714.428228] input: syz1 as /devices/virtual/input/input193

..
[ 1714.528256] input: syz1 as /devices/virtual/input/input194
..

[ 1714.756481] input: syz1 as /devices/virtual/input/input195
..
[ 1714.831920] input: syz1 as /devices/virtual/input/input196

..

Cheers,
Mukesh


Hi Greg, Tejun,

With earlier patch that i have posted, there is chance of memory leak for parent, if below case occurs:

Add parent

Add child

put parent

put child -> this will skip the freeing of parent node with above patch.

So to avoid race mentioned in earlier mail, creation of kernfs should not proceed, if count of parent is zero, Instead

of warning, we should return from that place.

Can you please check below patch for the same, or please let us know some other way to fix the race.


diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 89d1dc1..c41085a 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -676,6 +676,11 @@ struct kernfs_node *kernfs_new_node(struct kernfs_node *parent,
Â{
ÂÂÂÂÂÂÂ struct kernfs_node *kn;

+ÂÂÂÂÂÂ if (!atomic_read(&parent->count)) {
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂ WARN_ON(1);
+ÂÂÂÂÂÂÂÂÂÂÂÂÂÂ return NULL;
+ÂÂÂÂÂÂ }
+
ÂÂÂÂÂÂÂ kn = __kernfs_new_node(kernfs_root(parent), name, mode, flags);
ÂÂÂÂÂÂÂ if (kn) {
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ kernfs_get(parent);


Regards

Gaurav




thanks,

greg k-h

--
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center,
Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.