Re: [PATCH RFC] ext4: fix potential race between online resizing and write operations

From: Joel Fernandes
Date: Mon Feb 24 2020 - 20:48:55 EST


[..]
> > > > debug_objects bits wouldn't work obviously for the !emergency kvfree case,
> > > > not sure what we can do there.
> > > >
> > > Agree.
> > >
> > > Thank you, Joel, for your comments!
> >
> > No problem, I think we have a couple of ideas here.
> >
> > What I also wanted to do was (may be after all this), see if we can create an
> > API for head-less kfree based on the same ideas. Not just for arrays for for
> > any object. Calling it, say, kfree_rcu_headless() and then use the bulk array
> > as we have been doing. That would save any users from having an rcu_head --
> > of course with all needed warnings about memory allocation failure. Vlad,
> > What do you think? Paul, any thoughts on this?
> >
> I like it. It would be more clean interface. Also there are places where
> people do not embed the rcu_head into their stuctures for some reason
> and do like:
>
>
> <snip>
> synchronize_rcu();
> kfree(p);
> <snip>
>
> <snip>
> urezki@pc636:~/data/ssd/coding/linux-rcu$ find ./ -name "*.c" | xargs grep -C 1 -rn "synchronize_rcu" | grep kfree
> ./arch/x86/mm/mmio-mod.c-314- kfree(found_trace);
> ./kernel/module.c-3910- kfree(mod->args);
> ./kernel/trace/ftrace.c-5078- kfree(direct);
> ./kernel/trace/ftrace.c-5155- kfree(direct);
> ./kernel/trace/trace_probe.c-1087- kfree(link);
> ./fs/nfs/sysfs.c-113- kfree(old);
> ./fs/ext4/super.c-1701- kfree(old_qname);
> ./net/ipv4/gre.mod.c-36- { 0xfc3fcca2, "kfree_skb" },
> ./net/core/sysctl_net_core.c-143- kfree(cur);
> ./drivers/crypto/nx/nx-842-pseries.c-1010- kfree(old_devdata);
> ./drivers/misc/vmw_vmci/vmci_context.c-692- kfree(notifier);
> ./drivers/misc/vmw_vmci/vmci_event.c-213- kfree(s);
> ./drivers/infiniband/core/device.c:2162: * synchronize_rcu before the netdev is kfreed, so we
> ./drivers/infiniband/hw/hfi1/sdma.c-1337- kfree(dd->per_sdma);
> ./drivers/net/ethernet/myricom/myri10ge/myri10ge.c-3582- kfree(mgp->ss);
> ./drivers/net/ethernet/myricom/myri10ge/myri10ge.mod.c-156- { 0x37a0cba, "kfree" },
> ./drivers/net/ethernet/mellanox/mlx5/core/fpga/tls.c:286: synchronize_rcu(); /* before kfree(flow) */
> ./drivers/net/ethernet/mellanox/mlxsw/core.c-1504- kfree(rxl_item);
> ./drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c-6648- kfree(adapter->mbox_log);
> ./drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c-6650- kfree(adapter);
> ./drivers/block/drbd/drbd_receiver.c-3804- kfree(old_net_conf);
> ./drivers/block/drbd/drbd_receiver.c-4176- kfree(old_disk_conf);
> ./drivers/block/drbd/drbd_state.c-2074- kfree(old_conf);
> ./drivers/block/drbd/drbd_nl.c-1689- kfree(old_disk_conf);
> ./drivers/block/drbd/drbd_nl.c-2522- kfree(old_net_conf);
> ./drivers/block/drbd/drbd_nl.c-2935- kfree(old_disk_conf);
> ./drivers/mfd/dln2.c-178- kfree(i);
> ./drivers/staging/fwserial/fwserial.c-2122- kfree(peer);
> <snip>

Wow that's pretty amazing, looks like could be very useful.

Do you want to continue your patch and then post it, and we can discuss it?
Or happy to take it as well.

We could split work into 3 parts:
1. Make changes for 2 separate per-CPU arrays. One for vfree and one for kfree.
2. Add headless support for both as alternative API.
3. Handle the low memory case somehow (I'll reply to the other thread).

May be we can start with 1. where you can clean up your patch and post, and
then I/we can work based on that?

thanks,

- Joel