Re: [RFC PATCH] mm/hugetlb: make hugetlb_lock irq safe

From: Aneesh Kumar K.V
Date: Wed Sep 05 2018 - 09:26:31 EST


On 09/05/2018 06:34 PM, Matthew Wilcox wrote:
> On Wed, Sep 05, 2018 at 04:53:41PM +0530, Aneesh Kumar K.V wrote:
>> inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.

> How do you go from "can be taken in softirq context" problem report to
> "must disable hard interrupts" solution? Please explain why spin_lock_bh()
> is not a sufficient fix.

>> swapper/68/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
>> 0000000052a030a7 (hugetlb_lock){+.?.}, at: free_huge_page+0x9c/0x340
>> {SOFTIRQ-ON-W} state was registered at:
>> lock_acquire+0xd4/0x230
>> _raw_spin_lock+0x44/0x70
>> set_max_huge_pages+0x4c/0x360
>> hugetlb_sysctl_handler_common+0x108/0x160
>> proc_sys_call_handler+0x134/0x190
>> __vfs_write+0x3c/0x1f0
>> vfs_write+0xd8/0x220

> Also, this only seems to trigger here. Is it possible we _already_
> have softirqs disabled through every other code path, and it's just this
> one sysctl handler that needs to disable softirqs? Rather than every
> lock access?

Are you asking whether I looked at moving that put_page() to a worker
thread? I didn't. The reason I went with the current patch is to enable
the use of put_page() from irq context. We do allow that for non-hugetlb
pages, so I was not sure that adding that additional restriction for
hugetlb is really needed. Further, the conversion to irqsave/irqrestore
was straightforward.

As for making sure we don't already have irqs disabled in those code
paths, I did check that. But let me know if you find anything I missed.

> I'm not seeing any analysis in this patch description, just a kneejerk
> "lockdep complained, must disable interrupts".


-aneesh