Re: [RFC PATCH] mm/hugetlb: make hugetlb_lock irq safe

From: Aneesh Kumar K.V
Date: Thu Sep 06 2018 - 00:00:14 EST


On 09/05/2018 07:18 PM, Matthew Wilcox wrote:
On Wed, Sep 05, 2018 at 06:56:19PM +0530, Aneesh Kumar K.V wrote:
On 09/05/2018 06:34 PM, Matthew Wilcox wrote:
On Wed, Sep 05, 2018 at 04:53:41PM +0530, Aneesh Kumar K.V wrote:
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.

How do you go from "can be taken in softirq context" problem report to
"must disable hard interrupts" solution? Please explain why spin_lock_bh()
is not a sufficient fix.

swapper/68/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
0000000052a030a7 (hugetlb_lock){+.?.}, at: free_huge_page+0x9c/0x340
{SOFTIRQ-ON-W} state was registered at:
lock_acquire+0xd4/0x230
_raw_spin_lock+0x44/0x70
set_max_huge_pages+0x4c/0x360
hugetlb_sysctl_handler_common+0x108/0x160
proc_sys_call_handler+0x134/0x190
__vfs_write+0x3c/0x1f0
vfs_write+0xd8/0x220
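
For readers unfamiliar with the report format: the splat above says the same lock was once write-acquired with softirqs enabled ({SOFTIRQ-ON-W}) and is now being write-acquired from softirq context ({IN-SOFTIRQ-W}). Below is a rough user-space model of that check. It is not lockdep's actual code; the struct and function names are hypothetical and exist only to illustrate why the two usages conflict.

```c
#include <stdbool.h>

/* Hypothetical model of the two usage bits named in the report. */
struct lock_usage {
    bool softirq_on_w;  /* write-acquired outside softirq with softirqs enabled */
    bool in_softirq_w;  /* write-acquired from softirq context */
};

/* Record one acquisition of the lock in the given context. */
static void record_acquire(struct lock_usage *u, bool in_softirq,
                           bool softirqs_enabled)
{
    if (in_softirq)
        u->in_softirq_w = true;
    else if (softirqs_enabled)
        u->softirq_on_w = true;
}

/*
 * The usage is inconsistent once both bits are set: a softirq could
 * interrupt the softirqs-enabled holder on the same CPU and spin on
 * the lock that holder already owns.
 */
static bool usage_inconsistent(const struct lock_usage *u)
{
    return u->softirq_on_w && u->in_softirq_w;
}
```

In this model, the set_max_huge_pages() path sets the first bit and the free_huge_page() path from softirq sets the second. Taking the lock with softirqs disabled in process context would leave the first bit clear.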

Also, this only seems to trigger here. Is it possible we _already_
have softirqs disabled through every other code path, and it's just this
one sysctl handler that needs to disable softirqs? Rather than every
lock access?

Are you asking whether I looked at moving that put_page to a worker thread?

No. I'm asking "why not disable softirqs in the sysctl handler". Or
perhaps equivalently, just replace spin_lock() with spin_lock_bh() in
set_max_huge_pages().


Disabling softirqs only in the sysctl handler is not enough, right? Every user of a lock taken by the page destructor needs to be converted to disable softirqs, right?


I didn't. The reason I looked at the current patch is to enable the usage of
put_page() from irq context. We do allow that for non-hugetlb pages, so I was
not sure that adding that additional restriction for hugetlb is really needed.
Further, the conversion to irqsave/irqrestore was straightforward.

Straightforward, sure. But is it the right thing to do? Do we want to
be able to put_page() a hugetlb page from hardirq context?
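
The trade-off being debated can be summarised with a small user-space model. This is only a sketch: the enum and function names are hypothetical, not kernel API, and it considers a single CPU where a process-context holder is interrupted by a handler that frees a hugetlb page and tries to take the same lock.

```c
#include <stdbool.h>

/* Hypothetical model: how the lock is taken in process context. */
enum lock_variant {
    LOCK_PLAIN,    /* models spin_lock(): nothing masked */
    LOCK_BH,       /* models spin_lock_bh(): softirqs masked on this CPU */
    LOCK_IRQSAVE   /* models spin_lock_irqsave(): hard irqs masked on this CPU */
};

/* Context from which put_page() might end up retaking the lock. */
enum free_ctx {
    CTX_SOFTIRQ,
    CTX_HARDIRQ
};

/*
 * Returns true if a handler in ctx can run on the same CPU while the
 * process-context holder still owns the lock, so a self-deadlock is
 * possible.
 */
static bool can_self_deadlock(enum lock_variant v, enum free_ctx ctx)
{
    switch (v) {
    case LOCK_PLAIN:
        return true;                 /* either context can interrupt */
    case LOCK_BH:
        return ctx == CTX_HARDIRQ;   /* only hard irqs still get in */
    case LOCK_IRQSAVE:
        /* With hard irqs off, softirqs do not run on this CPU either. */
        return false;
    }
    return true;                     /* unknown variant: assume unsafe */
}
```

Under this model, spin_lock_bh() is sufficient as long as hugetlb pages are never freed from hardirq context, while the irqsave conversion is what also covers put_page() from hardirq.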


-aneesh