Re: possible deadlock in sk_clone_lock

From: Michal Hocko
Date: Mon Mar 01 2021 - 16:54:35 EST


On Mon 01-03-21 08:39:22, Shakeel Butt wrote:
> On Mon, Mar 1, 2021 at 7:57 AM Michal Hocko <mhocko@xxxxxxxx> wrote:
[...]
> > Then how come this can ever be a problem? in_task() should exclude soft
> > irq context unless I am mistaken.
> >
>
> If I take the following example of syzbot's deadlock scenario then
> CPU1 is the one freeing the hugetlb pages. It is in the process
> context but has disabled softirqs (see __tcp_close()).
>
> CPU0 CPU1
> ---- ----
> lock(hugetlb_lock);
> local_irq_disable();
> lock(slock-AF_INET);
> lock(hugetlb_lock);
> <Interrupt>
> lock(slock-AF_INET);
>
> So, this deadlock scenario is very much possible.

OK, I see the point now. I was focusing on the IRQ context and hugetlb
side too much. We do not need to be freeing from there. All it takes is
to get a dependency chain over a common lock held here. Thanks for
bearing with me.

Let's see whether we can make hugetlb_lock irq safe.

--
Michal Hocko
SUSE Labs