Re: Sleeping BUG in khugepaged for i586

From: Matthew Wilcox
Date: Thu Jun 08 2017 - 13:06:11 EST


On Thu, Jun 08, 2017 at 04:48:31PM +0200, Michal Hocko wrote:
> On Wed 07-06-17 13:56:01, David Rientjes wrote:
> > I agree it's probably going to bisect to 338a16ba15495 since it's the
> > cond_resched() at the line number reported, but I think there must be
> > something else going on. I think the list of locks held by khugepaged is
> > correct because it matches with the implementation. The preempt_count(),
> > as suggested by Andrew, does not. If this is reproducible, I'd like to
> > know what preempt_count() is.
>
> collapse_huge_page
>   pte_offset_map
>     kmap_atomic
>       kmap_atomic_prot
>         preempt_disable
>   __collapse_huge_page_copy
>   pte_unmap
>     kunmap_atomic
>       __kunmap_atomic
>         preempt_enable
>
> I suspect, so cond_resched seems indeed inappropriate on 32b systems.

Then why doesn't it trigger on 64-bit systems too?

#ifndef ARCH_HAS_KMAP
...
static inline void *kmap_atomic(struct page *page)
{
	preempt_disable();
	pagefault_disable();
	return page_address(page);
}
#define kmap_atomic_prot(page, prot) kmap_atomic(page)


... oh, wait, I see. Because pte_offset_map() doesn't call kmap_atomic()
on 64-bit. Indeed, it doesn't necessarily call kmap_atomic() on 32-bit
either; only with CONFIG_HIGHPTE enabled. How much of a performance
penalty would it be to call kmap_atomic() unconditionally on 64-bit to
make sure that this kind of problem doesn't show up on 32-bit systems only?
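
To make the difference concrete, here is a rough userspace model of the
sequence in Michal's trace. It is only a sketch, not kernel code: every
model_* name and the highpte flag are made up for illustration, the counter
merely stands in for preempt_count(), and I am assuming the cond_resched()
sits somewhere between the map and the unmap.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; none of these names exist in the kernel. */
static int preempt_count;      /* stands in for preempt_count() */
static int atomic_sleep_bugs;  /* "sleeping while atomic" reports */

static void model_preempt_disable(void)
{
	preempt_count++;
}

static void model_preempt_enable(void)
{
	assert(preempt_count > 0);  /* catch unbalanced enable in the model */
	preempt_count--;
}

/* Models the might_sleep()-style check: complain if we are atomic. */
static void model_cond_resched(void)
{
	if (preempt_count != 0)
		atomic_sleep_bugs++;
}

/* Only the HIGHPTE flavour maps the page table via kmap_atomic(). */
static void model_pte_offset_map(bool highpte)
{
	if (highpte)
		model_preempt_disable();
}

static void model_pte_unmap(bool highpte)
{
	if (highpte)
		model_preempt_enable();
}

/*
 * Shape of the collapse path: map, copy (which may reschedule), unmap.
 * Returns how many times the model "slept" with preemption disabled.
 */
static int model_collapse(bool highpte)
{
	atomic_sleep_bugs = 0;
	model_pte_offset_map(highpte);
	model_cond_resched();       /* assumed to run between map and unmap */
	model_pte_unmap(highpte);
	return atomic_sleep_bugs;
}
```

Calling model_collapse(false) reports nothing, while model_collapse(true)
reports one sleep-in-atomic, which is why only HIGHPTE configurations would
see the splat. Making the map step disable preemption unconditionally would
make both configurations report it.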