Re: [next-20160615] kernel BUG at mm/rmap.c:1251!

From: Michal Hocko
Date: Thu Jun 16 2016 - 05:41:49 EST


On Thu 16-06-16 18:23:45, Sergey Senozhatsky wrote:
> On (06/16/16 10:58), Michal Hocko wrote:
> > > [..]
> > > [ 272.687656] vma ffff8800b855a5a0 start 00007f3576d58000 end 00007f3576f66000
> > > next ffff8800b977d2c0 prev ffff8800bdfb1860 mm ffff8801315ff200
> > > prot 8000000000000025 anon_vma ffff8800b7e583b0 vm_ops (null)
> > > pgoff 7f3576d58 file (null) private_data (null)
> > > flags: 0x100073(read|write|mayread|maywrite|mayexec|account)
> > > [ 272.691793] ------------[ cut here ]------------
> > > [ 272.692820] kernel BUG at mm/rmap.c:1251!
> >
> > Is this?
> > page_add_new_anon_rmap:
> > VM_BUG_ON_VMA(address < vma->vm_start || address >= vma->vm_end, vma)
> > [...]
>
> I think it is
>
> 1248 void page_add_new_anon_rmap(struct page *page,
> 1249 struct vm_area_struct *vma, unsigned long address, bool compound)
> 1250 {
> 1251 int nr = compound ? hpage_nr_pages(page) : 1;
> 1252
> 1253 VM_BUG_ON_VMA(address < vma->vm_start || address >= vma->vm_end, vma);
> 1254 __SetPageSwapBacked(page);
>
> > > [ 272.727842] BUG: sleeping function called from invalid context at include/linux/sched.h:2960
> >
> > If yes, then I am not sure we can do much about this part. BUG_ON in
> > an atomic context is unfortunate but the BUG_ON points out a real bug so
> > we shouldn't drop it because of the potential atomic context. The above
> > VM_BUG_ON should definitely be addressed. I thought that Vlastimil has
> > pointed out some issues with the khugepaged lock inconsistencies which
> > might lead to issues like this.
>
> collapse_huge_page() ->mmap_sem fixup patch (http://marc.info/?l=linux-mm&m=146495692807404&w=2)
> is in next-20160615. or do you mean some other patch?

Yes, that's what I meant, but I haven't reviewed the patch to see whether
it is correct/complete. It would be good to see whether the issue is
related to those changes.
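
For reference, the condition in the VM_BUG_ON_VMA quoted above is a
half-open range check: the faulting address has to lie in
[vm_start, vm_end). Below is a minimal userspace sketch of that check.
It is not kernel code. struct vma_range, addr_in_vma() and check_addr()
are hypothetical names used only for illustration, and assert() stands
in for the kernel's BUG machinery.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical userspace stand-in for the two struct vm_area_struct
 * fields that the check reads. This is not the kernel definition.
 */
struct vma_range {
	unsigned long vm_start;	/* first address inside the mapping */
	unsigned long vm_end;	/* first address past the mapping */
};

/*
 * Returns true when address lies in [vm_start, vm_end), i.e. when the
 * condition handed to VM_BUG_ON_VMA() in the quoted code is false.
 */
bool addr_in_vma(const struct vma_range *vma, unsigned long address)
{
	return !(address < vma->vm_start || address >= vma->vm_end);
}

/* Model of the assertion itself: abort if the address is outside. */
void check_addr(const struct vma_range *vma, unsigned long address)
{
	assert(addr_in_vma(vma, address));
}
```

With the range from the dump above (start 00007f3576d58000, end
00007f3576f66000), vm_start itself passes the check and vm_end does not,
because the end is exclusive. This sketch assumes a 64-bit unsigned long.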
--
Michal Hocko
SUSE Labs