Re: 3.9-rc5: Encountered INFO: rcu_sched self-detected stall on CPU due to 09a9f1d27

From: Hugh Dickins
Date: Mon Apr 15 2013 - 17:47:32 EST


On Mon, 15 Apr 2013, Vivek Goyal wrote:
> On Mon, Apr 15, 2013 at 01:59:29PM -0400, Vivek Goyal wrote:
> > On Mon, Apr 15, 2013 at 01:34:24PM -0400, Vivek Goyal wrote:
> > > On Mon, Apr 15, 2013 at 12:35:52PM -0400, Vivek Goyal wrote:
> > >
> > > [..]
> > > > > My first guess would be that mmap_sem is held during exec, so you
> > > > > can't have __mm_populate() try holding it recursively.
> > > >
> > > > I think it is not mmap_sem as even with VM_LOCKED, we take mmap_sem
> > > > and things are fine.
> > > >
> > > > So things work till 3.8 and break in 3.9-rc1 (with both VM_LOCKED and
> > > > VM_POPULATE specified). I will do a git bisect and try to figure out
> > > > which is the first commit that has the issue.
> > >
> > > Ok, following seems to be first bad commit.
> > >
> > > commit bebeb3d68b24bb4132d452c5707fe321208bcbcd
> > > Author: Michel Lespinasse <walken@xxxxxxxxxx>
> > > Date: Fri Feb 22 16:32:37 2013 -0800
> > >
> > > mm: introduce mm_populate() for populating new vmas
> > >
>
> Michel,
>
> An interesting observation: after this commit, it looks like a simple
> mmap(MAP_LOCKED) of a file was broken. It would hang and give an RCU stall
> warning similar to the one from my patch locking /sbin/kexec.
>
> But in the latest kernel, mmap(MAP_LOCKED) does not hang. So it looks like
> this problem got fixed by a patch after this first bad commit. But the
> issue with locking /sbin/kexec still remains.

I haven't tried to understand that. But I did just try your
def_flags |= VM_LOCKED hack to fs/binfmt_elf.c, and CONFIG_DEBUG_VM=y
quickly suggested the patch below - without the BUG, yes, __mm_populate
might well loop forever trying to populate 0 pages.

Whether a fix is actually needed, and whether it should be fixed here
or elsewhere, I'll leave to Michel.

Hugh

--- 3.9-rc7/mm/mlock.c 2013-04-01 09:08:05.736012852 -0700
+++ linux/mm/mlock.c 2013-04-15 14:20:24.454773245 -0700
@@ -397,8 +397,7 @@ int __mm_populate(unsigned long start, u
 	long ret = 0;
 
 	VM_BUG_ON(start & ~PAGE_MASK);
-	VM_BUG_ON(len != PAGE_ALIGN(len));
-	end = start + len;
+	end = start + PAGE_ALIGN(len);
 
 	for (nstart = start; nstart < end; nstart = nend) {
 		/*
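
To illustrate the arithmetic behind the patch above, here is a minimal
user-space sketch, not the kernel code. It assumes a 4096-byte page and
assumes that each pass of the loop can only make progress in whole pages.
The SKETCH_* macros and sketch_populate_passes() are hypothetical names
invented for this example. With an unaligned len, the last pass is left
with a partial page, populates 0 pages, and never advances.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel's PAGE_SIZE, PAGE_MASK, PAGE_ALIGN(). */
#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))
#define SKETCH_PAGE_ALIGN(x) (((x) + SKETCH_PAGE_SIZE - 1) & SKETCH_PAGE_MASK)

/*
 * Model of a populate-style loop over [start, end).
 * Assumption: each pass makes progress in whole pages only.
 * Returns the number of passes needed to reach end, or -1 if a pass
 * would populate 0 pages (the real loop would then spin forever).
 */
static long sketch_populate_passes(unsigned long start, unsigned long end)
{
	unsigned long nstart = start;
	long passes = 0;

	/* Mirrors the VM_BUG_ON the patch keeps: start must be page aligned. */
	assert((start & ~SKETCH_PAGE_MASK) == 0);

	while (nstart < end) {
		/* Whole pages left in [nstart, end); 0 for a partial page. */
		unsigned long pages = (end - nstart) / SKETCH_PAGE_SIZE;

		if (pages == 0)
			return -1;	/* nstart would never advance */
		nstart += pages * SKETCH_PAGE_SIZE;
		passes++;
	}
	return passes;
}
```

In this model, calling sketch_populate_passes(start, start + len) with
len = 4196 reports the stuck case, while passing
start + SKETCH_PAGE_ALIGN(len) instead completes, which is the effect the
one-line change to the computation of end is aiming for.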
