Re: linux-next: Tree for Dec 21

From: Hugh Dickins
Date: Sat Dec 24 2011 - 00:13:52 EST


On Thu, 22 Dec 2011, Tejun Heo wrote:
> On Thu, Dec 22, 2011 at 03:46:39PM -0800, Tejun Heo wrote:
> > On Thu, Dec 22, 2011 at 03:44:27PM -0800, Andrew Morton wrote:
> > > > Weird, I can't reproduce the problem on block/for-3.3/core. Trying
> > > > linux-next... hmmm, it works there too.
> > >
> > > This machine is next to my desk, about 50 yards from your cube ;)
> >
> > Heh, physical access feels like such a distant concept. :)
> >
> > I'll test with the config and if I still can't reproduce it, play with
> > your machine.
>
> Couldn't reproduce it on block/for-3.3 or next & you were already
> gone. Is anyone else seeing this?

Twice today, on a ThinkPad T420s running 3.2.0-rc6-next-20111222.
I haven't seen it at all under heavy load, only when simply
rebuilding the kernel: I think both times it hung with
"LD whatever/built-in.o" as the last line on screen.

I had (a variant of) kdb built in; here's the stack it gave me. I think
I've got a bug in there which has dropped a number of stack frames,
so don't waste time puzzling over any anomalies in it, but there's
enough to suggest it's the same as Andrew was seeing.

ffff880013ac2100 28524 28522 1* D ffff880013ac2538 sh
RSP RIP Function (args)
ffff88004165f820 ffffffff814e559a _raw_spin_unlock_irq+0x31
ffff88004165f858 ffffffff811d2867 get_request_wait+0xab
ffff88004165f8b8 ffffffff811cfb75 elv_merge+0xa0
ffff88004165fd18 ffffffff810ca90c do_writepages+0x1f
ffff88004165fd28 ffffffff810c2671 __filemap_fdatawrite_range+0x4e
ffff88004165fd68 ffffffff810c2e92 filemap_flush+0x17
ffff88004165fd78 ffffffff8116533e ext4_alloc_da_blocks+0x28
ffff88004165fd88 ffffffff81160f6a ext4_release_file+0x2e
ffff88004165fdb8 ffffffff811077d4 __fput+0x107
ffff88004165fe08 ffffffff81107899 fput+0x15
ffff88004165fe18 ffffffff81104037 filp_close+0x6b
ffff88004165fe48 ffffffff81056b47 close_files+0x16a
ffff88004165fea8 ffffffff81057f31 put_files_struct+0x21
ffff88004165fed8 ffffffff81058107 exit_files+0x46
ffff88004165ff08 ffffffff81058648 do_exit+0x20e
ffff88004165ff48 ffffffff810588d1 do_group_exit+0x7d
ffff88004165ff78 ffffffff8105890e sys_exit_group+0x12
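
When comparing dumps like this one, it can help to look at how far RSP
jumps between neighbouring frames. Below is a minimal Python sketch, not
part of kdb; the names parse_frames, rsp_gaps and largest_gap are made up
for illustration. It parses the frame lines above and reports the largest
jump. A big jump is only a hint that frames may have been skipped, since a
single function with a large stack frame would look the same.

```python
import re

# Frame lines copied verbatim from the kdb output above: RSP, RIP, symbol+offset.
TRACE = """\
ffff88004165f820 ffffffff814e559a _raw_spin_unlock_irq+0x31
ffff88004165f858 ffffffff811d2867 get_request_wait+0xab
ffff88004165f8b8 ffffffff811cfb75 elv_merge+0xa0
ffff88004165fd18 ffffffff810ca90c do_writepages+0x1f
ffff88004165fd28 ffffffff810c2671 __filemap_fdatawrite_range+0x4e
ffff88004165fd68 ffffffff810c2e92 filemap_flush+0x17
ffff88004165fd78 ffffffff8116533e ext4_alloc_da_blocks+0x28
ffff88004165fd88 ffffffff81160f6a ext4_release_file+0x2e
ffff88004165fdb8 ffffffff811077d4 __fput+0x107
ffff88004165fe08 ffffffff81107899 fput+0x15
ffff88004165fe18 ffffffff81104037 filp_close+0x6b
ffff88004165fe48 ffffffff81056b47 close_files+0x16a
ffff88004165fea8 ffffffff81057f31 put_files_struct+0x21
ffff88004165fed8 ffffffff81058107 exit_files+0x46
ffff88004165ff08 ffffffff81058648 do_exit+0x20e
ffff88004165ff48 ffffffff810588d1 do_group_exit+0x7d
ffff88004165ff78 ffffffff8105890e sys_exit_group+0x12
"""

# One frame per line: 16 hex digits of RSP, 16 of RIP, then symbol+0xoffset.
FRAME_RE = re.compile(r"^([0-9a-f]{16}) ([0-9a-f]{16}) (\S+)\+0x([0-9a-f]+)$")


def parse_frames(text):
    """Return a list of (rsp, rip, symbol, offset) tuples, innermost frame first."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:
            frames.append(
                (int(m.group(1), 16), int(m.group(2), 16), m.group(3), int(m.group(4), 16))
            )
    return frames


def rsp_gaps(frames):
    """Return (inner_symbol, outer_symbol, bytes) for each pair of adjacent frames."""
    return [(a[2], b[2], b[0] - a[0]) for a, b in zip(frames, frames[1:])]


def largest_gap(frames):
    """Return the adjacent pair with the biggest RSP jump."""
    return max(rsp_gaps(frames), key=lambda gap: gap[2])


if __name__ == "__main__":
    for inner, outer, size in rsp_gaps(parse_frames(TRACE)):
        print(f"{inner:>28} -> {outer:<28} {size:#6x}")
```

On the trace above, the largest jump by far is 0x460 bytes between
elv_merge and do_writepages; every other pair is 0x60 bytes or less.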

I interrupted it a few more times, and once or twice I caught it
in some cfq io_context business. I didn't take much notice because
I thought I'd saved the stack to the log, but it hasn't appeared in my
/var/log/messages after reboot. Once or twice there was another
sh running on another cpu, showing a very similar stack.

Hugh