Re: XFS / writeback invoking soft lockup.
From: Dave Chinner
Date: Fri Dec 13 2013 - 05:49:21 EST
On Fri, Dec 13, 2013 at 02:14:07AM -0500, Dave Jones wrote:
> I can hit this pretty reliably on one of my slower test machines.
> (8gb ram, 1 slow sata disk)
>
> the machine is pretty responsive, and recovers after a while.
> anything we can do to shut it up ?
Actually, I think this indicates a problem.
> BUG: soft lockup - CPU#2 stuck for 22s! [kworker/u8:2:8479]
...
> Call Trace:
> [<c112f8f8>] lru_add_drain+0x1c/0x39
> [<c112f934>] __pagevec_release+0x10/0x26
> [<c112baba>] write_cache_pages+0x2f9/0x486
That code in write_cache_pages():
	while (!done && (index <= end)) {
		int i;

		nr_pages = pagevec_lookup_tag(&pvec, mapping, &index, tag,
			      min(end - index, (pgoff_t)PAGEVEC_SIZE-1) + 1);
		if (nr_pages == 0)
			break;

		for (i = 0; i < nr_pages; i++) {
			struct page *page = pvec.pages[i];
....
....
		}
		pagevec_release(&pvec);
		cond_resched();
	}
So after all the pages in a pagevec are processed, we release the
CPU before we grab the next pagevec. This softlockup implies we
have been processing this pagevec for 22s. That tells me the code
is actually stuck spinning on something, not that this is a false
positive. i.e. it should not take 22s to process 14 pages.
[ Yes, I know XFS can process more than that per ->writepage call,
but it's still only a millisecond of work if it doesn't block on
anything. And it can't be blocking, otherwise we wouldn't be firing
the softlockup warning. ]
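To make that argument concrete, here is a rough userspace sketch of the same loop shape. It is not the kernel code: BATCH_SIZE stands in for PAGEVEC_SIZE (14, matching the page count mentioned above), process_item() and process_in_batches() are hypothetical names, and sched_yield() is only a loose userspace analogue of cond_resched(). The point it illustrates is that no more than one batch of work ever happens between two yield points, so a 22s gap would have to come from inside a single batch.

```c
#include <sched.h>
#include <stddef.h>

/* Assumption: stand-in for PAGEVEC_SIZE, matching the 14 pages cited above. */
#define BATCH_SIZE 14

struct batch_stats {
	size_t batches;       /* number of batches, i.e. number of yield points */
	size_t max_per_batch; /* most items handled between two yield points */
};

/* Hypothetical per-item work; stands in for the per-page loop body. */
static void process_item(size_t index)
{
	(void)index;
}

/* Process 'total' items in batches, yielding the CPU after each batch. */
static struct batch_stats process_in_batches(size_t total)
{
	struct batch_stats st = { 0, 0 };
	size_t index = 0;

	while (index < total) {
		size_t remaining = total - index;
		size_t n = remaining < BATCH_SIZE ? remaining : BATCH_SIZE;
		size_t i;

		for (i = 0; i < n; i++)
			process_item(index + i);
		index += n;

		st.batches++;
		if (n > st.max_per_batch)
			st.max_per_batch = n;

		/* Loose userspace analogue of cond_resched(). */
		sched_yield();
	}
	return st;
}
```

Calling process_in_batches(100) yields the CPU after every batch and never handles more than BATCH_SIZE items between yields, which is why a long stall in a loop like this points at something spinning inside one batch rather than at the loop as a whole.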
The page cache LRU code is a maze of twisty per-cpu passages that go
deep into the mm subsystem and memcg code - I'm not really sure what
all that code is doing, so you'll probably have to ask someone who
knows about that code.
All I can say is that there are no obvious signs in the stack trace
that this is an XFS or writeback problem, and without more
information or a reproducible test case I'm not going to be able to
understand the cause.
Is the problem reproducible, or is it just a one-off?
Cheers,
Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx