Re: [BUG] fatal hang untarring 90GB file, possibly writeback related.

From: Colin Ian King
Date: Thu Apr 28 2011 - 09:42:30 EST



On Thu, 2011-04-28 at 08:29 -0400, Chris Mason wrote:
> Excerpts from Colin Ian King's message of 2011-04-28 07:36:30 -0400:
> > One more data point to add, I've been looking at an identical issue when
> > copying large amounts of data. I bisected this - and the lockups occur
> > with commit
> > 3e7d344970673c5334cf7b5bb27c8c0942b06126 - before that I don't see the
> > issue. With this commit, my file copy test locks up after ~8-10
> > iterations, before this commit I can copy > 100 times and don't see the
> > lockup.
>
> Well, that's really interesting. I tried with compaction on here and
> couldn't trigger it, but this (very very lightly) tested patch might
> help.
>
Thanks Chris,

I've given this a soak test but I still see the same lockup.

> It moves the writeout throttle before the goto restart, and also makes
> sure we do at least one cond_resched before we loop.
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 6771ea7..cb08b41 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1934,12 +1934,14 @@ restart:
>  	if (inactive_anon_is_low(zone, sc))
>  		shrink_active_list(SWAP_CLUSTER_MAX, zone, sc, priority, 0);
>
> +	throttle_vm_writeout(sc->gfp_mask);
> +
>  	/* reclaim/compaction might need reclaim to continue */
>  	if (should_continue_reclaim(zone, nr_reclaimed,
> -					sc->nr_scanned - nr_scanned, sc))
> +					sc->nr_scanned - nr_scanned, sc)) {
> +		cond_resched();
>  		goto restart;
> -
> -	throttle_vm_writeout(sc->gfp_mask);
> +	}
>  }
>
>  /*

