Re: [PATCH 2/3] Fix fsync livelock

From: Mikulas Patocka
Date: Sun Oct 05 2008 - 20:05:19 EST




On Sun, 5 Oct 2008, Arjan van de Ven wrote:

> On Sun, 5 Oct 2008 19:18:22 -0400 (EDT)
> Mikulas Patocka <mpatocka@xxxxxxxxxx> wrote:
>
> > On Sun, 5 Oct 2008, Arjan van de Ven wrote:
> >
> >
> > The problem here is that you have two processes --- one is writing,
> > the other is simultaneously syncing. The syncing process can't
> > distinguish the pages that were created before fsync() was invoked
> > (it has to wait on them) and the pages that were created while
> > fsync() was running (it doesn't have to wait on them) --- so it waits
> > on them all.
>
> for pages it already passed that's not true.
> but yes while you walk, you may find new ones. tough luck.
> >
> > Or, how otherwise would you implement "Submit them all in one go,
> > then wait"? The current code is:
> > you grab page 0, see it is under writeback, wait on it
> > you grab page 1, see it is under writeback, wait on it
> > you grab page 2, see it is under writeback, wait on it
> > you grab page 3, see it is under writeback, wait on it
>
> it shouldn't be doing this. It should be "submit the lot"
> "wait on the lot that got submitted".

And how do you want to implement that "wait on the lot that got
submitted". Note that you can't allocate an array or linked list that
holds pointers to all submitted pages. And if you add anything to the page
structure or radix tree, you have to serialize concurrent sync,fsync
calls, otherwise they'd fight over these bits.

> keeping away new writers is just shifting the latency to an innocent
> party (since the fsync() is just bloody expensive, the person doing it
> should be paying the price, not the rest of the system), and a grave
> mistake.

I assume that if very few people complained about the livelock till now,
very few people will see degraded write performance. My patch blocks the
writes only if the livelock happens, so if the livelock doesn't happen in
unpatched kernel for most people, the patch won't make it worse.

> If the fsync() implementation isn't smart enough, sure, lets improve
> it. But not by shifting latency around... lets make it more efficient
> at submitting IO.
> If we need to invent something like "chained IO" where if you wait on
> the last of the chain, you wait on the entirely chain, so be it.

This looks madly complicated. And ineffective, because if some page was
submitted before fsync() was invoked, and is under writeback while fsync()
is called, fsync() still has to wait on it.

Mikulas
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/