Re: [patch 1/3] fix illogical behavior in balance_dirty_pages()

From: Miklos Szeredi
Date: Mon Mar 26 2007 - 05:48:45 EST


> > Well, not a picture, but a sort of indented call trace:
> >
> > [some process, which has a fuse file writably mmaped]
> > write fault on upper filesystem
> > balance_dirty_pages
> > loop...
> > submit write requests
>
> This, I assume, is the upper fs
>
> > ---------------------------------
> > [fuse loopback fs thread 1]
> > read request from /dev/fuse
> > sys_write
> > mutex_lock(i_mutex)
> > ...
> > copy data to page cache
> > balance_dirty_pages
> > loop ...
> > submit write requests
> > write requests completed ...
> > dirty still over limit ...
> > ... loop forever
> >
> > [fuse loopback fs thread 2]
> > read request from /dev/fuse
> > sys_write
> > mute_lock(i_mutex) blocks
>
> And these, I assume, are handling what you term the lower fs.

All of this is actually the upper, fuse filesystem (maybe except the
filemap/page-writeback parts, which could be argued to belong to the
lower, ext3 filesystem).

Above the line is the kernel part, below the line is the userspace
part.

> > The lower filesystem (e.g. ext3) has completed the single write
> > request that was sent to it, and then it's just looping in
> > balance_dirty_pages. The upper (fuse) filesystem has all the dirty
> > data (over the threshold), either still dirty or waiting in the
> > request queue as writeback.
> >
> > Does this help?
>
> yup.
>
> Interesting problem. I don't suppose that it'd be appreiated if I were to
> commend the use of O_DIRECT for handling the lower fs ;)

Not really ;) One of the goals of fuse is to not restrict the
filesystem daemon in any way. There are zillions of ways to work
around this deadlock, but I don't even want to make it even
theoretically possible.

> Let me think about that a bit, after I've made the latest shitpile people
> have inflicted upon me begin to look like it has a chance of compiling.

OK.

Thanks,
Miklos
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/