Re: [GIT, RFC] Killing the Big Kernel Lock

From: Frederic Weisbecker
Date: Sun Mar 28 2010 - 19:24:56 EST


On Sun, Mar 28, 2010 at 10:34:54PM +0100, Arnd Bergmann wrote:
> On Sunday 28 March 2010, Frederic Weisbecker wrote:
> > On Sun, Mar 28, 2010 at 09:05:50PM +0100, Arnd Bergmann wrote:
> > > > General thoughts:
> > > >
> > > > ".llseek = NULL," so far meant "do the Right Thing on lseek() and
> > > > friends, as far as the fs core can tell". Shouldn't we keep it that
> > > > way? It's as close to other ".method = NULL," as it can get, which
> > > > either mean "silently skip this method if it doesn't matter" (e.g.
> > > > .flush) or "fail attempts to use this method with a fitting errno" (e.g.
> > > > .write).
> > >
> > > My series changes the default from 'default_llseek' to 'generic_file_llseek',
> > > which is almost identical, except for taking the inode mutex instead of the
> > > BKL.
> >
> >
> > What if another file operation changes the file pointer while holding the bkl?
> > You're not protected anymore in this case.
> >
>
> Exactly, that's why I changed all the drivers to set default_llseek explicitly.


Ah ok.
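
For readers following along, the point above can be sketched in a few
lines of userspace C. This is only a model, not kernel code: every name
prefixed with model_ is hypothetical, the "locks" are merely recorded
rather than taken, and the fallback is passed in as a parameter so the
old and new defaults can be compared side by side.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these are the real kernel types. */
enum model_lock { MODEL_LOCK_NONE, MODEL_LOCK_BKL, MODEL_LOCK_INODE_MUTEX };

struct model_file;
typedef long (*model_llseek_fn)(struct model_file *f, long offset);

struct model_file {
	model_llseek_fn llseek;     /* stands in for f_op->llseek, may be NULL */
	long f_pos;
	enum model_lock last_lock;  /* which lock the seek path would have taken */
};

/* Stand-in for default_llseek: serialized by the BKL. */
long model_default_llseek(struct model_file *f, long offset)
{
	f->last_lock = MODEL_LOCK_BKL;
	f->f_pos = offset;
	return f->f_pos;
}

/* Stand-in for generic_file_llseek: serialized by the inode mutex. */
long model_generic_file_llseek(struct model_file *f, long offset)
{
	f->last_lock = MODEL_LOCK_INODE_MUTEX;
	f->f_pos = offset;
	return f->f_pos;
}

/* A NULL method means "use the core's fallback", whatever that is. */
long model_vfs_llseek(struct model_file *f, long offset,
		      model_llseek_fn fallback)
{
	model_llseek_fn fn = f->llseek ? f->llseek : fallback;

	return fn(f, offset);
}

/* Helper: report which lock a seek takes for a given method and fallback. */
enum model_lock model_lock_for(model_llseek_fn method, model_llseek_fn fallback)
{
	struct model_file f = { method, 0, MODEL_LOCK_NONE };

	model_vfs_llseek(&f, 42, fallback);
	return f.last_lock;
}
```

In this model, a driver that leaves the method NULL silently moves from
the BKL to the inode mutex when the fallback changes, while a driver that
names model_default_llseek explicitly keeps its old locking.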


> Even this is very likely not needed in more than a handful of drivers (if any),
> for a number of reasons:
>
> - sys_read/sys_write *never* hold any locks while calling file_pos_write(),
> which is the only place the file position gets updated for regular files.


Yeah sure. But the pushdown (or step-by-step replacement
with generic_file_llseek) is still necessary to make sure every
place is fine.
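
As a rough illustration of the read path being discussed, here is a
userspace C sketch. It is a model under stated assumptions, not the
kernel implementation: the pos_file type and all model_ and read_using_
names are hypothetical. It shows the position being copied to a local,
handed to the handler by pointer, and written back afterwards with no
lock held, and what happens to a handler that looks at f_pos directly
when the caller supplies an explicit offset (as pread-style calls do).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of the read path; not the kernel's types. */
struct pos_file {
	long f_pos;  /* shared file position */
	long (*read)(struct pos_file *f, long count, long *ppos);
};

/* Handler that honours the position argument it is given. */
long read_using_ppos(struct pos_file *f, long count, long *ppos)
{
	long start = *ppos;

	(void)f;
	*ppos += count;
	return start;  /* report where the read started */
}

/* Handler that peeks at f->f_pos itself and ignores the argument's value. */
long read_using_f_pos(struct pos_file *f, long count, long *ppos)
{
	long start = f->f_pos;

	*ppos += count;
	return start;
}

/* Models a plain read: local copy in, handler call, write back, no lock. */
long model_sys_read(struct pos_file *f, long count)
{
	long pos = f->f_pos;                 /* like file_pos_read() */
	long start = f->read(f, count, &pos);

	f->f_pos = pos;                      /* like file_pos_write() */
	return start;
}

/* Models a positional read: explicit offset, f_pos left untouched. */
long model_pread(struct pos_file *f, long count, long offset)
{
	long pos = offset;

	return f->read(f, count, &pos);
}
```

In this model the handler that uses its argument starts a positional
read at the requested offset, while the one that reads f_pos starts at
the stale shared position instead.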



> - concurrent llseek plus other file operations on the same file descriptor
> usually already have an undefined outcome.


Yeah.



> - when I started inspecting drivers that look at file->f_pos themselves (not
> the read/write operation arguments), I found that practically all of them
> are doing this in a totally broken way!


Hehe :)



> - The only thing we'd probably ever want to lock against in llseek
> is readdir, which is not used in any drivers, but only in file systems.


Right.
