Re: [discuss] Re: [PATCH] fs/compat.c: rwsem instead of BKL around ioctl32_hash_table

From: Andi Kleen
Date: Fri Sep 03 2004 - 09:38:20 EST


On Thu, Sep 02, 2004 at 03:26:49PM -0700, Roland Dreier wrote:
> Andi> It does not make much sense because the ioctl will take the
> Andi> BKL anyways.
>
> True, but it seems pretty ugly to protect the ioctl32 hash with the
> BKL. I think the greater good of reducing use of the BKL should be
> looked at.

Replacing one lock with two is IMHO a bad idea. Keeping the number
of locks small and avoiding a wall of locking IMHO takes priority
even over BKL removal.

>
> Andi> If you wanted to fix it properly better make it use RCU -
> Andi> but it cannot work for the case of calling a compat handler.
>
> I'm not sure I follow what you're saying. When I looked at this, at
> first I thought RCU would be better for the hash lookup, but I didn't
> see a way to prevent a compat handler from being removed while it was
> running. That's why I moved to a semaphore, which would hold off the
> removal until the handler was done running. Is this what you mean?
> Do you see a way to use RCU here?

The code currently assumes that the compat handler is either
built into the kernel or lives in the same module that implements
the device (in which case the high-level module refcount held for
the file descriptor takes care of it).

The BKL couldn't protect against removal of sleeping compat
handlers anyway, because the BKL is dropped during a
schedule, and the handlers can all sleep in user accesses.
During such a scheduling window the module could be unloaded
if its count were zero. But under the assumption above this
cannot happen.

So basically the locking there is not meant to protect against
running handlers, just to ensure consistency during
the list walk. The list isn't touched anymore
once the compat handler runs, so sleeping in there
is no problem.

RCU could be used just as well to protect the list, because
no sleeping is involved in the walk itself.

-Andi