Re: [Regression] Crash in load_module() while freeing args

From: Rusty Russell
Date: Wed May 26 2010 - 04:01:17 EST


On Wed, 26 May 2010 09:17:32 am Linus Torvalds wrote:
>
> On Wed, 26 May 2010, Rafael J. Wysocki wrote:
> >
> > I'm not able to reproduce the issue with the following commit reverted:
> >
> > commit 480b02df3aa9f07d1c7df0cd8be7a5ca73893455
> > Author: Rusty Russell <rusty@xxxxxxxxxxxxxxx>
> > Date: Wed May 19 17:33:39 2010 -0600
> >
> > module: drop the lock while waiting for module to complete initialization.
>
> Hmm. That does seem to be buggy. We can't just drop and re-take the lock:
> that may make sense _internally_ as far as resolve_symbol() itself is
> concerned, but the caller will its own local variables, and some of those
> will no longer be valid if the lock was dropped.

Well, yes, obviously I missed something :( I'll look at it tonight after
Arabella is asleep.

> That commit also changes the return value semantics of "use_module()",
> which is an exported interface where the only users seem to be
> out-of-kernel (the only in-kernel use is in kernel/module.c itself). That
> seems like a really really bad idea too.

The kprobes guys: they were cc'd about the change.

> So I think reverting it is definitely the right thing to do. The commit
> seems fundamentally broken.

> And having modules do request_module() in
> their init functions has always been invalid anyway, so that excuse
> doesn't really seem to be a reason to do anything crazy like this either.

I'd have to look back through the pre-git history, but we've dropped the
lock around the initfn for a long time now because people wanted to do odd
things (ISTR it sucked when modules oopsed on load, too).

So then we have the problem that crc32 is finished its init and needs the
lock back, and bnx2x which needs crc32 is waiting for it. We could just
fail bnx2x; and in fact, we did prior to this patch (we timeout) and it breaks
network on booting on some box according to Brandon:

http://www.mail-archive.com/linux-crypto@xxxxxxxxxxxxxxx/msg04331.html

This *used* not to be a problem, because userspace placed locks on
modules and so it would never try to load bnx2x until crc32 was loaded.
ISTR a mention that Jon removed that...

> Rewriting the logic to
> - not drop the lock
> - not change the return semantics of an exported interface
> - just make 'resolve_symbol()' fail if the module isn't fully loaded
> would seem to be a more reasonable approach, no?

Sure, then userspace needs to change :(

Rusty.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/