Re: v3.4-rc2 out-of-memory problems (was Re: 3.4-rc1 sticks-and-crashs)

From: Srivatsa S. Bhat
Date: Sat Apr 14 2012 - 16:50:38 EST


On 04/10/2012 05:25 AM, Linus Torvalds wrote:

> On Mon, Apr 9, 2012 at 4:25 PM, David Rientjes <rientjes@xxxxxxxxxx> wrote:
>>
>> You could do that if you also turned the check for "ret == NOTIFY_OK" in
>> profile_handoff_task() into "ret & NOTIFY_OK" in your patch, otherwise you
>> get a double free from __put_task_struct() and oprofile.
>
> Why? NOTIFY_DONE is zero.
>
> I do agree that we *also* could do the "& NOTIFY_OK" and make it
> clearer that we're oring bits together. And we could document the
> stupid notifier interfaces to do this all, and just make the rules be
> *sane* when you have multiple notifiers.
>
> And sane rules would be either:
>
> - you always return an error return, and notifiers all return either
> 0 or a negative error number, and we stop on the first error and
> return that.
>
> - you return a bitmask, and we or all bits together (and we can
> certainly continue to have a "stop here" bit)
>


I too think that 'or'ing the bits makes more sense than returning the
last return value.
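
To make the difference concrete, here is a rough userspace sketch, not the
kernel's notifier code. The NOTIFY_* values mirror include/linux/notifier.h;
call_chain_last(), call_chain_or(), the two callbacks and demo() are
hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Values as in include/linux/notifier.h; everything else is hypothetical. */
#define NOTIFY_DONE      0x0000
#define NOTIFY_OK        0x0001
#define NOTIFY_STOP_MASK 0x8000

typedef int (*notifier_fn)(unsigned long event);

/* "Last return value" semantics: the result depends on callback order. */
static int call_chain_last(notifier_fn *chain, size_t n, unsigned long event)
{
	int ret = NOTIFY_DONE;

	for (size_t i = 0; i < n; i++) {
		ret = chain[i](event);
		if (ret & NOTIFY_STOP_MASK)
			break;
	}
	return ret;
}

/* Proposed semantics: OR all return values, still honouring the stop bit. */
static int call_chain_or(notifier_fn *chain, size_t n, unsigned long event)
{
	int ret = NOTIFY_DONE;

	for (size_t i = 0; i < n; i++) {
		ret |= chain[i](event);
		if (ret & NOTIFY_STOP_MASK)
			break;
	}
	return ret;
}

static int cb_ok(unsigned long event)   { (void)event; return NOTIFY_OK; }
static int cb_done(unsigned long event) { (void)event; return NOTIFY_DONE; }

/* Same two callbacks, registered in either order. */
static void demo(void)
{
	notifier_fn a[] = { cb_ok, cb_done };
	notifier_fn b[] = { cb_done, cb_ok };

	/* Last-value result flips with the order... */
	assert(call_chain_last(a, 2, 0) == NOTIFY_DONE);
	assert(call_chain_last(b, 2, 0) == NOTIFY_OK);
	/* ...while the OR'ed result does not. */
	assert(call_chain_or(a, 2, 0) == NOTIFY_OK);
	assert(call_chain_or(b, 2, 0) == NOTIFY_OK);
}
```

With the OR'ed variant, the caller sees NOTIFY_OK whenever any callback
returned it, regardless of registration order.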

CPU hotplug and suspend/resume are two users that I know of which rely
on notifiers quite a bit. However, neither of them actually cares about
the exact return value: if it is an error return, no matter which one
or for what reason, they do the same error handling, and that works for
them. IOW, if we change the documented behaviour of notifiers to return
the 'or' of all return values, that would continue to work well with
these users.
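
A rough sketch of why such error-only callers would not notice the change
(userspace code, not taken from the hotplug or suspend paths). The NOTIFY_*
values mirror include/linux/notifier.h; notify_failed(), bring_up() and
demo() are hypothetical helpers, and the -1 return is just a placeholder
for "do the usual rollback".

```c
#include <assert.h>
#include <stdbool.h>

/* Values as in include/linux/notifier.h; everything else is hypothetical. */
#define NOTIFY_DONE      0x0000
#define NOTIFY_OK        0x0001
#define NOTIFY_STOP_MASK 0x8000
#define NOTIFY_BAD       (NOTIFY_STOP_MASK | 0x0002)

/* Anything above NOTIFY_OK (ignoring the stop bit) is treated as a failure. */
static bool notify_failed(int ret)
{
	return (ret & ~NOTIFY_STOP_MASK) > NOTIFY_OK;
}

/* Hypothetical caller: same error handling whatever the exact code was. */
static int bring_up(int chain_ret)
{
	if (notify_failed(chain_ret))
		return -1;	/* placeholder for the common rollback path */
	return 0;
}

static void demo(void)
{
	/* Last-value semantics: the failing callback's value comes back. */
	assert(bring_up(NOTIFY_BAD) == -1);
	/* OR semantics: an earlier NOTIFY_OK is mixed in, same decision. */
	assert(bring_up(NOTIFY_OK | NOTIFY_BAD) == -1);
	/* Success looks the same either way. */
	assert(bring_up(NOTIFY_OK | NOTIFY_DONE) == 0);
}
```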

Of course, there are other users like profile_handoff_task() that do
care about exactly what the return value was, but I guess we can
gradually adapt such users to the better, saner rules for the notifier
return values, as you proposed.
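
For the exact-value case, a small userspace sketch of the two checks being
discussed ("ret == NOTIFY_OK" versus "ret & NOTIFY_OK"). The NOTIFY_*
values mirror include/linux/notifier.h; handoff_exact(), handoff_mask() and
demo() are hypothetical stand-ins, not the real profile_handoff_task().

```c
#include <assert.h>

/* Values as in include/linux/notifier.h; everything else is hypothetical. */
#define NOTIFY_DONE      0x0000
#define NOTIFY_OK        0x0001
#define NOTIFY_STOP_MASK 0x8000
#define NOTIFY_STOP      (NOTIFY_OK | NOTIFY_STOP_MASK)

/* Exact comparison, in the style of the "ret == NOTIFY_OK" check. */
static int handoff_exact(int ret)
{
	return ret == NOTIFY_OK;
}

/* Bitmask test, in the style of the suggested "ret & NOTIFY_OK" check. */
static int handoff_mask(int ret)
{
	return (ret & NOTIFY_OK) != 0;
}

static void demo(void)
{
	/* NOTIFY_DONE is zero, so OR'ing it in leaves NOTIFY_OK unchanged. */
	assert(handoff_exact(NOTIFY_OK | NOTIFY_DONE) == 1);
	assert(handoff_mask(NOTIFY_OK | NOTIFY_DONE) == 1);
	/* The two checks only disagree once other bits are OR'ed in. */
	assert(handoff_exact(NOTIFY_STOP) == 0);
	assert(handoff_mask(NOTIFY_STOP) == 1);
}
```

So the exact comparison keeps working as long as only NOTIFY_OK and
NOTIFY_DONE are combined; the mask form is the one that stays correct if
other bits can show up in the OR'ed result.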

> But the current notifier semantics are just insane. The whole "we
> return the last return value" is crazy. It's by definition a random
> number, since the whole point of notifiers is that there can be
> multiple, and they aren't "ordered". So the whole "last return value"
> is something I just look at and say: "Whoever designed that is a
> f*cking moron".
>

[...]

>
> Again, almost every notifier user has always been total crap. It's
> just a stupid abstraction.



> "Something happened". "Oh, ok".
>


I have never seen such a concise and apt definition of notifiers before ;-)

Unfortunately, though, what better mechanism do we have for dealing
with things that affect multiple subsystems, like some of the users
mentioned above? Hmm...

Regards,
Srivatsa S. Bhat
