Re: Buggy IPI and MTRR code on low memory

From: Rusty Russell
Date: Wed Jan 28 2009 - 18:26:17 EST


On Thursday 29 January 2009 03:08:14 Steven Rostedt wrote:
>
> While developing the RT git tree I came across this deadlock.
>
> To avoid touching the memory allocator in smp_call_function_many I forced
> the stack use case, the path that would be taken if data fails to
> allocate.
>
> Here's the current code in kernel/smp.c:

Interesting. I simplified smp_call_function_ma{sk,ny}, and introduced this bug (see 54b11e6d57a10aa9d0009efd93873e17bffd5d30).

We used to wait on OOM, yes, but we didn't do them one at a time.

We could restore that quiesce code, or call a function on all online cpus using on-stack data, and have them atomic_dec a counter when they're done (I'm not sure why we didn't do this in the first place: Nick?)

> The problem is that if we use the stack, then we must wait for the
> function to finish. But in the mtrr code, the called functions are waiting
> for the caller to do something after the smp_call_function. Thus we
> deadlock! This mtrr code seems to have been there for a while. At least
> longer than the git history.

I don't see how the *ever* worked then, even with the quiesce stuff.

> The patch creates another flag called CSD_FLAG_RELEASE. If we fail
> to alloc the data and the wait bit is not set, we still use the stack
> but we also set this flag instead of the wait flag. The receiving IPI
> will copy the data locally, and if this flag is set, it will clear it. The
> caller, after sending the IPI, will wait on this flag to be cleared.

Doesn't this break with more than one cpus? I think a refcnt is needed for the general case...

Rusty.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/