Re: [PATCH] move put_task_struct() reaping into a thread [Re:2.6.18-rt1]

From: Esben Nielsen
Date: Wed Sep 27 2006 - 16:29:33 EST

On Wed, 27 Sep 2006, Eric W. Biederman wrote:

Bill Huey (hui) <billh@xxxxxxxxxxxxxxxxx> writes:

On Tue, Sep 26, 2006 at 08:55:41PM -0600, Eric W. Biederman wrote:
Bill Huey (hui) <billh@xxxxxxxxxxxxxxxxx> writes:
This patch moves put_task_struct() reaping into a thread instead of an
RCU callback function as discussed with Esben publically and Ingo privately:

Stupid question.

It's a great question actually.

Why does the rt tree make all calls to put_task_struct an rcu action?
We only need the rcu callback from kernel/exit.c

Because the conversion of memory allocation routines like kmalloc and kfree aren't
safely callable within a preempt_disable critical section since they were incompletely
converted in the -rt. It can run into the sleeping in atomic scenario which can result
in a deadlock since those routines use blocking locks internally in the implementation
now as a result of the spinlock_t conversion to blocking locks.

Interesting. I think the easy solution would just be to assert that put_task_struct
can sleep and to fix any callers that expect differently. I haven't looked very
closely but I don't recall anything that needs put_task_struct to be atomic.
With a function that complex I certainly would not expect it to never sleep unless
it had a big fat comment.

Well I did find an instance where we call put_task_struct with a
spinlock held. Inside of lib/rwsem.c:rwsem_down_failed_common().

Still that may be the only user that cares. I suspect with a little
code rearrangement that case is fixable. It's not like that code is a
fast path or anything. It should just be a matter of passing the
task struct outside of the spinlock before calling put_task_struct.

Nothing else needs those semantics.

Right, blame it on the incomplete conversion of the kmalloc and friends. GFP_ATOMIC is
is kind of meaningless in the -rt tree and it might be a good thing to add something
like GFP_RT_ATOMIC for cases like this to be handled properly and restore that
particular semantic in a more meaningful way.

But this is a path where we are freeing data, so GFP_ATOMIC should not come
into it. I just read through the code and there are not allocations

I agree that put_task_struct is the most common point so this is unlikely
to remove your issues with rcu callbacks but it just seems completely backwards
to increase the number of rcu callbacks in the rt tree.

I'm not sure what mean here, but if you mean that you don't like the RCU API abuse then
I agree with you on that. However, Ingo disagrees and I'm not going to argue it with him.
Although, I'm not going stop you if you do. :)

What I was thinking is that rcu isn't terribly friendly to realtime
activities because it postpones work and can wind up with a lot of
work to do at some random time later which can be bad for latencies.

Only activities with no deadlines are postponed. And therefore RCU is good for the latencies of the application. No high-priority, low-latency task
should bother spend time traversing and freeing a complicated datastructure.
Defer that to some lower priority task.


So I was very surprised to see an rt patch making more things under
rcu protection. Especially as I have heard other developers worried
about rt issues discussing removing the rcu functionality.

My gut feel now that I understand the pieces is that this approach has
all of the hallmarks of a hack, both the kmalloc/kfree thing and even
more calling put_task_struct in an atomic context. If the callers
were fixed put_task_struct could safely sleep so kmalloc/kfree
sleeping would not be a problem.

That put_task_struct uses RCU is a hack to defer the job to a lower priority task. I think the right solution is to defer the job to a lower priority task using a cheaper mechanism. put_task_struct() is used from high priority tasks in the priority inheritance code and should only do the minimal job of defering the real work to another task.



To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at