Re: [PATCH] trace: Check for waiting readers before calling wakeup()

From: Vaibhav Nagarnaik
Date: Thu May 05 2011 - 02:06:23 EST


On Wed, May 4, 2011 at 8:53 PM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Wed, May 4, 2011 at 2:26 PM, Frederic Weisbecker <fweisbec@xxxxxxxxx> wrote:
>> On Wed, May 04, 2011 at 02:21:28PM -0700, Vaibhav Nagarnaik wrote:
>>> Steven, Frederic
>>>
>>> Does this patch look ok?
>>>
>>> Vaibhav Nagarnaik
>>>
>>
>> No problem for me. I'll let Steve give the last word.
>
> Almost every time somebody does this optimization, there's a race.
>
> The thing is, because waitqueue_active() implies no barriers, if there
> is somebody adding itself to the waitqueue just at the same time as
> somebody is waking things up, the waiter may lose wakeup events -
> there is no memory barriers to guarantee that the waker will see
> either the waiter, or the waiter will see the event that it waited
> for.
>
> I dunno. It's sometimes safe, but it's equally often not safe unless
> there is some other locking around the thing. So I'd be a bit nervous.
> The tracer people would need to do some serious thinking about it.
>
>                   Linus
>

Thanks for looking at the patch and weighing in.

The problem I was trying to solve here is the contention when trying
to wake up a process. The process calling wakeup() has to take a
spinlock. When concurrent processes are getting traced, they contend
on locking the spinlock and cause large tracing latency. Having safe
locking around this code would defeat the purpose of this patch.

One could argue that if the trace reader does not get a wakeup() event
due to race, it will get it when the tracepoint is hit the next time.

Also, it seems the current tracers cannot rely on polling on the trace
file. The wakeup() is called only when the syscalls get traced. If any
other events are traced, the trace logging function (in
include/linux/syscalls.h) does not call wakeup(). This was done due to
hard lockup in the kernel if sched events were being traced.

I am new to this area, so Frederic and/or Steven could address your
concerns better.



Vaibhav Nagarnaik
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/