Re: [PATCH 2/4] nohz: Synchronize sleep time stats with seqlock

From: Frederic Weisbecker
Date: Fri Aug 16 2013 - 12:37:19 EST


2013/8/16 Frederic Weisbecker <fweisbec@xxxxxxxxx>:
> On Fri, Aug 16, 2013 at 06:02:01PM +0200, Oleg Nesterov wrote:
>> Thanks Frederic!
>>
>> I'll try to read this series carefully later. Not that I think
>> I can help, you certainly understand this much better.
>>
>> Just one question below,
>>
>> On 08/16, Frederic Weisbecker wrote:
>> >
>> > @@ -499,12 +509,15 @@ u64 get_cpu_iowait_time_us(int cpu, u64 *last_update_time)
>> >  	if (last_update_time)
>> >  		*last_update_time = ktime_to_us(now);
>> >
>> > -	if (ts->idle_active && nr_iowait_cpu(cpu) > 0) {
>> > -		ktime_t delta = ktime_sub(now, ts->idle_entrytime);
>> > -		iowait = ktime_add(ts->iowait_sleeptime, delta);
>> > -	} else {
>> > -		iowait = ts->iowait_sleeptime;
>> > -	}
>> > +	do {
>> > +		seq = read_seqcount_begin(&ts->sleeptime_seq);
>> > +		if (ts->idle_active && nr_iowait_cpu(cpu) > 0) {
>> > +			ktime_t delta = ktime_sub(now, ts->idle_entrytime);
>> > +			iowait = ktime_add(ts->iowait_sleeptime, delta);
>> > +		} else {
>> > +			iowait = ts->iowait_sleeptime;
>> > +		}
>> > +	} while (read_seqcount_retry(&ts->sleeptime_seq, seq));
>>
>> Unless I misread this patch, this is still a bit racy.
>>
>> Suppose it is called on CPU_0 and cpu == 1. Suppose that
>> ts->idle_active == T and nr_iowait_cpu(cpu) == 1.
>>
>> So we return iowait_sleeptime + delta.
>>
>> Suppose that we call get_cpu_iowait_time_us() again. By this time
>> the task which incremented ->nr_iowait can be woken up on another
>> CPU, and it can do atomic_dec(rq->nr_iowait). So the next time
>> we return iowait_sleeptime, and this is not monotonic again.
>>
>> No?
>
> OTOH, io_schedule() does:
>
>	atomic_inc(&rq->nr_iowait);
>	schedule();
>	atomic_dec(&rq->nr_iowait);
>
> How do we handle that when the task is migrated after it goes to sleep? I don't
> see nr_iowait being decremented on the src CPU and incremented on the dest CPU.
>
> Nor do I see anything that prevents iowait tasks from being migrated.

My bad, the decrement happens on the src CPU anyway.
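
To make the scenario Oleg describes concrete, here is a minimal user-space
sketch in C. It is not kernel code: struct cpu_model, model_iowait_time_us()
and model_race() are hypothetical names made up for illustration, and the
time values are arbitrary. It only mirrors the branch from the quoted hunk
and assumes, as noted above, that the woken task decrements the src CPU's
counter even if it wakes up elsewhere.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for the per-CPU state in the hunk. */
struct cpu_model {
	bool idle_active;          /* models ts->idle_active */
	int nr_iowait;             /* models rq->nr_iowait of this CPU */
	uint64_t idle_entrytime;   /* models ts->idle_entrytime, in us */
	uint64_t iowait_sleeptime; /* models ts->iowait_sleeptime, in us */
};

/* Same branch as the quoted get_cpu_iowait_time_us() hunk, minus the seqcount. */
static uint64_t model_iowait_time_us(const struct cpu_model *c, uint64_t now)
{
	assert(now >= c->idle_entrytime);
	if (c->idle_active && c->nr_iowait > 0)
		return c->iowait_sleeptime + (now - c->idle_entrytime);
	return c->iowait_sleeptime;
}

/*
 * Two reads of CPU 1's iowait time with the sleeper woken in between.
 * Returns true if the second read is smaller than the first.
 */
static bool model_race(uint64_t *first, uint64_t *second)
{
	struct cpu_model cpu1 = {
		.idle_active = true,
		.nr_iowait = 1,
		.idle_entrytime = 1000,
		.iowait_sleeptime = 500,
	};

	*first = model_iowait_time_us(&cpu1, 1400);  /* 500 + (1400 - 1000) */

	/* The sleeper wakes on another CPU and drops the src CPU's counter. */
	cpu1.nr_iowait--;

	*second = model_iowait_time_us(&cpu1, 1500); /* falls back to 500 */
	return *second < *first;
}
```

With these made-up numbers the first read gives 900 and the second gives 500,
although CPU 1 stayed idle the whole time. The seqcount retry loop does not
help in this model because the counter changes between the two calls, not
inside one read section.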