Re: [PATCH] sched: schedule_raw_spin_unlock() and schedule_spin_unlock()

From: Kirill Tkhai
Date: Mon Jun 17 2013 - 12:18:51 EST




17.06.2013, 18:29, "Steven Rostedt" <rostedt@xxxxxxxxxxx>:
> On Fri, 2013-06-14 at 18:40 +0400, Kirill Tkhai wrote:
>
>>  diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>>  index 58453b8..381e493 100644
>>  --- a/kernel/sched/core.c
>>  +++ b/kernel/sched/core.c
>>  @@ -3125,6 +3125,30 @@ asmlinkage void __sched preempt_schedule_irq(void)
>>           exception_exit(prev_state);
>>   }
>>
>>  +/*
>>  + * schedule_raw_spin_unlock - unlock raw_spinlock and call schedule()
>>  + *
>>  + * Should be used instead of the constructions
>>  + *   1) raw_spin_unlock_irq(lock);
>>  + *      schedule();
>>  + *   or
>>  + *   2) raw_spin_unlock_irqrestore(lock, flags);
>>  + *      schedule();
>
> Is there a place that does #2? If interrupts were disabled, the flags
> would keep them disabled and that would not be good when calling
> schedule.

Some drivers use _irqsave even if they are certainly enabled, I was thinking
about them. This is mistake, but people use.

Now I'm agree with you, I'll remove advice #2 to not multiply this errors.

>>  + * where they have to be.
>>  + *
>>  + * It helps to prevent excess preempt_schedule() during the unlocking,
>>  + * which can be called on preemptible kernel.
>>  + * Returns with irqs enabled.
>>  + */
>>  +void __sched schedule_raw_spin_unlock(raw_spinlock_t *lock)
>>  +{
>
> I agree with the idea of adding this, but I don't like this
> implementation. Also, if this is to enable interrupts, the name must
> represent that:
>
>   schedule_raw_spin_unlock_irq()
>
> You can't just enable them if they were not disabled. That will break
> things like lockdep.

Ok, and in addition may be useful schedule_raw_spin_unlock() for
use in cases like this happens in fs/*

>>  + preempt_disable();
>>  + raw_spin_unlock_irq(lock);
>>  + sched_preempt_enable_no_resched();
>
> This is the easy way of implementing this, but it does add a slight
> overhead here. Adding and subtracting preempt count just to prevent the
> disable does add a bit more computation. I've done this in the tracing
> code, but its within the tracing and in an unlikely path. The overhead
> is not common. But I can see this being in a fast path and not something
> that we want to add overhead to.
>
> The ideal solution is not the easy one. It is to introduce a real
> schedule_raw_spin_unlock() that is a copy of raw_spin_unlock() that does
> not call preempt_enable, but calls preempt_enable_no_resched() instead,
> and then does the schedule.

Ok

> -- Steve
>
>>  + schedule();
>>  +}
>>  +EXPORT_SYMBOL(schedule_raw_spin_unlock);
>>  +
>>   #endif /* CONFIG_PREEMPT */
>>
>>   int default_wake_function(wait_queue_t *curr, unsigned mode, int wake_flags,
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/