Re: [PATCH] sched: schedule_raw_spin_unlock() andschedule_spin_unlock()

From: Steven Rostedt
Date: Mon Jun 17 2013 - 10:29:11 EST


On Fri, 2013-06-14 at 18:40 +0400, Kirill Tkhai wrote:
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 58453b8..381e493 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3125,6 +3125,30 @@ asmlinkage void __sched preempt_schedule_irq(void)
> exception_exit(prev_state);
> }
>
> +/*
> + * schedule_raw_spin_unlock - unlock raw_spinlock and call schedule()
> + *
> + * Should be used instead of the constructions
> + * 1) raw_spin_unlock_irq(lock);
> + * schedule();
> + * or
> + * 2) raw_spin_unlock_irqrestore(lock, flags);
> + * schedule();

Is there a place that does #2? If interrupts were disabled, the flags
would keep them disabled and that would not be good when calling
schedule.

> + * where they have to be.
> + *
> + * It helps to prevent excess preempt_schedule() during the unlocking,
> + * which can be called on preemptible kernel.
> + * Returns with irqs enabled.
> + */
> +void __sched schedule_raw_spin_unlock(raw_spinlock_t *lock)
> +{

I agree with the idea of adding this, but I don't like this
implementation. Also, if this is to enable interrupts, the name must
represent that:

schedule_raw_spin_unlock_irq()

You can't just enable them if they were not disabled. That will break
things like lockdep.


> + preempt_disable();
> + raw_spin_unlock_irq(lock);
> + sched_preempt_enable_no_resched();

This is the easy way of implementing this, but it does add a slight
overhead here. Adding and subtracting preempt count just to prevent the
disable does add a bit more computation. I've done this in the tracing
code, but its within the tracing and in an unlikely path. The overhead
is not common. But I can see this being in a fast path and not something
that we want to add overhead to.

The ideal solution is not the easy one. It is to introduce a real
schedule_raw_spin_unlock() that is a copy of raw_spin_unlock() that does
not call preempt_enable, but calls preempt_enable_no_resched() instead,
and then does the schedule.

-- Steve



> + schedule();
> +}
> +EXPORT_SYMBOL(schedule_raw_spin_unlock);
> +
> #endif /* CONFIG_PREEMPT */
>
> int default_wake_function(wait_queue_t *curr, unsigned mode, int wake_flags,


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/