[tip:sched/core] sched, net: Fixup busy_loop_us_clock()

From: tip-bot for Peter Zijlstra
Date: Mon Jan 13 2014 - 11:46:29 EST


Commit-ID: 37089834528be3ef8cbf927e47c753b3e272a856
Gitweb: http://git.kernel.org/tip/37089834528be3ef8cbf927e47c753b3e272a856
Author: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
AuthorDate: Tue, 19 Nov 2013 16:13:38 +0100
Committer: Ingo Molnar <mingo@xxxxxxxxxx>
CommitDate: Mon, 13 Jan 2014 17:39:11 +0100

sched, net: Fixup busy_loop_us_clock()

The only valid use of preempt_enable_no_resched() is if the very next
line is schedule() or if we know preemption cannot actually be enabled
by that statement because other preempt_count 'refs' are still held.

This busy_poll stuff looks to be completely and utterly broken:
sched_clock() can return utter garbage with interrupts enabled (rare
but still) and it can drift unbounded between CPUs.

This means that if you get preempted/migrated and your new CPU is
years behind the previous CPU, we get to busy spin for a _very_ long
time.
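
To make the failure mode concrete, here is a minimal userspace sketch (not
kernel code). It assumes a busy loop that computes its end time from one
CPU's clock and later compares it against a reading that may come from
another CPU's clock. The helper names and the sample values are
hypothetical and exist only for illustration.

```c
#include <stdint.h>

/* Hypothetical model: end time computed from the clock of the CPU on
 * which the busy loop started. */
static uint64_t spin_deadline_us(uint64_t start_us, uint64_t budget_us)
{
	return start_us + budget_us;
}

/* Hypothetical model: microseconds left to spin, where now_us may have
 * been read from a different (unsynchronized) per-CPU clock. */
static uint64_t spin_remaining_us(uint64_t end_us, uint64_t now_us)
{
	return now_us < end_us ? end_us - now_us : 0;
}
```

With a 50 us budget started at 1000000 us, a later reading of 1000010 us
from the same clock leaves 40 us to spin. A reading of 1000 us from a
clock that is about one second behind leaves 999050 us, so the loop spins
for roughly a second instead of 50 us.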

There is a _REASON_ sched_clock() warns about preemptability -
papering over it with a preempt_disable()/preempt_enable_no_resched()
is just terminal brain damage on so many levels.

Replace sched_clock() usage with local_clock() which has a bounded
drift between CPUs (<2 jiffies).
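
As a side note on units: busy_loop_us_clock() turns the nanosecond clock
value into approximate microseconds with ">> 10", which divides by 1024
rather than 1000. The following userspace sketch only illustrates the size
of that approximation; the helper names are made up for this example and
are not kernel API.

```c
#include <stdint.h>

/* Approximate ns -> us conversion, as done with ">> 10" (divide by 1024). */
static uint64_t ns_to_us_shift(uint64_t ns)
{
	return ns >> 10;
}

/* Exact ns -> us conversion, for comparison. */
static uint64_t ns_to_us_exact(uint64_t ns)
{
	return ns / 1000;
}
```

For 50000 ns the shift yields 48 where the exact division yields 50, so
the shifted value runs a little over 2% low, which is harmless when only
a bounded average matters.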

There is a further problem with the entire busy wait poll thing in
that the spin time is additive to the syscall timeout, not inclusive.
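
The difference between additive and inclusive accounting can be sketched
in userspace as follows. This is only a model of the arithmetic, assuming
a caller that spins for spun_us and then blocks; the function names are
hypothetical and do not correspond to anything in the kernel.

```c
#include <stdint.h>

/* Additive: the full timeout is still waited after the spin. */
static uint64_t total_wait_additive_us(uint64_t timeout_us, uint64_t spun_us)
{
	return spun_us + timeout_us;
}

/* Inclusive: time already spent spinning is charged against the timeout. */
static uint64_t total_wait_inclusive_us(uint64_t timeout_us, uint64_t spun_us)
{
	uint64_t block_us = timeout_us > spun_us ? timeout_us - spun_us : 0;

	return spun_us + block_us;
}
```

With a 100 us timeout and 50 us of spinning, the additive scheme can wait
up to 150 us in total, whereas the inclusive scheme stays at 100 us.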

Reviewed-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Signed-off-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: David S. Miller <davem@xxxxxxxxxxxxx>
Cc: rui.zhang@xxxxxxxxx
Cc: jacob.jun.pan@xxxxxxxxxxxxxxx
Cc: Mike Galbraith <bitbucket@xxxxxxxxx>
Cc: hpa@xxxxxxxxx
Cc: Arjan van de Ven <arjan@xxxxxxxxxxxxxxx>
Cc: lenb@xxxxxxxxxx
Cc: rjw@xxxxxxxxxxxxx
Cc: Eliezer Tamir <eliezer.tamir@xxxxxxxxxxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Link: http://lkml.kernel.org/r/20131119151338.GF3694@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
---
 include/net/busy_poll.h | 19 +------------------
 1 file changed, 1 insertion(+), 18 deletions(-)

diff --git a/include/net/busy_poll.h b/include/net/busy_poll.h
index 829627d..1d67fb6 100644
--- a/include/net/busy_poll.h
+++ b/include/net/busy_poll.h
@@ -42,27 +42,10 @@ static inline bool net_busy_loop_on(void)
 	return sysctl_net_busy_poll;
 }
 
-/* a wrapper to make debug_smp_processor_id() happy
- * we can use sched_clock() because we don't care much about precision
- * we only care that the average is bounded
- */
-#ifdef CONFIG_DEBUG_PREEMPT
-static inline u64 busy_loop_us_clock(void)
-{
-	u64 rc;
-
-	preempt_disable_notrace();
-	rc = sched_clock();
-	preempt_enable_no_resched_notrace();
-
-	return rc >> 10;
-}
-#else /* CONFIG_DEBUG_PREEMPT */
 static inline u64 busy_loop_us_clock(void)
 {
-	return sched_clock() >> 10;
+	return local_clock() >> 10;
 }
-#endif /* CONFIG_DEBUG_PREEMPT */
 
 static inline unsigned long sk_busy_loop_end_time(struct sock *sk)
 {