[PATCH 6/7] sched: Clean up preempt_enable_no_resched() abuse

From: Peter Zijlstra
Date: Wed Nov 20 2013 - 11:34:01 EST


The only valid uses of preempt_enable_no_resched() are when the very next
statement is schedule(), or when we know the statement cannot actually
enable preemption because further preempt_count 'refs' are still held.

As to the busy_poll mess: that looks completely and utterly broken.
sched_clock() can return utter garbage with interrupts enabled (rare,
but still), and it can drift unbounded between CPUs, so if you get
preempted/migrated and your new CPU is years behind the previous CPU,
we get to busy spin for a _very_ long time. There is a _REASON_
sched_clock() warns about preemptibility; papering over it with a
preempt_disable()/preempt_enable_no_resched() pair is just terminal
brain damage on so many levels.

Cc: Arjan van de Ven <arjan@xxxxxxxxxxxxxxx>
Cc: lenb@xxxxxxxxxx
Cc: rjw@xxxxxxxxxxxxx
Cc: Eliezer Tamir <eliezer.tamir@xxxxxxxxxxxxxxx>
Cc: Chris Leech <christopher.leech@xxxxxxxxx>
Cc: David S. Miller <davem@xxxxxxxxxxxxx>
Cc: rui.zhang@xxxxxxxxx
Cc: jacob.jun.pan@xxxxxxxxxxxxxxx
Cc: Mike Galbraith <bitbucket@xxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: hpa@xxxxxxxxx
Reviewed-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Signed-off-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
---
include/net/busy_poll.h | 20 ++++++++------------
net/ipv4/tcp.c | 4 ++--
2 files changed, 10 insertions(+), 14 deletions(-)

--- a/include/net/busy_poll.h
+++ b/include/net/busy_poll.h
@@ -42,27 +42,23 @@ static inline bool net_busy_loop_on(void
return sysctl_net_busy_poll;
}

-/* a wrapper to make debug_smp_processor_id() happy
- * we can use sched_clock() because we don't care much about precision
- * we only care that the average is bounded
- */
-#ifdef CONFIG_DEBUG_PREEMPT
static inline u64 busy_loop_us_clock(void)
{
u64 rc;

+ /*
+ * XXX with interrupts enabled sched_clock() can return utter garbage
+ * Furthermore, it can have unbounded drift between CPUs, so the below
+ * usage is terminally broken and only serves to shut up a valid debug
+ * warning.
+ */
+
preempt_disable_notrace();
rc = sched_clock();
- preempt_enable_no_resched_notrace();
+ preempt_enable_notrace();

return rc >> 10;
}
-#else /* CONFIG_DEBUG_PREEMPT */
-static inline u64 busy_loop_us_clock(void)
-{
- return sched_clock() >> 10;
-}
-#endif /* CONFIG_DEBUG_PREEMPT */

static inline unsigned long sk_busy_loop_end_time(struct sock *sk)
{
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1623,11 +1623,11 @@ int tcp_recvmsg(struct kiocb *iocb, stru
(len > sysctl_tcp_dma_copybreak) && !(flags & MSG_PEEK) &&
!sysctl_tcp_low_latency &&
net_dma_find_channel()) {
- preempt_enable_no_resched();
+ preempt_enable();
tp->ucopy.pinned_list =
dma_pin_iovec_pages(msg->msg_iov, len);
} else {
- preempt_enable_no_resched();
+ preempt_enable();
}
}
#endif

