Re: [RFC,PATCH] loopback: calls netif_receive_skb() instead of netif_rx()

From: Eric Dumazet
Date: Tue Apr 01 2008 - 05:19:31 EST


Ingo Molnar wrote:
> * Eric Dumazet <dada1@xxxxxxxxxxxxx> wrote:
>
>> The problem is checking available space:
>>
>> It depends on whether the stack grows up or down, and on whether the
>> caller is running on the process stack, a softirq stack, or even a
>> hardirq stack.
>
> ok - i wish such threads were on lkml so that everyone, not just the
> netdev cabal, can read it. It's quite ugly, but if we want to check
> stack free space i'd suggest you put a stack_can_recurse() call into
> arch/x86/kernel/process.c and offer a default __weak implementation in
> kernel/fork.c that always returns 0.
>
> the rule on x86 should be something like this: on 4K stacks and 64-bit
> [which have irqstacks] free stack space can go as low as 25%. On 8K
> stacks [which don't have irqstacks but nest irqs] it should not go
> below 50% before falling back to the explicitly queued packet branch.
>
> this way other pieces of kernel code can choose between on-stack fast
> recursion and explicit iterators. Although i'm not sure i like the
> whole concept to begin with ...


Hi Ingo

I took the time to prepare a patch to implement arch_stack_can_recurse() as you suggested.

Thank you

[PATCH] x86 : arch_stack_can_recurse() introduction

Some paths in the kernel would like to choose between on-stack fast recursion and explicit iterators.

One identified spot is in the net loopback driver, where we can avoid netif_rx() and its slowdown if sufficient stack space is available.

We introduce a generic arch_stack_can_recurse(), which defaults to a weak function returning 0.

On x86, we implement the following logic:

32 bits and 4K stacks (separate irq stacks): can use up to 25% of stack
64 bits, 8K stacks (separate irq stacks): can use up to 25% of stack
32 bits and 8K stacks (no irq stacks): can use up to 50% of stack

Example of use in drivers/net/loopback.c, function loopback_xmit():

	if (arch_stack_can_recurse())
		netif_receive_skb(skb);	/* immediate delivery to stack */
	else
		netif_rx(skb);		/* defer to softirq handling */

Signed-off-by: Eric Dumazet <dada1@xxxxxxxxxxxxx>


diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 0e613e7..6edc1d3 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -43,3 +43,25 @@ void arch_task_cache_init(void)
 				  __alignof__(union thread_xstate),
 				  SLAB_PANIC, NULL);
 }
+
+
+/*
+ * Used to check if we can recurse without risking stack overflow
+ * Rules are:
+ * 32 bits and 4K stacks (separate irq stacks): can use up to 25% of stack
+ * 64 bits, 8K stacks (separate irq stacks): can use up to 25% of stack
+ * 32 bits and 8K stacks (no irq stacks): can use up to 50% of stack
+ */
+#if defined(CONFIG_4KSTACKS) || defined(CONFIG_X86_64)
+# define STACK_RECURSE_LIMIT (THREAD_SIZE/4)
+#else
+# define STACK_RECURSE_LIMIT (THREAD_SIZE/2)
+#endif
+
+int arch_stack_can_recurse(void)
+{
+	unsigned long offset_stack = current_stack_pointer & (THREAD_SIZE - 1);
+	unsigned long avail_stack = offset_stack - sizeof(struct thread_info);
+
+	return avail_stack >= STACK_RECURSE_LIMIT;
+}
diff --git a/include/asm-x86/thread_info_64.h b/include/asm-x86/thread_info_64.h
index f23fefc..9a913c4 100644
--- a/include/asm-x86/thread_info_64.h
+++ b/include/asm-x86/thread_info_64.h
@@ -60,6 +60,9 @@ struct thread_info {
 #define init_thread_info	(init_thread_union.thread_info)
 #define init_stack		(init_thread_union.stack)

+/* how to get the current stack pointer from C */
+register unsigned long current_stack_pointer asm("rsp") __used;
+
 static inline struct thread_info *current_thread_info(void)
 {
 	struct thread_info *ti;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index ca720f0..445b8da 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1600,6 +1600,8 @@ union thread_union {
 	unsigned long stack[THREAD_SIZE/sizeof(long)];
 };

+extern int arch_stack_can_recurse(void);
+
 #ifndef __HAVE_ARCH_KSTACK_END
 static inline int kstack_end(void *addr)
 {
diff --git a/kernel/fork.c b/kernel/fork.c
index a19df75..cd5d1e1 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -136,6 +136,11 @@ void __attribute__((weak)) arch_task_cache_init(void)
 {
 }

+int __attribute__((weak)) arch_stack_can_recurse(void)
+{
+	return 0;
+}
+
 void __init fork_init(unsigned long mempages)
 {
 #ifndef __HAVE_ARCH_TASK_STRUCT_ALLOCATOR