Re: [v7 0/8] Reduce cross CPU IPI interference

From: Peter Zijlstra
Date: Fri Feb 10 2012 - 13:38:33 EST


On Thu, 2012-02-02 at 10:41 -0500, Chris Metcalf wrote:
>
> /*
> * Quiesce the timer interrupt before returning to user space after a
> * system call. Normally if a task on a dataplane core makes a
> * syscall, the system will run one or more timer ticks after the
> * syscall has completed, causing unexpected interrupts in userspace.
> * Setting DP_QUIESCE avoids that problem by having the kernel "hold"
> * the task in kernel mode until the timer ticks are complete. This
> * will make syscalls dramatically slower.
> *
> * If multiple dataplane tasks are scheduled on a single core, this
> * in effect silently disables DP_QUIESCE, which allows the tasks to make
> * progress, but without actually disabling the timer tick.
> */
> #define DP_QUIESCE 0x1

This is what Frederic's work does.

>
> /*
> * Disallow the application from entering the kernel in any way,
> * unless it calls set_dataplane() again without this bit set.
> * Issuing any other syscall or causing a page fault would generate a
> * kernel message, and "kill -9" the process.
> *
> * Setting this flag automatically sets DP_QUIESCE as well.
> */
> #define DP_STRICT 0x2

This is a debug feature.. you'd better know what your own software does.

>
> /*
> * Debug dataplane interrupts, so that if any interrupt source
> * attempts to involve a dataplane cpu, a kernel message and stack
> * backtrace will be generated on the console. As this warning is a
> * slow event, it may make sense to avoid this mode in production code
> * to avoid making any possible interrupts even more heavyweight.
> *
> * Setting this flag automatically sets DP_QUIESCE as well.
> */
> #define DP_DEBUG 0x4

This too is a debug feature, one that doesn't cover all possible
scenarios.

> /*
> * Cause all memory mappings to be populated in the page table.
> * Specifying this when entering dataplane mode ensures that no future
> * page fault events will occur to cause interrupts into the Linux
> * kernel, as long as no new mappings are installed by mmap(), etc.
> * Note that since the hardware TLB is of finite size, there will
> * still be the potential for TLB misses that the hypervisor handles,
> * either via its software TLB cache (fast path) or by walking the
> * kernel page tables (slow path), so touching large amounts of memory
> * will still incur hypervisor interrupt overhead.
> */
> #define DP_POPULATE 0x8

mmap()'s MAP_POPULATE will pre-populate the stuff for you, as will
mlock(); the latter will (mostly) ensure the pages stay around.
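
Roughly, and only as a sketch: the userspace side could look something
like the below. The helper names map_populated() and pin_region() are
made up for illustration, and the mapping is assumed to be anonymous
and private; only mmap(), MAP_POPULATE and mlock() are the real
interfaces.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map len bytes of anonymous memory and ask the
 * kernel to populate the page tables up front (MAP_POPULATE is
 * Linux-specific). Returns NULL on failure, with errno set by mmap().
 */
static void *map_populated(size_t len)
{
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE, -1, 0);

	return p == MAP_FAILED ? NULL : p;
}

/*
 * Hypothetical helper: lock the region into RAM so the pages are not
 * reclaimed later. Returns 0 on success, -1 with errno set, e.g. when
 * RLIMIT_MEMLOCK is too small for the request.
 */
static int pin_region(void *p, size_t len)
{
	return mlock(p, len);
}
```

A caller would do map_populated(len) once at setup, optionally followed
by pin_region(p, len), before entering its fast path. mlock() alone
also faults the pages in, so combining the two is belt and braces
rather than a requirement.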
