Re: rq_affinity doesn't seem to work?

From: Jens Axboe
Date: Tue Jul 12 2011 - 16:30:41 EST


On 2011-07-12 21:03, Jiang, Dave wrote:
> Jens,
> I'm doing some performance tuning for the Intel isci SAS controller
> driver, and I noticed some interesting numbers with mpstat. Looking at
> the numbers, it seems that rq_affinity is not moving the request
> completion to the request submission CPU. Using fio to saturate the
> system with 512B I/Os, I noticed that all I/Os are bound to the CPUs
> (CPUs 6 and 7) that service the hard irqs. I put a quick hack into the
> driver so that it records the CPU during request construction and then
> steers the scsi->done() calls back to those CPUs.
> With this simple hack, mpstat shows that the soft irq contexts are now
> distributed. I observed a significant performance increase. The iowait%
> went from the 30s and 40s to low single digits, approaching 0. Any ideas
> what could be happening with the rq_affinity logic? I'm assuming
> rq_affinity should behave the way my hacked solution is behaving. This
> is running on an 8-core, single-CPU Sandy Bridge based system with
> hyper-threading turned off. The two MSI-X interrupts on the controller
> are tied to CPUs 6 and 7 respectively via /proc/irq/X/smp_affinity. I'm
> running fio with 8 SAS disks and 8 threads.
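
(For illustration only -- one way the steering described above could be
wired up in a driver. The structure and names below are invented for the
example and are not the actual isci change; the idea is simply to record
the submitting CPU and push the scsi_done() call back there, here via a
per-request work item queued on that CPU.)

#include <linux/workqueue.h>
#include <linux/smp.h>
#include <scsi/scsi_cmnd.h>

struct my_req_ctx {
	int submit_cpu;			/* CPU that constructed the request */
	struct scsi_cmnd *cmd;		/* command to complete */
	struct work_struct done_work;
};

/* Request construction path: remember which CPU built the request. */
static void my_record_submit_cpu(struct my_req_ctx *ctx)
{
	ctx->submit_cpu = raw_smp_processor_id();
}

/* Runs on ctx->submit_cpu and finishes the command there. */
static void my_done_work_fn(struct work_struct *work)
{
	struct my_req_ctx *ctx = container_of(work, struct my_req_ctx,
					      done_work);

	ctx->cmd->scsi_done(ctx->cmd);
}

/* Hard irq completion path: steer the done() call to the submitting CPU. */
static void my_steer_completion(struct my_req_ctx *ctx)
{
	if (ctx->submit_cpu == smp_processor_id()) {
		ctx->cmd->scsi_done(ctx->cmd);
	} else {
		INIT_WORK(&ctx->done_work, my_done_work_fn);
		queue_work_on(ctx->submit_cpu, system_wq, &ctx->done_work);
	}
}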

It's probably the grouping; we need to do something about that. Does the
patch below make it behave as you expect?

diff --git a/block/blk.h b/block/blk.h
index d658628..17d53d8 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -157,6 +157,7 @@ static inline int queue_congestion_off_threshold(struct request_queue *q)
 
 static inline int blk_cpu_to_group(int cpu)
 {
+#if 0
 	int group = NR_CPUS;
 #ifdef CONFIG_SCHED_MC
 	const struct cpumask *mask = cpu_coregroup_mask(cpu);
@@ -168,6 +169,7 @@ static inline int blk_cpu_to_group(int cpu)
 #endif
 	if (likely(group < NR_CPUS))
 		return group;
+#endif
 	return cpu;
 }
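
(For context, and only as a paraphrase rather than the exact code in
block/blk-softirq.c: blk_cpu_to_group() maps a CPU to the first CPU of
its core/SMT group, and the completion path effectively compares groups
when deciding whether to finish a request on the irq CPU or kick it back
to the submitter. Roughly, assuming blk_cpu_to_group() from block/blk.h
is in scope:)

/*
 * Rough outline of the rq_affinity placement decision; a paraphrase for
 * illustration, not verbatim kernel code.  Returns true if the request
 * may be completed on the irq CPU, false if it should be punted back to
 * the submitting CPU (IPI + BLOCK_SOFTIRQ there).
 */
static bool complete_locally(int submit_cpu, int irq_cpu)
{
	/*
	 * With the grouping in place, every CPU in the same core group as
	 * the submitter counts as "local", so on a single-socket box with
	 * the hard irqs pinned to CPUs 6 and 7 the softirq work piles up
	 * there as well.  With the #if 0 above, blk_cpu_to_group() becomes
	 * the identity mapping and only an exact CPU match stays local.
	 */
	return blk_cpu_to_group(submit_cpu) == blk_cpu_to_group(irq_cpu);
}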


--
Jens Axboe
