Slow swapping even on fast infiniband

From: Spelic
Date: Fri Oct 29 2010 - 20:43:08 EST


Hello all lkml,

I have just set up two servers connected through iSCSI over InfiniBand (SCST / SRP).

The "target" end exposes a ramdisk over SRP, and the "initiator" end uses that device as swap.
(I am trying to aggregate the memory of a few computers in order to run computations that do not fit in the RAM of a single machine.)

This remote SRP disk is very fast: around 1 GByte/sec when I read from or write to it with dd at bs=4K from the initiator computer. So the IB link is not the bottleneck.

However, if I use this disk as a swap device on the "initiator" computer, I cannot get more than 150MB/sec of reads + 150MB/sec of writes from/to the swap.
I can see these figures with iostat, and I can roughly confirm them from the time it takes my C++ memory-sweep test to sweep all the RAM+swap for a few rounds.
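
A sweep test of this kind can be sketched roughly as follows. This is only an illustration, not the actual test program: the names sweep() and mib_per_sec(), the 4096-byte page size and the one-byte-per-page access pattern are all assumptions. The buffer has to be larger than free RAM for the loop to exercise swap at all.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: touch one byte in every page of `buf`, `rounds` times.
// Each access reads and writes the byte, so the page must be resident and
// becomes dirty; once the buffer exceeds free RAM the kernel has to swap
// pages out and back in. Returns the total number of page touches.
std::size_t sweep(std::vector<std::uint8_t>& buf, int rounds,
                  std::size_t page_size = 4096) {
    assert(page_size > 0);
    std::size_t touched = 0;
    for (int r = 0; r < rounds; ++r) {
        for (std::size_t off = 0; off < buf.size(); off += page_size) {
            buf[off] = static_cast<std::uint8_t>(buf[off] + 1);
            ++touched;
        }
    }
    return touched;
}

// Convert a number of page touches done in `seconds` into MiB/s,
// assuming every touch moved one page of `page_size` bytes.
double mib_per_sec(std::size_t pages, double seconds,
                   std::size_t page_size = 4096) {
    return static_cast<double>(pages) * static_cast<double>(page_size)
           / (1024.0 * 1024.0) / seconds;
}
```

Timing the call to sweep() with std::chrono::steady_clock and passing the result to mib_per_sec() gives a figure that can be compared against what iostat reports for the swap device.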

Why is kswapd so slow?
Is there a way to swap pages faster, such as with some kind of readahead or by somehow swapping larger chunks together?

I tweaked lots of settings in /sys/block/sdc/queue/ (scheduler, nr_requests, queue_depth) and in /proc/sys/vm/ (dirty_ratio, dirty_background_ratio, etc.), but 150MB/sec is the most I could obtain.
Remember that this disk performs almost 1GB/sec in write and read tests with dd bs=4K. (The SRP disk is being used as a whole device: no partitions, no LVM, no RAID.)
Is Linux simply unable to swap faster than this?
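
When comparing runs it helps to record which values were actually in effect. A minimal sketch of reading them back from sysfs/procfs is below; read_setting() and active_scheduler() are hypothetical helper names, and the only kernel behaviour assumed is that the scheduler file marks the active scheduler with square brackets.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical helper: return the first line of a sysfs/procfs file,
// or an empty string if the file cannot be opened.
std::string read_setting(const std::string& path) {
    std::ifstream in(path);
    std::string line;
    if (in) std::getline(in, line);
    return line;
}

// Extract the active I/O scheduler from a line such as
// "noop deadline [cfq]". If no bracketed entry is found,
// the line is returned unchanged.
std::string active_scheduler(const std::string& line) {
    const std::string::size_type open = line.find('[');
    if (open == std::string::npos) return line;
    const std::string::size_type close = line.find(']', open);
    if (close == std::string::npos) return line;
    return line.substr(open + 1, close - open - 1);
}
```

For example, active_scheduler(read_setting("/sys/block/sdc/queue/scheduler")) gives the scheduler in use, and read_setting("/proc/sys/vm/dirty_ratio") gives the current dirty ratio, so both can be logged next to each throughput measurement.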

My kernel is 2.6.32 with just a few patches from the SCST people.

Thanks for your help
Spelic
PS: Please keep me in CC if you reply, because I am not subscribed to lkml. I will also check the web archive. Thank you.