Re: [PATCH 1/1] ib_srp: Infiniband srp fast failover patch.

From: Karandeep Chahal
Date: Wed May 30 2012 - 10:41:25 EST


Hi Dave,

As long as we get faster failover, I am happy with Bart's patch.

Currently, when I run IO to several LUNs over multipath and the preferred path goes down, the system hangs until the IO fails over. Even ssh'ing into the systems takes 20-30 seconds. I *suspect* that is because IO is being queued up somewhere, which brings the whole system to its knees.

Thank you for looking at the patch.

Thanks
Karan

On 05/30/2012 01:06 AM, David Dillow wrote:
> On Tue, 2012-05-29 at 17:07 -0400, Karandeep Chahal wrote:
>> Subject: [PATCH] Infiniband srp fast failover patch.
>
> This conflicts with Bart's patches to improve failover; it will be much
> better to use his approach to block the target rather than remove it
> wholesale -- we could have lost connectivity as a transient and may get
> it back quickly if someone grabbed the wrong cable, etc.
>
> Also, we should only kill the one target on DREQ, and we already have a
> pointer to it from the CM context -- no need to search.
>
> It is a good idea to hook into the event mechanism; this is something
> I've long wanted to incorporate (as Vu did in OFED). I'm looking at
> getting Bart's series to a point I can merge it, and I'll pull in your
> ideas -- with credit -- there.
>
> Thanks,