[PATCH] sgi-xp: eliminate false detection of no heartbeat

From: Dean Nelson
Date: Thu Jan 08 2009 - 09:53:44 EST


After XPC has been up and running on multiple partitions for any length of
time, if XPC on one of the partitions is stopped and restarted (either by a
rmmod/insmod or a system restart), it is possible for the XPCs running on
the other partitions to falsely detect a lack of heartbeat from the XPC that
was just restarted. This false detection will occur if the restarted XPC comes
up within the five-seconds preceding one of the other XPC's heartbeat check
(which occurs once every twenty seconds).

The detection of no heartbeat results in the detecting XPC deactivating from
the just restarted XPC. The only remedy is to restart one of the XPCs and
hope that one doesn't hit this five-second window on any of the other
partitions.

Signed-off-by: Dean Nelson <dcn@xxxxxxx>
Signed-off-by: Robin Holt <holt@xxxxxxx>
Cc: stable <stable@xxxxxxxxxx>

---

drivers/misc/sgi-xp/xpc_sn2.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux/drivers/misc/sgi-xp/xpc_sn2.c
===================================================================
--- linux.orig/drivers/misc/sgi-xp/xpc_sn2.c 2008-12-30 12:01:18.000000000 -0600
+++ linux/drivers/misc/sgi-xp/xpc_sn2.c 2009-01-08 08:12:01.000000000 -0600
@@ -899,7 +899,7 @@ xpc_update_partition_info_sn2(struct xpc
dev_dbg(xpc_part, " remote_vars_pa = 0x%016lx\n",
part_sn2->remote_vars_pa);

- part->last_heartbeat = remote_vars->heartbeat;
+ part->last_heartbeat = remote_vars->heartbeat - 1;
dev_dbg(xpc_part, " last_heartbeat = 0x%016lx\n",
part->last_heartbeat);

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/