Re: [PATCH] SGI XPC fails to load when cpu 0 is out of IRQ resources.

From: Andrew Morton
Date: Wed Aug 15 2012 - 18:56:04 EST


On Fri, 3 Aug 2012 14:46:29 -0500
Robin Holt <holt@xxxxxxx> wrote:

> On many of our larger systems, CPU 0 has had all of its IRQ resources
> consumed before XPC loads. Worse cases on machines with multiple
> 10 GigE cards and multiple IB cards have depleted the entire first
> socket of IRQs. This patch makes selecting the node upon which
> IRQs are allocated (as well as all the other GRU Message Queue
> structures) specifiable as a module load param and has a default
> behavior of searching all nodes/cpus for an available resource.
>

Is this problem serious enough to warrant a -stable backport? If you
want it to appear in vendor kernels then I guess "yes".

> +static int
> +xpc_init_mq_node(int nid)
> +{
> +	int cpu;
> +
> +	for_each_cpu(cpu, cpumask_of_node(nid)) {
> +		xpc_activate_mq_uv = xpc_create_gru_mq_uv(XPC_ACTIVATE_MQ_SIZE_UV, nid,
> +				XPC_ACTIVATE_IRQ_NAME,
> +				xpc_handle_activate_IRQ_uv);
> +		if (!IS_ERR(xpc_activate_mq_uv))
> +			break;
> +	}
> +	if (IS_ERR(xpc_activate_mq_uv))
> +		return PTR_ERR(xpc_activate_mq_uv);
> +
> +	for_each_cpu(cpu, cpumask_of_node(nid)) {
> +		xpc_notify_mq_uv = xpc_create_gru_mq_uv(XPC_NOTIFY_MQ_SIZE_UV, nid,
> +				XPC_NOTIFY_IRQ_NAME,
> +				xpc_handle_notify_IRQ_uv);
> +		if (!IS_ERR(xpc_notify_mq_uv))
> +			break;
> +	}
> +	if (IS_ERR(xpc_notify_mq_uv)) {
> +		xpc_destroy_gru_mq_uv(xpc_activate_mq_uv);
> +		return PTR_ERR(xpc_notify_mq_uv);
> +	}
> +
> +	return 0;
> +}

This seems to take the optimistic approach to CPU hotplug ;)
get_online_cpus(), perhaps?
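
To illustrate the shape I have in mind, here is a rough userspace sketch, not the actual fix. It only shows the per-cpu search bracketed by get_online_cpus()/put_online_cpus() so the set of online CPUs cannot change mid-walk. Everything prefixed mock_ or MOCK_, plus xpc_init_mq_node_sketch() itself, is hypothetical and made up for the example; the two hotplug functions are stubbed as simple counters so the thing compiles outside the kernel.

```c
#include <assert.h>
#include <errno.h>

/*
 * Userspace stand-ins (assumptions) for the kernel's get_online_cpus() and
 * put_online_cpus(). They only count how many readers hold the hotplug lock.
 */
static int hotplug_readers;
static void get_online_cpus(void) { hotplug_readers++; }
static void put_online_cpus(void) { hotplug_readers--; }

#define MOCK_NR_CPUS 4
/* Hypothetical per-cpu state: 1 if the cpu still has a free IRQ resource. */
static int mock_cpu_has_irq[MOCK_NR_CPUS] = { 0, 0, 1, 1 };

/* Hypothetical stand-in for the message queue creation call. */
static int mock_create_mq(int cpu)
{
	/* The caller must have pinned the online cpu set before probing. */
	assert(hotplug_readers > 0);
	return mock_cpu_has_irq[cpu] ? 0 : -EBUSY;
}

/* Walk the cpus with hotplug held off and stop at the first success. */
static int xpc_init_mq_node_sketch(void)
{
	int cpu, ret = -ENODEV;

	get_online_cpus();
	for (cpu = 0; cpu < MOCK_NR_CPUS; cpu++) {
		ret = mock_create_mq(cpu);
		if (!ret)
			break;
	}
	put_online_cpus();

	return ret;
}
```

In the real function the same bracket would presumably go around both for_each_cpu() loops, with put_online_cpus() called on every exit path, including the error returns.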
