Re: [RFC 0/9] mce recovery for Sandy Bridge server

From: Borislav Petkov
Date: Tue May 24 2011 - 13:33:33 EST


On Tue, May 24, 2011 at 09:57:46AM -0700, Luck, Tony wrote:
> So can we talk about this part for a while before returning to the
> "how to report this" discussion?
>
> So here's the situation - we are in the NMI handler when we find from
> looking at the machine check bank registers that we have a recoverable
> error. We know the physical address, and we know the task (which might
> have been in user or kernel context). I can package that information
> into a perf/event ... but then how can I mark the current task as
> not-fit-for-execution?

Maybe something like

set_current_state(TASK_UNINTERRUPTIBLE);

finish the work that has to happen in NMI context

do the remaining work in process context, such as sending the appropriate
signals; finally:

set_task_state(tsk, TASK_RUNNING);

?
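
To make the ordering of those steps concrete, here is a minimal userspace
sketch. It is only a model, not kernel code: struct model_task, the MODEL_*
states and both functions are hypothetical stand-ins invented for illustration,
and it says nothing about whether changing the task state from NMI context is
actually safe, which is the open question.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel definitions. */
enum model_state { MODEL_RUNNING, MODEL_UNINTERRUPTIBLE };

struct model_task {
	enum model_state state;
	bool mce_pending;        /* set in "NMI" context, consumed later */
	unsigned long bad_paddr; /* physical address read from the MCA bank */
	int signals_sent;        /* counts the modelled signal deliveries */
};

/*
 * Models the NMI-context part: park the task so that it is not run again,
 * and record just enough state for the deferred work.
 */
static void model_nmi_handler(struct model_task *tsk, unsigned long paddr)
{
	tsk->state = MODEL_UNINTERRUPTIBLE; /* stands in for set_current_state() */
	tsk->bad_paddr = paddr;
	tsk->mce_pending = true;
}

/*
 * Models the process-context part: deliver the signal, then make the task
 * runnable again. Returns false if there was nothing pending.
 */
static bool model_process_context_work(struct model_task *tsk)
{
	if (!tsk->mce_pending)
		return false;

	tsk->signals_sent++;       /* stands in for sending the signal */
	tsk->mce_pending = false;
	tsk->state = MODEL_RUNNING; /* stands in for set_task_state() */
	return true;
}
```

The point of the split is that the NMI-side function only flips state and
stores the address, while everything that could sleep or take locks is left
to the process-context function.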

--
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Gemeinde Aschheim, Landkreis Muenchen
Registergericht Muenchen, HRB Nr. 43632