Re: [PATCH] EDAC: core EDAC support code

From: Dave Peterson
Date: Fri Mar 10 2006 - 16:11:41 EST


On Friday 10 March 2006 11:33, Arjan van de Ven wrote:
> > Regarding the actual call to run it, I guess it depends on which of
> > the following you prefer:
> >
> > Scenario A
> > ----------
> > A more decentralized layout. Here, the controls that govern the
> > error handling behavior for a given category of hardware (a
> > category might be "PCI devices" or "devices that use bus
> > technology XYZ") are grouped together with other stuff for that
> > category.
>
> this would basically make edac a place to report "help something went to
> the gutter". Sure. I see that useful.
>
> In fact there are 3 layers then
>
> 1) low level "do check" function

I'm a bit unclear here. Are you thinking of a single "do check"
function that bus-specific code and chipset-specific code can hook
into, or individual "do check" functions for various bus-specific and
chipset-specific modules?

> 2) per bus code that calls the do check functions and whatever is needed
> for bus checks
>
> 3) "EDAC" central command, which basically gathers all failure reports
> and does something with them (push them to userspace or implement the
> userspace chosen policy (panic/reboot/etc))

Are you suggesting something like the following?

- The controls that determine how the error checking is done are
located within the various hardware subsystems. For instance,
with PCI parity checking, this would include stuff like setting
the polling frequency and determining which devices to check.

- When an error is actually detected, the subsystem that detected
the error (for instance, PCI) would feed the error information
to EDAC. Then EDAC would determine how to respond to the error
(for instance, push it to userspace or implement the
userspace-chosen policy (panic/reboot/etc))
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/