Re: [PATCH] Raise maximum number of memory controllers

From: Mauro Carvalho Chehab
Date: Wed Sep 26 2018 - 12:04:19 EST


Em Wed, 26 Sep 2018 17:27:52 +0200
Borislav Petkov <bp@xxxxxxxxx> escreveu:

> On Wed, Sep 26, 2018 at 11:35:11AM +0200, Borislav Petkov wrote:
> > * or Greg coming and saying, you're using bus_type all wrong and you
> > shouldn't and you should remove it completely! :-)
>
> Yap, and so he did! :-)
>
> It looks like we can remove the whole per-MC bus thing, see below.

I guess this is/was needed to create things like this:

lrwxrwxrwx 1 root root 0 set 26 05:24 /sys/bus/edac/devices/mc -> ../../../devices/system/edac/mc

(I don't test this logic for a while).

> Patch
> seems to work here and grepping sysfs hierarchy before and after:
>
> find /sys/ | grep -i edac > ...
>
> doesn't show any differences.
>
> Tony, I'd appreciate running this on one of your big boxes to check
> nothing is missing in sysfs...

There are a few EDAC drivers for old server chipsets that create
error report mechanisms for PCI bus controllers. I don't remember
the specifics anymore. I *suspect* those are the drivers that
do such usage:

$ git grep -l edac_pci_add_device
drivers/edac/amd8111_edac.c
drivers/edac/amd8131_edac.c
drivers/edac/edac_pci.c
drivers/edac/edac_pci.h
drivers/edac/mpc85xx_edac.c
drivers/edac/mv64x60_edac.c
drivers/edac/octeon_edac-pci.c

None of them use Intel chipsets.

Last time I tried to test it (more than 5 years ago), it was
hard to find such systems at the labs I had access on that time.

Anyway, as this is a more unusual usage of EDAC API, it could be
worth to double-check if this patch won't break it, maybe doing
some test on such machines before/after this patch in order to
verify that this won't cause regressions there.

In thesis, it shouldn't affect, as the changes are happening at
edac_mc*.c, but I won't doubt that it might have a hidden dependency
somewhere.

>
> Thx.
>
> ---
> drivers/edac/edac_mc.c | 9 +--------
> drivers/edac/edac_mc_sysfs.c | 30 ++----------------------------
> include/linux/edac.h | 6 ------
> 3 files changed, 3 insertions(+), 42 deletions(-)
>
> diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
> index 7d3edd713932..13594ffadcb3 100644
> --- a/drivers/edac/edac_mc.c
> +++ b/drivers/edac/edac_mc.c
> @@ -55,8 +55,6 @@ static LIST_HEAD(mc_devices);
> */
> static const char *edac_mc_owner;
>
> -static struct bus_type mc_bus[EDAC_MAX_MCS];
> -
> int edac_get_report_status(void)
> {
> return edac_report;
> @@ -716,11 +714,6 @@ int edac_mc_add_mc_with_groups(struct mem_ctl_info *mci,
> int ret = -EINVAL;
> edac_dbg(0, "\n");
>
> - if (mci->mc_idx >= EDAC_MAX_MCS) {
> - pr_warn_once("Too many memory controllers: %d\n", mci->mc_idx);
> - return -ENODEV;
> - }
> -
> #ifdef CONFIG_EDAC_DEBUG
> if (edac_debug_level >= 3)
> edac_mc_dump_mci(mci);
> @@ -760,7 +753,7 @@ int edac_mc_add_mc_with_groups(struct mem_ctl_info *mci,
> /* set load time so that error rate can be tracked */
> mci->start_time = jiffies;
>
> - mci->bus = &mc_bus[mci->mc_idx];
> + mci->bus = edac_get_sysfs_subsys();
>
> if (edac_create_sysfs_mci_device(mci, groups)) {
> edac_mc_printk(mci, KERN_WARNING,
> diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
> index 20374b8248f0..2ca2012f2857 100644
> --- a/drivers/edac/edac_mc_sysfs.c
> +++ b/drivers/edac/edac_mc_sysfs.c
> @@ -914,27 +914,8 @@ static const struct device_type mci_attr_type = {
> int edac_create_sysfs_mci_device(struct mem_ctl_info *mci,
> const struct attribute_group **groups)
> {
> - char *name;
> int i, err;
>
> - /*
> - * The memory controller needs its own bus, in order to avoid
> - * namespace conflicts at /sys/bus/edac.
> - */
> - name = kasprintf(GFP_KERNEL, "mc%d", mci->mc_idx);
> - if (!name)
> - return -ENOMEM;
> -
> - mci->bus->name = name;
> -
> - edac_dbg(0, "creating bus %s\n", mci->bus->name);
> -
> - err = bus_register(mci->bus);
> - if (err < 0) {
> - kfree(name);
> - return err;
> - }
> -
> /* get the /sys/devices/system/edac subsys reference */
> mci->dev.type = &mci_attr_type;
> device_initialize(&mci->dev);
> @@ -950,7 +931,7 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci,
> err = device_add(&mci->dev);
> if (err < 0) {
> edac_dbg(1, "failure: create device %s\n", dev_name(&mci->dev));
> - goto fail_unregister_bus;
> + goto out;
> }
>
> /*
> @@ -998,10 +979,8 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci,
> device_unregister(&dimm->dev);
> }
> device_unregister(&mci->dev);
> -fail_unregister_bus:
> - bus_unregister(mci->bus);
> - kfree(name);
>
> +out:
> return err;
> }
>
> @@ -1032,13 +1011,8 @@ void edac_remove_sysfs_mci_device(struct mem_ctl_info *mci)
>
> void edac_unregister_sysfs(struct mem_ctl_info *mci)
> {
> - struct bus_type *bus = mci->bus;
> - const char *name = mci->bus->name;
> -
> edac_dbg(1, "Unregistering device %s\n", dev_name(&mci->dev));
> device_unregister(&mci->dev);
> - bus_unregister(bus);
> - kfree(name);
> }
>
> static void mc_attr_release(struct device *dev)
> diff --git a/include/linux/edac.h b/include/linux/edac.h
> index a45ce1f84bfc..dd8d006f1837 100644
> --- a/include/linux/edac.h
> +++ b/include/linux/edac.h
> @@ -668,10 +668,4 @@ struct mem_ctl_info {
> bool fake_inject_ue;
> u16 fake_inject_count;
> };
> -
> -/*
> - * Maximum number of memory controllers in the coherent fabric.
> - */
> -#define EDAC_MAX_MCS 16
> -
> #endif



Thanks,
Mauro