Re: [PATCH v1 06/47] mtrr: add __arch_phys_wc_add()

From: Luis R. Rodriguez
Date: Fri Mar 27 2015 - 19:33:49 EST


On Fri, Mar 27, 2015 at 04:10:03PM -0700, Andy Lutomirski wrote:
> On Fri, Mar 27, 2015 at 4:04 PM, Luis R. Rodriguez <mcgrof@xxxxxxxx> wrote:
> > On Fri, Mar 27, 2015 at 02:23:16PM -0700, Andy Lutomirski wrote:
> >> On Fri, Mar 27, 2015 at 1:30 PM, Luis R. Rodriguez <mcgrof@xxxxxxxx> wrote:
> >> > On Fri, Mar 27, 2015 at 12:58:02PM -0700, Andy Lutomirski wrote:
> >> >> On Fri, Mar 27, 2015 at 12:53 PM, Luis R. Rodriguez <mcgrof@xxxxxxxx> wrote:
> >> >> > On Fri, Mar 20, 2015 at 04:48:46PM -0700, Andy Lutomirski wrote:
> >> >> >> On Fri, Mar 20, 2015 at 4:17 PM, Luis R. Rodriguez
> >> >> >> <mcgrof@xxxxxxxxxxxxxxxx> wrote:
> >> >> >> > From: "Luis R. Rodriguez" <mcgrof@xxxxxxxx>
> >> >> >> >
> >> >> >> > Ideally on systems using PAT we can expect a swift
> >> >> >> > transition away from MTRR. There can be a few exceptions
> >> >> >> > to this, one is where device drivers are known to exist
> >> >> >> > on PATs with errata, another situation is observed on
> >> >> >> > old device drivers where devices had combined MMIO
> >> >> >> > register access with whatever area they typically
> >> >> >> > later wanted to end up using MTRR for on the same
> >> >> >> > PCI BAR. This situation can still be addressed by
> >> >> >> > splitting up ioremap'd PCI BAR into two ioremap'd
> >> >> >> > calls, one for MMIO registers, and another for whatever
> >> >> >> > is desirable for write-combining -- in order to
> >> >> >> > accomplish this though quite a bit of driver
> >> >> >> > restructuring is required.
> >> >> >> >
> >> >> >> > Device drivers which are known to require large
> >> >> >> > amount of re-work in order to split ioremap'd areas
> >> >> >> > can use __arch_phys_wc_add() to avoid regressions
> >> >> >> > when PAT is enabled.
> >> >> >> >
> >> >> >> > For a good example driver where things are neatly
> >> >> >> > split up on a PCI BAR refer the infiniband qib
> >> >> >> > driver. For a good example of a driver where good
> >> >> >> > amount of work is required refer to the infiniband
> >> >> >> > ipath driver.
> >> >> >> >
> >> >> >> > This is *only* a transitive API -- and as such no new
> >> >> >> > drivers are ever expected to use this.
> >> >> >>
> >> >> >> What's the exact layout that this helps? I'm sceptical that this can
> >> >> >> ever be correct.
> >> >> >>
> >> >> >> Is there some awful driver that has a large ioremap that's supposed to
> >> >> >> contain multiple different memtypes?
> >> >> >
> >> >> > Yes, I cc'd you just now on one where I made changes on a driver which uses one
> >> >> > PCI with mixed memtypes and uses MTRR to hole in WC. A transition to
> >> >> > arch_phys_wc_add() is therefore not possible if PAT is enabled as it would
> >> >> > regress those drivers by making the MTRR WC hole trick non functional.
> >> >> > The changes are non trivial and so in this series I supplied changes on
> >> >> > one driver only to show the effort required. The other drivers which
> >> >> > required this were:
> >> >> >
> >> >> > Driver File
> >> >> > ------------------------------------------------------------
> >> >> > fusion drivers/message/fusion/mptbase.c
> >> >> > ivtv drivers/media/pci/ivtv/ivtvfb.c
> >> >> > ipath drivers/infiniband/hw/ipath/ipath_driver.c
> >> >> >
> >> >> > This series makes those drivers use __arch_phys_wc_add() more as a
> >> >> > transitory phase in hopes we can address the proper split as with the
> >> >> > atyfb illustrates. For ipath the changes required have a nice template
> >> >> > with the qib driver as they share very similar driver structure, the
> >> >> > qib driver *did* do the nice split.
> >> >> >
> >> >> >> If so, can we ioremap + set_page_xyz instead?
> >> >> >
> >> >> > I'm not sure I see which call we'd use. Care to provide an example patch
> >> >> > alternative for the atyfb as a case in point alternative to the work required
> >> >> > to do the split?
> >> >> >
> >> >>
> >> >> I'm still confused. Would it be insufficient to ioremap_nocache the
> >> >> whole thing and then call set_memory_wc on parts of it? (Sorry,
> >> >> set_page_xyz was a typo.)
> >> >
> >> > I think that would be a sexy alternative.
> >> >
> >> > In this driver's case the thing is a bit messy as it not only used
> >> > the WC MTRR for a hole but it also then used a UC MTRR on top of
> >> > it all, so since I already tried to address the split, and if we address
> >> > the power of 2 woes, I think it'd be best to try to remove the UC MTRR
> >> > and just avoid set_page_wc() in this driver's case, but for the other cases
> >> > (fusion, ivtv, ipath) I think this makes sense.
> >> >
> >> > Thoughts?
> >>
> >> Once that WC MTRR is in place, I think you really need UC and not UC-
> >> if you want to override it. Otherwise I agree with all of this.
> >
> > Do you mean that the UC MTRR work around that was in place might not
> > have really been effective? Not quite sure what you mean. I don't think
> > I follow.
>
> I mean that the UC MTRR that overrides the WC MTRR was probably fine
> (I hope smaller MTRRs override larger MTRRs). But we should just
> ditch UC MTRRs entirely,

Agreed, this series does that, and this patch addresses the last
UC MTRR ;)

> and setting UC in the page tables would work on all CPUs *if we supported
> that*. We'd need to add a couple trivial helpers to do that.

OK please check my latest reply and if you do not mind clarify what you mean
there as I am not sure if we're on the same page (no pun) here. I don't quite
follow this last statement.

Luis
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/