Re: [GIT PULL] device-dax for 5.1: PMEM as RAM

From: Dan Williams
Date: Mon Mar 11 2019 - 20:30:15 EST


On Mon, Mar 11, 2019 at 5:08 PM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Mon, Mar 11, 2019 at 8:37 AM Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
> >
> > Another feature the userspace tooling can support for the PMEM as RAM
> > case is the ability to complete an Address Range Scrub of the range
> > before it is added to the core-mm. I.e at least ensure that previously
> > encountered poison is eliminated.
>
> Ok, so this at least makes sense as an argument to me.
>
> In the "PMEM as filesystem" part, the errors have long-term history,
> while in "PMEM as RAM" the memory may be physically the same thing,
> but it doesn't have the history and as such may not be prone to
> long-term errors the same way.
>
> So that validly argues that yes, when used as RAM, the likelihood for
> errors is much lower because they don't accumulate the same way.
>
> > The driver can also publish an
> > attribute to indicate when rep; mov is recoverable, and gate the
> > hotplug policy on the result. In my opinion a positive indicator of
> > the cpu's ability to recover rep; mov exceptions is a gap that needs
> > addressing.
>
> Is there some way to say "don't raise MC for this region"? Or at least
> limit it to a nonfatal one?

I wish, but no. Poison consumption always raises a machine check; the
only variable is whether MCI_STATUS_PCC (processor context corrupt) is
set, which is how the cpu indicates whether it is safe to proceed.
There's no way to say "never set MCI_STATUS_PCC", or to silence the
exception.
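
To make the PCC point concrete, here is a minimal sketch of the decision
as described above, not the kernel's actual severity grading. The bit
positions follow the architectural IA32_MCi_STATUS layout (VAL is bit 63,
PCC is bit 57); every sketch_* / SKETCH_* name is hypothetical and exists
only for illustration. The real grading looks at more state than this
(other status bits and MCG_STATUS), so treat it as a simplification.

```c
#include <stdint.h>

/* Architectural IA32_MCi_STATUS bits; the SKETCH_ names are illustrative. */
#define SKETCH_MCI_STATUS_VAL (1ULL << 63) /* register holds a valid error */
#define SKETCH_MCI_STATUS_PCC (1ULL << 57) /* processor context corrupt */

enum sketch_mc_verdict {
	SKETCH_MC_IGNORE,      /* nothing logged in this bank */
	SKETCH_MC_RECOVERABLE, /* cpu context intact, recovery may be tried */
	SKETCH_MC_FATAL,       /* cpu says its context is corrupt */
};

/*
 * Assumption: only VAL and PCC are considered. The exception has already
 * been raised by the time this runs; software only gets to read the
 * verdict the cpu recorded, it cannot ask for PCC to stay clear.
 */
static enum sketch_mc_verdict sketch_classify(uint64_t status)
{
	if (!(status & SKETCH_MCI_STATUS_VAL))
		return SKETCH_MC_IGNORE;
	if (status & SKETCH_MCI_STATUS_PCC)
		return SKETCH_MC_FATAL;
	return SKETCH_MC_RECOVERABLE;
}
```

A status word with VAL set and PCC clear classifies as recoverable; the
same word with PCC also set classifies as fatal, and software has no
input into which of the two the cpu reports.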