RE: [PATCH v2 0/2] Replace and improve "mcsafe" with copy_safe()

From: Luck, Tony
Date: Mon May 04 2020 - 16:05:16 EST


> When a copy function hits a bad page and the page is not yet known to
> be bad, what does it do? (I.e. the page was believed to be fine but
> the copy function gets #MC.) Does it unmap it right away? What does
> it return?

I suspect that we will only ever find a handful of situations where the
kernel can recover from memory that has gone bad and that are worth fixing
(it has to be a code path that touches a meaningful fraction of memory,
otherwise we get code complexity without any meaningful payoff).

I don't think we'd want different actions for the cases of "we just found out
now that this page is bad" and "we got a notification an hour ago that this
page had gone bad". Currently we treat those the same for application
errors ... SIGBUS either way[1].

-Tony

[1] Well, there are options both globally (the vm.memory_failure_early_kill
sysctl) and at the per-process level (prctl(PR_MCE_KILL)) to have the
"early" notifications delivered right away.