Re: [RFC PATCH 1/7] x86, mm: ZONE_DEVICE for "device memory"

From: Jerome Glisse
Date: Tue Aug 18 2015 - 15:06:46 EST


On Tue, Aug 18, 2015 at 10:23:38AM -0700, Dan Williams wrote:
> On Tue, Aug 18, 2015 at 9:55 AM, Jerome Glisse <j.glisse@xxxxxxxxx> wrote:
> > On Mon, Aug 17, 2015 at 05:46:43PM -0700, Dan Williams wrote:
> >> On Mon, Aug 17, 2015 at 2:45 PM, Jerome Glisse <j.glisse@xxxxxxxxx> wrote:
> >> > On Fri, Aug 14, 2015 at 07:11:27PM -0700, Dan Williams wrote:
> >> >> Although it does not offer perfect protection if device memory is at a
> >> >> physically lower address than RAM, skipping the update of these
> >> >> variables does seem to be what we want. For example /dev/mem would
> >> >> fail to allow write access to persistent memory if it fails a
> >> >> valid_phys_addr_range() check. Since /dev/mem does not know how to
> >> >> write to PMEM in a reliably persistent way, it should not treat a
> >> >> PMEM-pfn like RAM.
> >> >
> >> > So attached is a patch that should keep ZONE_DEVICE out of consideration
> >> > for the buddy allocator. You might also want to keep the pages reserved and
> >> > not free inside the zone; you could replace generic_online_page() using
> >> > set_online_page_callback() while hotplugging device memory.
> >> >
> >>
> >> Hmm, are we already protected by the fact that ZONE_DEVICE is not
> >> represented in the GFP_ZONEMASK?
> >
> > Yeah, it seems you are right: high_zoneidx (which is derived using gfp_zone())
> > will always limit which zones are considered. I thought that under memory
> > pressure it would go over all of the zonelist entries and eventually consider
> > the device zone. But it doesn't seem to be that way.
> >
> > Keeping the device zone out of the zonelist might still be a good idea, if
> > only to avoid pointless iteration for the page allocator. Unless someone can
> > think of a reason why this would be bad.
> >
>
> The other question I have is whether disabling ZONE_DMA is a realistic
> tradeoff for enabling ZONE_DEVICE? I.e. can ZONE_DMA default to off
> going forward, lose some ISA device support, or do we need to figure
> out how to enable > 4 zones.

That requires some auditing. From a quick look, it seems to matter for the
s390 arch, and there are still a few drivers that use it. I think we can forget
about the ISA bus; I would be surprised if you could still run a recent kernel
on a computer that has an ISA bus.
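
For reference, the GFP_ZONEMASK point from the quoted exchange can be modelled
in userspace. This is only a toy sketch, not kernel code: every toy_* name is
made up for illustration, and it assumes the device zone has the highest zone
index and that the zonelist is ordered from the highest zone down. It shows why
a walk bounded by high_zoneidx never hands out device pages, even when every
other zone is empty.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical zone indices. Assumption: the device zone sits above
 * every zone that a GFP mask is able to name. */
enum toy_zone { TOY_DMA, TOY_NORMAL, TOY_HIGHMEM, TOY_DEVICE, TOY_NR_ZONES };

struct toy_zoneref {
    enum toy_zone idx;          /* which zone this entry refers to */
    unsigned long free_pages;   /* pages currently free in that zone */
};

/*
 * Walk the zonelist in order and return the index of the first zone that
 * is at or below high_zoneidx and has free pages, or -1 if there is none.
 * Entries above high_zoneidx are skipped, never allocated from.
 */
static int toy_pick_zone(const struct toy_zoneref *zl, size_t n,
                         enum toy_zone high_zoneidx)
{
    /* Models "ZONE_DEVICE is not represented in the GFP_ZONEMASK". */
    assert(high_zoneidx < TOY_DEVICE);

    for (size_t i = 0; i < n; i++) {
        if (zl[i].idx > high_zoneidx)
            continue;           /* pointless iteration, but harmless */
        if (zl[i].free_pages > 0)
            return (int)zl[i].idx;
    }
    return -1;
}
```

With a zonelist whose first entry is TOY_DEVICE with plenty of free pages and
whose other entries are empty, toy_pick_zone() returns -1 instead of falling
back to the device zone. The skipped entry is the "pointless iteration" that
dropping the device zone from the zonelist would avoid.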

Though maybe you don't need a new ZONE_DEVICE and all you need is a valid
struct page for this device memory, and you don't want these pages to be
usable by the general memory allocator. There are surely other ways to
achieve that, like marking all of them as reserved when you hotplug them.
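
To make the idea concrete, here is a rough userspace model of swapping the
online-page callback during hotplug. It only mimics the shape of
generic_online_page() and set_online_page_callback(); the toy_* types and
functions are hypothetical stand-ins and do not match the real kernel
signatures or locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: just the two bits we care about. */
struct toy_page {
    bool reserved;       /* page must not be touched by the allocator */
    bool in_free_list;   /* page was handed to the toy allocator */
};

typedef void (*toy_online_page_cb)(struct toy_page *page);

/* Default behaviour: release the page to the toy allocator. */
static void toy_generic_online_page(struct toy_page *page)
{
    page->reserved = false;
    page->in_free_list = true;
}

/* Device-memory behaviour: keep the page reserved and off the free list. */
static void toy_device_online_page(struct toy_page *page)
{
    page->reserved = true;
    page->in_free_list = false;
}

static toy_online_page_cb toy_online_cb = toy_generic_online_page;

/* Install the callback used for pages onlined from now on. */
static void toy_set_online_page_callback(toy_online_page_cb cb)
{
    assert(cb != NULL);
    toy_online_cb = cb;
}

/* Online n pages and return how many became allocatable. */
static size_t toy_online_pages(struct toy_page *pages, size_t n)
{
    size_t allocatable = 0;

    for (size_t i = 0; i < n; i++) {
        toy_online_cb(&pages[i]);
        if (pages[i].in_free_list)
            allocatable++;
    }
    return allocatable;
}
```

Onlining a range with the default callback makes every page allocatable. After
installing toy_device_online_page, onlining a range yields zero allocatable
pages, and each page still has a valid descriptor that stays marked reserved.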

Cheers,
Jérôme