Re: [-next] memory hotplug regression

From: Heiko Carstens
Date: Fri May 26 2017 - 08:25:57 EST


On Wed, May 24, 2017 at 10:39:57AM +0200, Michal Hocko wrote:
> On Wed 24-05-17 10:20:22, Heiko Carstens wrote:
> > Having the ZONE_MOVABLE default was actually the only point why s390's
> > arch_add_memory() was rather complex compared to other architectures.
> >
> > We always had this behaviour, since we always wanted to be able to offline
> > memory after it was brought online. Given that back then "online_movable"
> > did not exist, the initial s390 memory hotplug support simply added all
> > additional memory to ZONE_MOVABLE.
> >
> > Keeping the default the same would be quite important.
>
> Hmm, that is really unfortunate because I would _really_ like to get rid
> of the previous semantic which was really awkward. The whole point of
> the rework is to get rid of the nasty zone shifting.
>
> Is it an option to use `online_movable' rather than `online' in your setup?
> Btw. my long term plan is to remove the zone range constraints altogether
> so you could online each memblock to the type you want. Would that be
> sufficient for you in general?

Why is it a problem to change the default for 'online'? As far as I can see
that doesn't have too much to do with the order of zones, no?

By the way: we played around a bit with the changes wrt memory
hotplug. There are two odd things:

1) With the new code I can generate overlapping zones for ZONE_DMA and
ZONE_NORMAL:

--- new code:

DMA [mem 0x0000000000000000-0x000000007fffffff]
Normal [mem 0x0000000080000000-0x000000017fffffff]

# cat /sys/devices/system/memory/block_size_bytes
10000000
# cat /sys/devices/system/memory/memory5/valid_zones
DMA
# echo 0 > /sys/devices/system/memory/memory5/online
# cat /sys/devices/system/memory/memory5/valid_zones
Normal
# echo 1 > /sys/devices/system/memory/memory5/online
Normal

# cat /proc/zoneinfo
Node 0, zone DMA
spanned 524288 <-----
present 458752
managed 455078
start_pfn: 0 <-----

Node 0, zone Normal
spanned 720896
present 589824
managed 571648
start_pfn: 327680 <-----

So ZONE_DMA ends within ZONE_NORMAL. This shouldn't be possible, unless
this restriction is gone?
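
The overlap can be checked directly from the numbers above. What follows is
only a small sketch, not kernel code: it assumes 4 KiB pages and reads
block_size_bytes as a hexadecimal value, and the helper names are made up
for illustration.

```python
PAGE_SIZE = 4096                   # assumption: 4 KiB pages
BLOCK_SIZE = int("10000000", 16)   # block_size_bytes is printed in hex (256 MiB)


def block_start_pfn(block_nr):
    """First page frame number of memory block 'block_nr' (hypothetical helper)."""
    return block_nr * BLOCK_SIZE // PAGE_SIZE


def zone_overlap(a_start, a_spanned, b_start, b_spanned):
    """Number of page frames covered by both zone spans."""
    lo = max(a_start, b_start)
    hi = min(a_start + a_spanned, b_start + b_spanned)
    return max(0, hi - lo)


# (start_pfn, spanned) as shown in /proc/zoneinfo above
dma = (0, 524288)
normal = (327680, 720896)

overlap = zone_overlap(*dma, *normal)
print(block_start_pfn(5))                   # 327680, same as Normal's start_pfn
print(overlap)                              # 196608 pages
print(overlap * PAGE_SIZE // 2**20, "MiB")  # 768 MiB
```

Under these assumptions ZONE_NORMAL now starts exactly at the first page of
memory5, while the span of ZONE_DMA still reaches up to pfn 524288, so the
two spans share 196608 pages.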

--- old code:

# echo 0 > /sys/devices/system/memory/memory5/online
# cat /sys/devices/system/memory/memory5/valid_zones
DMA
# echo online_movable > /sys/devices/system/memory/memory5/state
-bash: echo: write error: Invalid argument
# echo online_kernel > /sys/devices/system/memory/memory5/state
-bash: echo: write error: Invalid argument
# echo online > /sys/devices/system/memory/memory5/state
# cat /sys/devices/system/memory/memory5/valid_zones
DMA


2) Another oddity is that after a memory block was brought online its
association to ZONE_NORMAL or ZONE_MOVABLE seems to be fixed. Even if it
is brought offline afterwards:

# cat /sys/devices/system/memory/memory16/valid_zones
Normal Movable
# echo online_movable > /sys/devices/system/memory/memory16/state
# echo offline > /sys/devices/system/memory/memory16/state
# cat /sys/devices/system/memory/memory16/valid_zones
Movable <---- should be "Normal Movable"

I assume this happens because start_pfn and spanned pages of the zones
aren't updated if a memory block at the beginning or end of a zone is
brought offline.
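
To illustrate the assumption, here is a toy model. It is not the kernel
implementation: the Zone class, the valid_zones() rule (a block that
intersects a zone span may only go to that zone) and the pfn values are
simplifications chosen for the example, again assuming 4 KiB pages and
256 MiB blocks.

```python
from dataclasses import dataclass

PAGES_PER_BLOCK = 0x10000000 // 4096  # assumed: 256 MiB blocks, 4 KiB pages


@dataclass
class Zone:
    name: str
    start_pfn: int = 0
    spanned: int = 0

    @property
    def end_pfn(self):
        return self.start_pfn + self.spanned

    def intersects(self, start, nr):
        return self.spanned > 0 and start < self.end_pfn and self.start_pfn < start + nr

    def grow(self, start, nr):
        """Extend the span so that it covers [start, start + nr)."""
        if self.spanned == 0:
            self.start_pfn, self.spanned = start, nr
            return
        new_start = min(self.start_pfn, start)
        new_end = max(self.end_pfn, start + nr)
        self.start_pfn, self.spanned = new_start, new_end - new_start

    def shrink(self, start, nr):
        """Trim the span if the range sits at its beginning or end."""
        end = start + nr
        if start <= self.start_pfn and end >= self.end_pfn:
            self.start_pfn, self.spanned = 0, 0
        elif start <= self.start_pfn < end:
            self.start_pfn, self.spanned = end, self.end_pfn - end
        elif start < self.end_pfn <= end:
            self.spanned = start - self.start_pfn


def valid_zones(start, nr, normal, movable):
    """Simplified rule: a block inside a zone span belongs to that zone only."""
    if movable.intersects(start, nr):
        return ["Movable"]
    if normal.intersects(start, nr):
        return ["Normal"]
    return ["Normal", "Movable"]


def run(shrink_on_offline):
    normal = Zone("Normal", 327680, 720896)  # illustrative values
    movable = Zone("Movable")
    start = 16 * PAGES_PER_BLOCK             # memory16
    before = valid_zones(start, PAGES_PER_BLOCK, normal, movable)
    movable.grow(start, PAGES_PER_BLOCK)     # echo online_movable > state
    if shrink_on_offline:                    # echo offline > state
        movable.shrink(start, PAGES_PER_BLOCK)
    after = valid_zones(start, PAGES_PER_BLOCK, normal, movable)
    return before, after


print(run(shrink_on_offline=False))  # (['Normal', 'Movable'], ['Movable'])
print(run(shrink_on_offline=True))   # (['Normal', 'Movable'], ['Normal', 'Movable'])
```

In this model the block only gets "Normal Movable" back if the zone span is
trimmed when the block goes offline, which would match the behaviour seen
above.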