RE: [PATCH 2/4] Drivers: hv: balloon: account for gaps in hot add regions

From: Alex Ng (LIS)
Date: Sat Aug 06 2016 - 20:19:11 EST

> -----Original Message-----
> From: Vitaly Kuznetsov [mailto:vkuznets@xxxxxxxxxx]
> Sent: Friday, August 5, 2016 3:49 AM
> To: devel@xxxxxxxxxxxxxxxxxxxxxx
> Cc: linux-kernel@xxxxxxxxxxxxxxx; Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>;
> KY Srinivasan <kys@xxxxxxxxxxxxx>; Alex Ng (LIS) <alexng@xxxxxxxxxxxxx>
> Subject: [PATCH 2/4] Drivers: hv: balloon: account for gaps in hot add regions
>
> I'm observing the following hot add requests from the WS2012 host:
>
> hot_add_req: start_pfn = 0x108200 count = 330752
> hot_add_req: start_pfn = 0x158e00 count = 193536
> hot_add_req: start_pfn = 0x188400 count = 239616
>
> As the host doesn't specify hot add regions we're trying to create 128Mb-
> aligned region covering the first request, we create the 0x108000 -
> 0x160000 region and we add 0x108000 - 0x158e00 memory. The second
> request passes the pfn_covered() check, we enlarge the region to 0x108000 -
> 0x190000 and add 0x158e00 - 0x188200 memory. The problem emerges with
> the third request as it starts at 0x188400 so there is a 0x200 gap which is not
> covered. As the end of our region is 0x190000 now it again passes the
> pfn_covered() check where we just adjust the covered_end_pfn and make it
> 0x188400 instead of 0x188200 which means that we'll try to online
> 0x188200-0x188400 pages but these pages were never assigned to us and we
> crash.

The fact that the host sent a request that's non-contiguous with the previous
request is unexpected. Could we check to see the number of pages we returned
in our response, after each request?

I'm wondering if we may have given a wrong response that caused the host to
follow up with a gapped request.
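
For reference, the arithmetic in the quoted description can be reproduced in
user space. This is only a sketch in Python, not driver code: find_gaps() and
REQUESTS are hypothetical names, and the values are copied from the
hot_add_req lines quoted above. It assumes each request covers start_pfn up
to, but not including, start_pfn + count.

```python
# User-space check of the PFN arithmetic; not part of the hv_balloon driver.
# (start_pfn, page_count) pairs copied from the quoted hot_add_req lines.
REQUESTS = [
    (0x108200, 330752),
    (0x158e00, 193536),
    (0x188400, 239616),
]


def find_gaps(requests):
    """Return (gap_start_pfn, gap_end_pfn) for each hole between
    consecutive requests, where gap_end_pfn is exclusive."""
    gaps = []
    prev_end = None
    for start_pfn, count in requests:
        # A gap exists when this request starts past the previous one's end.
        if prev_end is not None and start_pfn > prev_end:
            gaps.append((prev_end, start_pfn))
        prev_end = start_pfn + count
    return gaps


for gap_start, gap_end in find_gaps(REQUESTS):
    print(f"gap: {gap_start:#x} - {gap_end:#x} "
          f"({gap_end - gap_start:#x} pages)")
```

With the three requests above, the first two are contiguous and the only hole
reported is 0x188200 - 0x188400 (0x200 pages), which matches the gap described
in the patch.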