RE: [PATCH V2] x86/Hyper-V: Balloon up according to request page number

From: Vitaly Kuznetsov
Date: Mon Jan 20 2020 - 05:21:46 EST


Michael Kelley <mikelley@xxxxxxxxxxxxx> writes:

> From: Tianyu Lan <Tianyu.Lan@xxxxxxxxxxxxx> Sent: Thursday, January 16, 2020 6:16 AM
>>
>> Current code has assumption that balloon request memory size aligns
>> with 2MB. But actually Hyper-V doesn't guarantee such alignment. When
>> balloon driver receives non-aligned balloon request, it produces warning
>> and balloon up more memory than requested in order to keep 2MB alignment.
>> Remove the warning and balloon up memory according to actual requested
>> memory size.
>>
>> Fixes: f6712238471a ("hv: hv_balloon: avoid memory leak on alloc_error of 2MB memory
>> block")
>> Cc: stable@xxxxxxxxxxxxxxx
>> Signed-off-by: Tianyu Lan <Tianyu.Lan@xxxxxxxxxxxxx>
>> ---
>> Change since v2:
>> - Change logic of switching alloc_unit from 2MB to 4KB
>> in the balloon_up() to avoid redundant iteration when
>> handle non-aligned page request.
>> - Remove 2MB alignment operation and comment in balloon_up()
>> ---
>> drivers/hv/hv_balloon.c | 12 ++++--------
>> 1 file changed, 4 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
>> index 7f3e7ab22d5d..536807efbc35 100644
>> --- a/drivers/hv/hv_balloon.c
>> +++ b/drivers/hv/hv_balloon.c
>> @@ -1684,7 +1684,7 @@ static unsigned int alloc_balloon_pages(struct
>> hv_dynmem_device *dm,
>> if (num_pages < alloc_unit)
>> return 0;
>
> The above test is no longer necessary. The num_pages < alloc_unit
> case is handled implicitly by your new 'for' loop condition.
>
>>
>> - for (i = 0; (i * alloc_unit) < num_pages; i++) {
>> + for (i = 0; i < num_pages / alloc_unit; i++) {
>> if (bl_resp->hdr.size + sizeof(union dm_mem_page_range) >
>> HV_HYP_PAGE_SIZE)
>> return i * alloc_unit;
>> @@ -1722,7 +1722,7 @@ static unsigned int alloc_balloon_pages(struct
>> hv_dynmem_device *dm,
>>
>> }
>>
>> - return num_pages;
>> + return i * alloc_unit;
>> }
>>
>> static void balloon_up(union dm_msg_info *msg_info)
>> @@ -1737,9 +1737,6 @@ static void balloon_up(union dm_msg_info *msg_info)
>> long avail_pages;
>> unsigned long floor;
>>
>> - /* The host balloons pages in 2M granularity. */
>> - WARN_ON_ONCE(num_pages % PAGES_IN_2M != 0);
>> -
>> /*
>> * We will attempt 2M allocations. However, if we fail to
>> * allocate 2M chunks, we will go back to PAGE_SIZE allocations.
>> @@ -1749,14 +1746,13 @@ static void balloon_up(union dm_msg_info *msg_info)
>> avail_pages = si_mem_available();
>> floor = compute_balloon_floor();
>>
>> - /* Refuse to balloon below the floor, keep the 2M granularity. */
>> + /* Refuse to balloon below the floor. */
>> if (avail_pages < num_pages || avail_pages - num_pages < floor) {
>> pr_warn("Balloon request will be partially fulfilled. %s\n",
>> avail_pages < num_pages ? "Not enough memory." :
>> "Balloon floor reached.");
>>
>> num_pages = avail_pages > floor ? (avail_pages - floor) : 0;
>> - num_pages -= num_pages % PAGES_IN_2M;
>> }
>>
>> while (!done) {
>> @@ -1770,7 +1766,7 @@ static void balloon_up(union dm_msg_info *msg_info)
>> num_ballooned = alloc_balloon_pages(&dm_device, num_pages,
>> bl_resp, alloc_unit);
>>
>> - if (alloc_unit != 1 && num_ballooned == 0) {
>> + if (alloc_unit != 1 && num_ballooned != num_pages) {
>
> Maybe I'm missing something, but I don't think Vitaly's optimization works. If
> alloc_unit specifies 2 Mbytes, and num_pages specifies 3 Mbytes, then num_ballooned
> will come back as 2 Mbytes, which is correct. But if we revert alloc_unit to 1 page and
> "continue" in that case, we will lose the 2 Mbytes of memory (it's not freed), and the
> next time through the loop will try to allocate only 1 Mbyte (because num_pages
> will be decremented by num_ballooned). I think the original code does the right thing.
>

True, nice catch! What I meant to say is that we can avoid sending two
replies to the host and get away with just one. Unfortunately,
"num_ballooned != num_pages" modification is not sufficient as
alloc_balloon_pages() will overwrite bl_resp->range_array[] (as it
always starts from 0). I think we can still achieve the goal by
allocating bl_resp outside of the loop and filling it from
range_array[range_count] instead of range_array[i] but it seems to big
of a change for now particular gain. Let's just drop the idea.


> Michael
>
>> alloc_unit = 1;
>> continue;
>> }
>> --
>> 2.14.5
>

--
Vitaly