RE: [EXTERNAL] Re: [PATCH] x86/Hyper-V: Support for free page reporting

From: Sunil Muthuswamy
Date: Fri May 22 2020 - 12:40:23 EST


> > + if (hv_do_hypercall(HV_EXT_CALL_QUERY_CAPABILITIES, NULL, cap) ==
> > + HV_STATUS_SUCCESS)
>
> You're using the input page as the output parameter. Ideally we should
> introduce hyperv_pcpu_output_arg page, but that would waste one page per
> cpu just for this one call.
>
> So for now I think this setup is fine, but I would like to add the
> following comment.
>
> /*
> * Repurpose the input_arg page to accept output from Hyper-V for
> * now because this is the only call that needs output from the
> * hypervisor. It should be fixed properly by introducing an
> * output_arg page once we have more places that require output.
> */

Sounds good. Will add it in v2.

> > +#ifdef CONFIG_PAGE_REPORTING
> > +static int hv_free_page_report(struct page_reporting_dev_info *pr_dev_info,
> > + struct scatterlist *sgl, unsigned int nents)
> > +{
> > + unsigned long flags;
> > + struct hv_memory_hint *hint;
> > + int i;
> > + u64 status;
> > + struct scatterlist *sg;
> > +
> > + WARN_ON(nents > HV_MAX_GPA_PAGE_RANGES);
>
> Should we return -ENOSPC here?

This is more of an assert because PAGE_REPORTING_CAPACITY is set to 32 and has
already been checked that it is < HV_MAX_GPA_PAGE_RANGES in enable_page_reporting.

> > + hint->type = HV_EXT_MEMORY_HEAT_HINT_TYPE_COLD_DISCARD;
> > + hint->reserved = 0;
> > + for (i = 0, sg = sgl; sg; sg = sg_next(sg), i++) {
> > + int order;
> > + union hv_gpa_page_range *range;
> > +
>
> Unfortunately I can't find the semantics of this hypercall in TLFS 6, so
> I have a few questions here.

This structure is not specific to this hypercall.

>
> > + order = get_order(sg->length);
> > + range = &hint->ranges[i];
> > + range->address_space = 0;
>
> I guess this means all address spaces?

'address_space' is being used here just as a zero initializer for the union. I think the use
of 'address_space' in the definition of hv_gpa_page_range is not appropriate. This struct is
defined in the TLFS 6.0 with the same name, if you want to look it up.

>
> > + range->page.largepage = 1;
>
> What effect does this have? What if the page is a 4k page?

Page reporting infrastructure doesn't report 4k pages today, but only huge pages (see
PAGE_REPORTING_MIN_ORDER in page_reporting.h). Additionally, the Hyper-V hypervisor
only supports reporting of 2M pages and above. The current code assumes that the minimum
order will be 9 i.e 2M pages and above.
If we feel that this could change in the future, or an implementation detail that should be
protected against, I can add some checks in hv_balloon.c. But, in that case, the ballon driver
should be able to query the page reporting min order somehow, which it is currently, since it is
private.
Alexander, do you have any suggestions or feedback here?

>
> > + range->page.additional_pages = (1ull << (order - 9)) - 1;
>
> What is 9 here? Is there a macro name *ORDER that you can use?

It is to determine the count of 2M pages. I will define a macro.

>
> Wei.