Re: [Xen-devel] [PATCH] xen: point xen_start_info to a dummy structfor PV on HVM guests

From: Stefano Stabellini
Date: Wed Oct 03 2012 - 12:21:23 EST


On Wed, 3 Oct 2012, Ian Campbell wrote:
> On Wed, 2012-10-03 at 15:11 +0100, Konrad Rzeszutek Wilk wrote:
> > On Wed, Oct 03, 2012 at 02:54:42PM +0100, Ian Campbell wrote:
> > > On Wed, 2012-10-03 at 14:51 +0100, Stefano Stabellini wrote:
> > > > On Wed, 3 Oct 2012, Ian Campbell wrote:
> > > > > On Wed, 2012-10-03 at 14:37 +0100, Stefano Stabellini wrote:
> > > > > > PV on HVM guests don't have a start_info page mapped by Xen, so
> > > > > > xen_start_info is just NULL for them.
> > > > > > That is problem because other parts of the code expect xen_start_info to
> > > > > > point to something valid, for example xen_initial_domain() is defined as
> > > > > > follow:
> > > > > >
> > > > > > #define xen_initial_domain() (xen_domain() && \
> > > > > > xen_start_info->flags & SIF_INITDOMAIN)
> > > > >
> > > > > But anyone who calls this before xen_start_info is setup is going to get
> > > > > a bogus result, specifically in this case they will think they are domU
> > > > > when in reality they are dom0 -- wouldn't it be better to fix those
> > > > > callsites?
> > > >
> > > > That cannot be the case because setting up xen_start_info is the very
> > > > first thing that is done, before even calling to C.
> > >
> > > On PV, yes, but you are trying to fix PVHVM here, no?
> > >
> > > Otherwise if this is always set before calling into C then what is the
> > > purpose of this patch?
> >
> > to fix this - as PVHVM has it set to NULL and we end up de-referencing
> > the xen_start_info and crashing. As so::
> >
>
> Right, so returning to my original point: The caller here is calling
> xen_initial_domain() *before* start info is setup. This is bogus and is
> your actual bug, all this patch does is hide that real issue.

That is because xen_start_info wasn't setup at all for PV on HVM guests.

The real reason is that PV on HVM guests don't have one, but that is
another matter. Until we get rid of all the references to xen_start_info
outside of PV specific code, we should just assume that there is one,
and that is already setup.

One day not too far from now, we might refactor the code to never
reference xen_start_info directly, but I don't think that now is the
time for that. Also consider that this is the same thing we do on ARM.


> With this "fix" the caller of xen_initial_domain shown in this trace now
> gets a rubbish result based on the content of a dummy shared info
> instead of the real answer from that actual shared info.

That is not true. The caller gets a zero result, that is completely
appropriate in this case, given that a PV on HVM guest doesn't have a
start_info.


> The right fix is to fix the caller to not call xen_initial_domain()
> until after the shared info has been setup. Maybe that means moving
> shinfo setup earlier, or maybe it means deferring this call until later
> in the PVHVM case.

I don't think so, we should be able to call xen_initial_domain() at any
point in the code.

The best course of action is taking this fix now (making PVHVM x86
guests behave the same way as ARM guests) and refactor all the callers to
xen_start_info later.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/