Re: [PATCH v2] VMware Balloon driver

From: Dmitry Torokhov
Date: Wed Apr 21 2010 - 20:00:55 EST


On Wed, Apr 21, 2010 at 04:54:49PM -0700, Andrew Morton wrote:
> On Thu, 15 Apr 2010 14:00:31 -0700
> Dmitry Torokhov <dtor@xxxxxxxxxx> wrote:
>
> > This is standalone version of VMware Balloon driver. Ballooning is a
> > technique that allows hypervisor dynamically limit the amount of memory
> > available to the guest (with guest cooperation). In the overcommit
> > scenario, when hypervisor set detects that it needs to shuffle some memory,
> > it instructs the driver to allocate certain number of pages, and the
> > underlying memory gets returned to the hypervisor. Later hypervisor may
> > return memory to the guest by reattaching memory to the pageframes and
> > instructing the driver to "deflate" balloon.
> >
> > Signed-off-by: Dmitry Torokhov <dtor@xxxxxxxxxx>
> > ---
> >
> > Unlike previous version, that tried to integrate VMware ballooning transport
> > into virtio subsystem, and use stock virtio_ballon driver, this one implements
> > both controlling thread/algorithm and hypervisor transport.
> >
> > We are submitting standalone driver because KVM maintainer (Avi Kivity)
> > expressed opinion (rightly) that our transport does not fit well into
> > virtqueue paradigm and thus it does not make much sense to integrate
> > with virtio.
> >
> > There were also some concerns whether current ballooning technique is
> > the right thing. If there appears a better framework to achieve this we
> > are prepared to evaluate and switch to using it, but in the meantime
> > we'd like to get this driver upstream.
> >
> > Changes since v1:
> > - added comments throughout the code;
> > - exported stats moved from /proc to debugfs;
> > - better changelog.
> >
> >
> > ...
> >
> > +#define VMW_BALLOON_NOSLEEP_ALLOC_MAX 16384U
> > +
> > +#define VMW_BALLOON_RATE_ALLOC_MIN 512U
> > +#define VMW_BALLOON_RATE_ALLOC_MAX 2048U
> > +#define VMW_BALLOON_RATE_ALLOC_INC 16U
> > +
> > +#define VMW_BALLOON_RATE_FREE_MIN 512U
> > +#define VMW_BALLOON_RATE_FREE_MAX 16384U
> > +#define VMW_BALLOON_RATE_FREE_INC 16U
>
> hum. What do these do and what units are they in? Needs a comment?

These control inflating/deflating rate of the ballon, mesured in
pages/sec. I will add a comment.

>
> >
> > ...
> >
> > +#define VMWARE_BALLOON_CMD(cmd, data, result) \
> > +({ \
> > + unsigned long __stat, __dummy1, __dummy2; \
> > + __asm__ __volatile__ ("inl (%%dx)" : \
> > + "=a"(__stat), \
> > + "=c"(__dummy1), \
> > + "=d"(__dummy2), \
> > + "=b"(result) : \
> > + "0"(VMW_BALLOON_HV_MAGIC), \
> > + "1"(VMW_BALLOON_CMD_##cmd), \
> > + "2"(VMW_BALLOON_HV_PORT), \
> > + "3"(data) : \
> > + "memory"); \
> > + result &= -1UL; \
> > + __stat & -1UL; \
> > +})
>
> This is OK for both x86_32 and x86_64?

Yes it is.

>
> Was it actually intended that this driver be enabled for 32-bit?
>

Yes.

> > +#define STATS_INC(stat) (stat)++
> > +
> > +struct vmballoon_stats {
> > + unsigned int timer;
> > +
> > + /* allocation statustics */
> > + unsigned int alloc;
> > + unsigned int alloc_fail;
> > + unsigned int sleep_alloc;
> > + unsigned int sleep_alloc_fail;
> > + unsigned int refused_alloc;
> > + unsigned int refused_free;
> > + unsigned int free;
> > +
> > + /* monitor operations */
> > + unsigned int lock;
> > + unsigned int lock_fail;
> > + unsigned int unlock;
> > + unsigned int unlock_fail;
> > + unsigned int target;
> > + unsigned int target_fail;
> > + unsigned int start;
> > + unsigned int start_fail;
> > + unsigned int guest_type;
> > + unsigned int guest_type_fail;
> > +};
> > +
> > +struct vmballoon {
> > +
> > + /* list of reserved physical pages */
> > + struct list_head pages;
> > +
> > + /* transient list of non-balloonable pages */
> > + struct list_head refused_pages;
> > +
> > + /* balloon size in pages */
> > + unsigned int size;
> > + unsigned int target;
> > +
> > + /* reset flag */
> > + bool reset_required;
> > +
> > + /* adjustment rates (pages per second) */
> > + unsigned int rate_alloc;
> > + unsigned int rate_free;
> > +
> > + /* slowdown page allocations for next few cycles */
> > + unsigned int slow_allocation_cycles;
> > +
> > + /* statistics */
> > + struct vmballoon_stats stats;
> > +
> > + /* debugfs file exporting statistics */
> > + struct dentry *dbg_entry;
> > +
> > + struct sysinfo sysinfo;
> > +
> > + struct delayed_work dwork;
> > +};
>
> afaict all the stats stuff is useless if CONFIG_DEBUG_FS=n. Perhaps in
> that case the vmballoon.stats field should be omitted and STATS_INC
> be made a no-op?
>

OK, will do.

Thanks Andrew.

--
Dmitry
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/