Re: [PATCHv7 3/3] vhost_net: a kernel-level virtio server
From: Michael S. Tsirkin
Date: Wed Nov 04 2009 - 11:44:43 EST
On Wed, Nov 04, 2009 at 03:41:47PM +0200, Michael S. Tsirkin wrote:
> On Wed, Nov 04, 2009 at 02:37:28PM +0100, Andi Kleen wrote:
> > On Wed, Nov 04, 2009 at 03:17:36PM +0200, Michael S. Tsirkin wrote:
> > > On Wed, Nov 04, 2009 at 02:15:33PM +0100, Andi Kleen wrote:
> > > > On Wed, Nov 04, 2009 at 03:08:28PM +0200, Michael S. Tsirkin wrote:
> > > > > On Wed, Nov 04, 2009 at 01:59:57PM +0100, Andi Kleen wrote:
> > > > > > > Fine?
> > > > > >
> > > > > > I cannot say -- are there paths that could drop the device beforehand?
> > > > >
> > > > > Do you mean drop the mm reference?
> > > >
> > > > No the reference to the device, which owns the mm for you.
> > >
> > > The device is created when file is open and destroyed
> > > when file is closed. So I think the fs code handles the
> > > reference counting for me: it won't call file cleanup
> > > callback while some userspace process has the file open.
> > > Right?
> > Yes.
> > But the semantics when someone inherits such a fd through exec
> > or through file descriptor passing would be surely "interesting"
> > You would still do IO on the old VM.
> > I guess it would be a good way to confuse memory accounting schemes
> > or administrators @)
> > It would be all saner if this was all a single atomic step.
> > -Andi
> I have this atomic actually. A child process will first thing
> do SET_OWNER: this is required before any other operation.
> SET_OWNER atomically (under mutex) does two things:
> - check that there is no other owner
> - get mm and set current process as owner
> I hope this addresses your concern?
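For reference, here is a rough userspace model of the SET_OWNER step quoted above. It is only a sketch, not the vhost code: struct mm_model, struct dev_model and the dev_* functions are hypothetical names, a pthread mutex stands in for the kernel mutex, and a plain counter stands in for the reference taken by get_task_mm and dropped by mmput.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's mm structure. */
struct mm_model {
    int users;                 /* models the mm user reference count */
};

/* Hypothetical per-file device state. */
struct dev_model {
    pthread_mutex_t mutex;     /* serializes owner changes */
    struct mm_model *mm;       /* NULL until an owner has been set */
};

/* Models file open: the device starts with no owner. */
void dev_open(struct dev_model *d)
{
    pthread_mutex_init(&d->mutex, NULL);
    d->mm = NULL;
}

/*
 * Models SET_OWNER. Under the mutex, check that there is no owner yet,
 * then take a reference on the caller's mm and record it.
 * Returns 0 on success, -1 if the device already has an owner.
 */
int dev_set_owner(struct dev_model *d, struct mm_model *caller_mm)
{
    int ret = -1;

    pthread_mutex_lock(&d->mutex);
    if (d->mm == NULL) {
        caller_mm->users++;    /* models get_task_mm() */
        d->mm = caller_mm;
        ret = 0;
    }
    pthread_mutex_unlock(&d->mutex);
    return ret;
}

/* Models file release: drop the mm reference taken by SET_OWNER. */
void dev_release(struct dev_model *d)
{
    pthread_mutex_lock(&d->mutex);
    if (d->mm != NULL) {
        d->mm->users--;        /* models mmput() */
        d->mm = NULL;
    }
    pthread_mutex_unlock(&d->mutex);
    pthread_mutex_destroy(&d->mutex);
}
```

In this model, a second process that inherits the descriptor and calls dev_set_owner gets -1, and the first owner's reference is held until dev_release runs, even if that owner has exited in the meantime.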
Andrea, since you looked at this design at the early stages,
maybe you can provide feedback on the following question:
vhost has an ioctl that calls get_task_mm and stores the mm in the per-file
device structure. mmput is called when the file is closed. vhost is careful
not to reference the mm after it has been put. There is also an atomic
mutual exclusion mechanism to ensure that vhost does not allow one
process to access another's mm, even if they share a vhost file
descriptor. But this still means that the mm structure can outlive the
task if the file descriptor is shared with another process.
Other drivers, such as kvm, have the same property.
Do you think this is OK?