Re: [PATCH 0/9] Hugepage migration (v2)

From: Andi Kleen
Date: Wed Aug 18 2010 - 03:46:40 EST


On Wed, Aug 18, 2010 at 04:32:34PM +0900, Naoya Horiguchi wrote:
> On Tue, Aug 17, 2010 at 11:40:08AM +0200, Andi Kleen wrote:
> > > When get_user_pages_fast() is called before try_to_unmap(),
> > > the direct I/O code increments the refcount on the target page.
> > > Because this refcount is not associated with any mapping,
> > > the migration code will still find it after try_to_unmap()
> > > has removed all mappings. The refcount check then makes migration
> > > fail, so direct I/O continues safely.
> >
> > This would imply that direct IO can make migration fail arbitrarily.
> > Also not good. Should we add some retries, at least for the soft offline
> > case?
>
> Soft offline is kicked from userspace, so the retry logic can be implemented
> in userspace. However, currently we can't distinguish migration failure from

I don't think user space is the right place for retry logic.
It doesn't really have enough information to make a good decision about
when to retry.

Also, I would consider it bad design to require user space to work around
kernel problems like that.
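
To make the idea concrete, here is a rough user-space model of a bounded
retry around the refcount check, kept on the kernel side of the interface.
It is only a sketch under stated assumptions: struct toy_page and every
toy_* function are hypothetical names invented for illustration, not kernel
code and not part of the patch set. The "expected" count models one
reference held by the migration path plus one per mapping; anything above
that is treated as a transient pin such as direct I/O.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: only the two counters discussed. */
struct toy_page {
    int refcount;   /* all references, including transient pins */
    int mapcount;   /* references that come from mappings */
};

/* Hypothetical callback that lets the pin holder make progress. */
typedef void (*toy_wait_fn)(struct toy_page *p);

/*
 * One migration attempt. Models the refcount check: with all mappings
 * accounted for, any reference beyond the caller's own is an extra pin,
 * so the attempt fails with -EAGAIN.
 */
static int toy_migrate_once(const struct toy_page *p)
{
    int expected = 1 + p->mapcount;

    return p->refcount > expected ? -EAGAIN : 0;
}

/* Models a direct I/O completing and dropping its pin, if one is held. */
static void toy_io_completes(struct toy_page *p)
{
    if (p->refcount > 1 + p->mapcount)
        p->refcount--;
}

/*
 * Bounded retry: try up to max_tries times, waiting between attempts.
 * Returns 0 on success, or -EAGAIN if the page stayed pinned throughout.
 */
static int toy_soft_offline(struct toy_page *p, int max_tries,
                            toy_wait_fn wait)
{
    int ret = -EAGAIN;

    for (int i = 0; i < max_tries; i++) {
        ret = toy_migrate_once(p);
        if (ret != -EAGAIN)
            break;
        if (wait != NULL)
            wait(p);
    }
    return ret;
}
```

In this model a page with one mapping and one I/O pin fails the first
attempt and succeeds once the pin is dropped, while a page that stays
pinned still returns -EAGAIN after the last try, so the caller sees a
distinct "busy" result rather than a generic failure.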


-Andi
--
ak@xxxxxxxxxxxxxxx -- Speaking for myself only.