Re: [HMM 12/15] mm/migrate: new memory migration helper for use with device memory v4

From: Jerome Glisse
Date: Fri Jul 14 2017 - 20:56:07 EST


On Fri, Jul 14, 2017 at 12:43:51PM -0700, Evgeny Baskakov wrote:
> On 7/13/17 1:16 PM, Jerome Glisse wrote:
> Hi Jerome,
>
> I have hit another kind of hang. Briefly, if a not-yet-allocated page faults
> on the CPU during migration to device memory, any subsequent migration will
> fail for that page. This situation can occur if a CPU page fault happens
> immediately after migrate_vma() starts unmapping pages to migrate.
>
> Please find attached a reproducer based on the sample driver. In the
> hmm_test() function, an HMM_DMIRROR_MIGRATE request is triggered from a
> separate thread for not yet allocated pages (coming from malloc). At the
> same time, an HMM_DMIRROR_READ request is made for the same pages. This
> results in a sporadic app-side hang, because a random number of pages never
> migrate to device memory.
>
> Note that if the pages are touched (initialized with data) prior to that,
> everything works as expected: all HMM_DMIRROR_READ and HMM_DMIRROR_MIGRATE
> requests eventually succeed. See comments in the hmm_test() function.
>

So I pushed an updated hmm-next branch; this should fix all the issues you
had. Though I am not sure about the test in this mail: all I see is that it
continuously spits error messages, but it does not hang (I let it run for
20 min or so). I don't know if that is what is expected. Let me know if this
is still an issue and, if so, what the expected output of this test program
should be.

Cheers,
Jérôme