Re: [PATCH 0/1] Fixup write permission of TLB on powerpc e500 core

From: Shan Hai
Date: Sat Jul 16 2011 - 11:03:49 EST

On 07/15/2011 08:20 PM, Benjamin Herrenschmidt wrote:

On Fri, 2011-07-15 at 11:32 -0400, Shan Hai wrote:
I agree with you. The problem can be triggered by accessing
any user space page that is read-only to the kernel from a
page fault disabled context. It also affects architectures
that depend on SW dirty/young tracking, as stated by Benjamin
in this thread.

In the e500 case, commit 6cfd8990e27d3a491c1c605d6cbc18a46ae51fef
removed the write permission fixup from the TLB miss handlers and left it
to generic code, so it might be the right time to fix up the write
permission here in the generic code.

But we can't. We must not modify the PTE from an interrupt context, and
the "atomic" variants of user accesses can be called in such contexts.

I think the problem is that we try to do more than just "peek" at user
memory (for backtraces etc...) in page fault disabled contexts: we try to
do actually useful things there. That's bad, and various archs' mm were
designed with the assumption that this never happens.

Yes, understood. The *here* above means 'generic code' such as the futex
code; I am sorry for my ambiguous description.

If the futex case is rare here, we could probably find a way to work
around it in that specific case.

That's what my patch wants to do.

However, I -still- don't understand why gup didn't fix up the write
permission. gup doesn't set dirty?

Yep, gup doesn't set dirty. When the page fault occurs because
the kernel accesses a user page that is read-only to the
kernel, the following conditions hold:
- the page is present, because it's a shared page
- the page is writable, because demand paging
sets up the pte for the current process that way

The follow_page() called in __get_user_pages()
returns non-NULL to its caller for the above mentioned
present and writable page, so gup(.write=1) has no
chance to set the pte dirty by calling handle_mm_fault().
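
To make the sequence above concrete, here is a toy user-space model in C.
It is not kernel code: every toy_* name and flag value is made up for
illustration, and the helpers only mimic the behaviour described in this
mail (follow_page() looks at present/writable but not dirty, and
handle_mm_fault() is only reached when follow_page() fails). It assumes a
software dirty tracking scheme where a store needs the dirty bit set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy PTE flags; illustrative values, not the kernel's definitions. */
enum {
    TOY_PRESENT = 1 << 0,
    TOY_RW      = 1 << 1,   /* mapping is writable for the process */
    TOY_DIRTY   = 1 << 2,   /* software-tracked dirty bit */
};

struct toy_pte {
    unsigned flags;
};

/* With SW dirty tracking, a store only goes through when the page is
 * present, writable and already marked dirty. */
static bool toy_hw_write_ok(const struct toy_pte *pte)
{
    unsigned need = TOY_PRESENT | TOY_RW | TOY_DIRTY;

    return (pte->flags & need) == need;
}

/* Stand-in for follow_page(): checks present and, for a write lookup,
 * writable. It never looks at the dirty bit. */
static const struct toy_pte *toy_follow_page(const struct toy_pte *pte,
                                             bool write)
{
    if (!(pte->flags & TOY_PRESENT))
        return NULL;
    if (write && !(pte->flags & TOY_RW))
        return NULL;
    return pte;
}

/* Stand-in for handle_mm_fault(): the only place in this model that
 * sets the dirty bit. */
static void toy_handle_mm_fault(struct toy_pte *pte, bool write)
{
    pte->flags |= TOY_PRESENT;
    if (write)
        pte->flags |= TOY_RW | TOY_DIRTY;
}

/* Stand-in for gup(.write=1). Returns true if the fault path ran. */
static bool toy_gup_write(struct toy_pte *pte)
{
    assert(pte != NULL);
    if (toy_follow_page(pte, true) != NULL)
        return false;           /* page found, fault path skipped */
    toy_handle_mm_fault(pte, true);
    return true;
}

/* Futex-like retry: try the store, on failure call gup and retry.
 * Returns the attempt number that succeeded, or -1 if none did. */
static int toy_store_with_fixup(struct toy_pte *pte, int max_tries)
{
    for (int i = 1; i <= max_tries; i++) {
        if (toy_hw_write_ok(pte))
            return i;
        toy_gup_write(pte);
    }
    return -1;
}
```

In this model a pte that starts as TOY_PRESENT | TOY_RW without TOY_DIRTY
never becomes writable through toy_gup_write(), so the retry loop gives
up, while a pte that is not present at all gets fixed on the first retry.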

Thanks
Shan Hai

Cheers,
Ben.
