Re: [PATCH v10 3/6] fs/proc/task_mmu: Implement IOCTL to get and/or the clear info about PTEs

From: Muhammad Usama Anjum
Date: Thu Feb 23 2023 - 01:44:23 EST


On 2/22/23 4:48 PM, Michał Mirosław wrote:
> On Wed, 22 Feb 2023 at 12:06, Muhammad Usama Anjum
> <usama.anjum@xxxxxxxxxxxxx> wrote:
>>
>> On 2/22/23 3:44 PM, Michał Mirosław wrote:
>>> On Wed, 22 Feb 2023 at 11:11, Muhammad Usama Anjum
>>> <usama.anjum@xxxxxxxxxxxxx> wrote:
>>>> On 2/21/23 5:42 PM, Michał Mirosław wrote:
>>>>> On Tue, 21 Feb 2023 at 11:28, Muhammad Usama Anjum
>>>>> <usama.anjum@xxxxxxxxxxxxx> wrote:
>>>>>>
>>>>>> Hi Michał,
>>>>>>
>>>>>> Thank you so much for comment!
>>>>>>
>>>>>> On 2/17/23 8:18 PM, Michał Mirosław wrote:
>>>>> [...]
>>>>>>> For the page-selection mechanism, currently required_mask and
>>>>>>> excluded_mask have conflicting
>>>>>> They are opposite of each other:
>>>>>> All the set bits in required_mask must be set for the page to be selected.
>>>>>> All the set bits in excluded_mask must _not_ be set for the page to be
>>>>>> selected.
>>>>>>
>>>>>>> responsibilities. I suggest to rework that to:
>>>>>>> 1. negated_flags: page flags which are to be negated before applying
>>>>>>> the page selection using following masks;
>>>>>> Sorry I'm unable to understand the negation (which is XOR?). Lets look at
>>>>>> the truth table:
>>>>>> Page Flag negated_flags
>>>>>> 0 0 0
>>>>>> 0 1 1
>>>>>> 1 0 1
>>>>>> 1 1 0
>>>>>>
>>>>>> If a page flag is 0 and negated_flag is 1, the result would be 1 which has
>>>>>> changed the page flag. It isn't making sense to me. Why the page flag bit
>>>>>> is being fliped?
>>>>>>
>>>>>> When Anrdei had proposed these masks, they seemed like a fancy way of
>>>>>> filtering inside kernel and it was straight forward to understand. These
>>>>>> masks would help his use cases for CRIU. So I'd included it. Please can you
>>>>>> elaborate what is the purpose of negation?
>>>>>
>>>>> The XOR is a way to invert the tested value of a flag (from positive
>>>>> to negative and the other way) without having the API with invalid
>>>>> values (with required_flags and excluded_flags you need to define a
>>>>> rule about what happens if a flag is present in both of the masks -
>>>>> either prioritise one mask over the other or reject the call).
>>>> At minimum, one mask (required, any or excluded) must be specified. For a
>>>> page to get selected, the page flags must fulfill the criterion of all the
>>>> specified masks.
>>>
>>> [Please see the comment below.]
>>>
>>> [...]
>>>> Lets translate words into table:
>>> [Yes, those tables captured the intent correctly.]
>>>
>>>>> BTW, I think I assumed that both conditions (all flags in
>>>>> required_flags and at least one in anyof_flags is present) need to be
>>>>> true for the page to be selected - is this your intention?
>>>> All the masks are optional. If all or any of the 3 masks are specified, the
>>>> page flags must pass these masks to get selected.
>>>
>>> This explanation contradicts in part the introductory paragraph, but
>>> this version seems more useful as you can pass all masks zero to have
>>> all pages selected.
>> Sorry, I wrote it wrongly. (All the masks are not optional.) Let me
>> rephrase. All or at least any 1 of the 3 masks (required, any, exclude)
>> must be specified. The return_mask must always be specified. Error is
>> returned if all 3 masks (required, anyof, exclude) are zero or return_mask
>> is zero.
>
> Why do you need those restrictions? I'd guess it is valid to request a
> list of all pages with zero return_mask - this will return a compact
> list of used ranges of the virtual address space.
At the time, we are supporting 4 flags (PAGE_IS_WRITTEN, PAGE_IS_FILE,
PAGE_IS_PRESENT and PAGE_IS_SWAPPED). The idea is that user mention his
flags of interest in the return_mask. If he wants only 1 flag, he'll
specify it. Definitely if user wants only 1 flag, initially it doesn't make
any sense to mention in the return mask. But we want uniformity. If user
want, 2 or more flags in returned, return_mask becomes compulsory. So to
keep things simple and generic for any number of flags of interest
returned, the return_mask must be specified even if the flag of interest is
only 1.

>
>>>> After taking a while to understand this and compare with already present
>>>> flag system, `negated flags` is comparatively difficult to understand while
>>>> already present flags seem easier.
>>>
>>> Maybe replacing negated_flags in the API with matched_values =
>>> ~negated_flags would make this better?
>>>
>>> We compare having to understand XOR vs having to understand ordering
>>> of required_flags and excluded_flags.
>> There is no ordering in current masks scheme. No mask is preferable. For a
>> page to get selected, all the definitions of the masks must be fulfilled.
>> You have come up with good example that what if required_mask =
>> exclude_mask. In this case, no page will fulfill the criterion and hence no
>> page would be selected. It is user's fault that he isn't understanding the
>> definitions of these masks correctly.
>>
>> Now thinking about it, I can add a error check which would return error if
>> a bit in required and excluded masks matches. Would you like it? Lets put
>> this check in place.
>> (Previously I'd left it for user's wisdom not to do this. If he'll specify
>> same masks in them, he'll get no addresses out of the syscall.)
>
> This error case is (one of) the problems I propose avoiding. You also
> need much more text to describe the requred/excluded flags
> interactions and edge cases than saying that a flag must have a value
> equal to corresponding bit in ~negated_flags to be matched by
> requried/anyof masks.
I've found excluded_mask very intuitive as compared to negated_mask which
is so difficult to understand that I don't know how to use it correctly.
Lets take an example, I want pages which are PAGE_IS_WRITTEN and are not
PAGE_IS_FILE. In addition, the pages must be PAGE_IS_PRESENT or
PAGE_IS_SWAPPED. This can be specified as:

required_mask = PAGE_IS_WRITTEN
excluded_mask = PAGE_IS_FILE
anyof_mask = PAGE_IS_PRESETNT | PAGE_IS_SWAP

(a) assume page_flags = 0b1111
skip page as 0b1111 & 0b0010 = true

(b) assume page_flags = 0b1001
select page as 0b1001 & 0b0010 = false

It seemed intuitive. Right? How would you achieve same thing with negated_mask?

required_mask = PAGE_IS_WRITTEN
negated_mask = PAGE_IS_FILE
anyof_mask = PAGE_IS_PRESETNT | PAGE_IS_SWAP

(1) assume page_flags = 0b1111
tested_flags = 0b1111 ^ 0b0010 = 0b1101

(2) assume page_flags = 0b1001
tested_flags = 0b1001 ^ 0b0010 = 0b1011

In (1), we wanted to skip pages which have PAGE_IS_FILE set. But
negated_mask has just masked it and page is still getting tested if it
should be selected and it would get selected. It is wrong.

In (2), the PAGE_IS_FILE bit of page_flags was 0 and got updated to 1 or
PAGE_IS_FILE in tested_flags.

>
>>> IOW my proposal is to replace branches in the masks interpretation (if
>>> in one set then matches but if in another set then doesn't; if flags
>>> match ... ) with plain calculation (flag is matching when equals
>>> ~negated_flags; if flags match the masks ...).
>
> Best Regards
> Michał Mirosław

--
BR,
Muhammad Usama Anjum