Re: [RFC PATCH] mm: thp: Add per-mm_struct flag to control THP

From: Oleg Nesterov
Date: Sat Jan 11 2014 - 10:54:23 EST


On 01/10, Alex Thorlton wrote:
>
> This patch adds an mm flag (MMF_THP_DISABLE) to disable transparent
> hugepages using prctl. It is based on my original patch to add a
> per-task_struct flag to disable THP:

I'll leave the question of whether we need this feature to other
reviewers, although personally I think it probably makes sense anyway.

But the patch doesn't look nice imho.

> @@ -373,7 +373,15 @@ extern int get_dumpable(struct mm_struct *mm);
> #define MMF_HAS_UPROBES 19 /* has uprobes */
> #define MMF_RECALC_UPROBES 20 /* MMF_HAS_UPROBES can be wrong */
>
> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> +#define MMF_THP_DISABLE 21 /* disable THP for this mm */
> +#define MMF_THP_DISABLE_MASK (1 << MMF_THP_DISABLE)
> +
> +#define MMF_INIT_MASK (MMF_DUMPABLE_MASK | MMF_DUMP_FILTER_MASK | MMF_THP_DISABLE_MASK)
> +#else
> #define MMF_INIT_MASK (MMF_DUMPABLE_MASK | MMF_DUMP_FILTER_MASK)
> +#endif

It would be nice to reduce the number of ifdefs. Why can't we define
MMF_THP_DISABLE unconditionally and include it in MMF_INIT_MASK?
Or define it as 0 if !CONFIG_TRANSPARENT_HUGEPAGE. But this is minor.
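
To illustrate the second option, here is a minimal user-space sketch. It reads "define it as 0" as applying to the mask rather than the bit number, which is an assumption. The two dump masks are stand-in values, DEMO_CONFIG_THP is a hypothetical switch standing in for CONFIG_TRANSPARENT_HUGEPAGE, and inherited_flags() is a made-up helper that only models how MMF_INIT_MASK filters the flags copied at fork.

```c
/* Stand-in values for illustration only (assumptions, not kernel source). */
#define MMF_DUMPABLE_MASK	0x3UL
#define MMF_DUMP_FILTER_MASK	0x1fcUL

#define MMF_THP_DISABLE		21	/* bit number from the quoted patch */

/* Hypothetical switch; comment it out to model a !THP build. */
#define DEMO_CONFIG_THP 1

#ifdef DEMO_CONFIG_THP
#define MMF_THP_DISABLE_MASK	(1UL << MMF_THP_DISABLE)
#else
#define MMF_THP_DISABLE_MASK	0UL	/* contributes nothing to the mask */
#endif

/* A single definition now serves both configurations. */
#define MMF_INIT_MASK \
	(MMF_DUMPABLE_MASK | MMF_DUMP_FILTER_MASK | MMF_THP_DISABLE_MASK)

/* Hypothetical helper: the flags a new mm would keep from its parent. */
static unsigned long inherited_flags(unsigned long parent_flags)
{
	return parent_flags & MMF_INIT_MASK;
}
```

With the switch defined, a parent that has bit 21 set passes it on; with the switch removed, the same expression drops the bit without a second MMF_INIT_MASK definition.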

> +#define PR_SET_THP_DISABLE 41
> +#define PR_CLEAR_THP_DISABLE 42
> +#define PR_GET_THP_DISABLE 43

Why can't we add just two PR_ options, set and get?
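
From the caller's side, a two-option interface could look like the sketch below. The wrapper names thp_set_disabled() and thp_get_disabled() are hypothetical. The fallback numbers (41 for set, 42 for get) are the values found in later mainline headers, not the numbering of the patch under review, so treat them as an assumption; on a kernel without these options the calls simply fail with -1.

```c
#include <sys/prctl.h>

/* Fallbacks for older headers; values as in later mainline (assumption:
 * the patch under review assigns 42 and 43 differently). */
#ifndef PR_SET_THP_DISABLE
#define PR_SET_THP_DISABLE 41
#endif
#ifndef PR_GET_THP_DISABLE
#define PR_GET_THP_DISABLE 42
#endif

/* One option both sets and clears: arg2 carries the new value. */
static int thp_set_disabled(int disable)
{
	return prctl(PR_SET_THP_DISABLE, (unsigned long)(disable != 0),
		     0UL, 0UL, 0UL);
}

/* Returns 1 if THP is disabled for this mm, 0 if not, -1 on error. */
static int thp_get_disabled(void)
{
	return prctl(PR_GET_THP_DISABLE, 0UL, 0UL, 0UL, 0UL);
}
```

A separate "clear" option becomes unnecessary because thp_set_disabled(0) already covers that case.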

> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -818,6 +818,7 @@ struct mm_struct *dup_mm(struct task_struct *tsk)
> #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS
> 	mm->pmd_huge_pte = NULL;
> #endif
> +
> 	if (!mm_init(mm, tsk))
> 		goto fail_nomem;

Why? This looks like an accidental change.

> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -1835,6 +1835,42 @@ static int prctl_get_tid_address(struct task_struct *me, int __user **tid_addr)
> }
> #endif
>
> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> +static int prctl_set_thp_disable(struct task_struct *me)
> +{
> +	set_bit(MMF_THP_DISABLE, &me->mm->flags);
> +	return 0;
> +}
> +
> +static int prctl_clear_thp_disable(struct task_struct *me)
> +{
> +	clear_bit(MMF_THP_DISABLE, &me->mm->flags);
> +	return 0;
> +}
> +
> +static int prctl_get_thp_disable(struct task_struct *me,
> +				  int __user *thp_disabled)
> +{
> +	return put_user(test_bit(MMF_THP_DISABLE, &me->mm->flags), thp_disabled);
> +}
> +#else
> +static int prctl_set_thp_disable(struct task_struct *me)
> +{
> +	return -EINVAL;
> +}
> +
> +static int prctl_clear_thp_disable(struct task_struct *me)
> +{
> +	return -EINVAL;
> +}
> +
> +static int prctl_get_thp_disable(struct task_struct *me,
> +				  int __user *thp_disabled)
> +{
> +	return -EINVAL;
> +}
> +#endif
> +
> SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
> 		unsigned long, arg4, unsigned long, arg5)
> {
> @@ -1998,6 +2034,15 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
> 		if (arg2 || arg3 || arg4 || arg5)
> 			return -EINVAL;
> 		return current->no_new_privs ? 1 : 0;
> +	case PR_SET_THP_DISABLE:
> +		error = prctl_set_thp_disable(me);
> +		break;
> +	case PR_CLEAR_THP_DISABLE:
> +		error = prctl_clear_thp_disable(me);
> +		break;
> +	case PR_GET_THP_DISABLE:
> +		error = prctl_get_thp_disable(me, (int __user *) arg2);
> +		break;
> 	default:
> 		error = -EINVAL;
> 		break;

I simply can't understand this; it all looks like overkill. Can't you
simply add

	#ifdef CONFIG_TRANSPARENT_HUGEPAGE
	case PR_GET_THP_DISABLE:
		error = test_bit(MMF_THP_DISABLE, &me->mm->flags);
		break;
	case PR_SET_THP_DISABLE:
		if (arg2)
			set_bit(MMF_THP_DISABLE, &me->mm->flags);
		else
			clear_bit(MMF_THP_DISABLE, &me->mm->flags);
		break;
	#endif

into sys_prctl() ?
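
To make the idea concrete, here is a minimal user-space model of that switch. It is a sketch under assumptions, not kernel code: struct demo_mm, demo_prctl(), DEMO_GET and DEMO_SET are hypothetical names, and plain bit operations stand in for the atomic set_bit(), clear_bit() and test_bit() helpers.

```c
#include <errno.h>

#define MMF_THP_DISABLE	21	/* bit number from the quoted patch */

/* Hypothetical option numbers, for this model only. */
#define DEMO_GET	1
#define DEMO_SET	2

/* Hypothetical stand-in for the flags word in mm_struct. */
struct demo_mm {
	unsigned long flags;
};

/* Models the two cases inlined into the prctl switch. */
static long demo_prctl(struct demo_mm *mm, int option, unsigned long arg2)
{
	long error = 0;

	switch (option) {
	case DEMO_GET:
		/* The flag state is the return value; no user pointer. */
		error = !!(mm->flags & (1UL << MMF_THP_DISABLE));
		break;
	case DEMO_SET:
		/* arg2 selects set or clear, so no third option is needed. */
		if (arg2)
			mm->flags |= 1UL << MMF_THP_DISABLE;
		else
			mm->flags &= ~(1UL << MMF_THP_DISABLE);
		break;
	default:
		error = -EINVAL;
		break;
	}
	return error;
}
```

Returning the flag directly from the get case removes the put_user() call, and folding set and clear into one case removes the helper functions and their -EINVAL stubs.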

Oleg.
