Re: [PATCH] cpumask: convert cpumask_of_cpu() with cpumask_of()

From: KOSAKI Motohiro
Date: Tue Apr 26 2011 - 07:33:30 EST


> On Mon, 2011-04-25 at 18:41 +0900, KOSAKI Motohiro wrote:
> > This patch adapt the code to new api fashion.
> >
> > Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx>
> > Cc: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
> > Cc: Mike Galbraith <efault@xxxxxx>
> > Cc: Ingo Molnar <mingo@xxxxxxx>
> > ---
> > kernel/kthread.c | 2 +-
> > 1 files changed, 1 insertions(+), 1 deletions(-)
> >
> > diff --git a/kernel/kthread.c b/kernel/kthread.c
> > index 3b34d27..4102518 100644
> > --- a/kernel/kthread.c
> > +++ b/kernel/kthread.c
> > @@ -202,7 +202,7 @@ void kthread_bind(struct task_struct *p, unsigned int cpu)
> > return;
> > }
> >
> > - p->cpus_allowed = cpumask_of_cpu(cpu);
> > + cpumask_copy(&p->cpus_allowed, cpumask_of(cpu));
> > p->rt.nr_cpus_allowed = 1;
> > p->flags |= PF_THREAD_BOUND;
> > }
>
> But why? Are we going to get rid of cpumask_t (which is a fixed sized
> struct to direct assignment is perfectly fine)?
>
> Also, do we want to convert cpus_allowed to cpumask_var_t? It would save
> quite a lot of memory on distro configs that set NR_CPUS silly high.
> Currently NR_CPUS=4096 configs allocate 512 bytes per task for this
> bitmap, 511 of which will never be used on most machines (510 in the
> near future).
>
> The cost if of course an extra memory dereference in scheduler hot
> paths.. also not nice.

To be honest, I dislike current cpumask_size() always return NR_CPUS. It
screw up almost all of cpumask_var_t benefit. But, we have to eliminate
all dangerous = operator usage before changing its implementation.

(btw, I wonder this limitation doesn't documented at all in code. Should
we add this?)

Thus, now I'm finding all of =operator by tool and replacing it. The second
motivation of eliminating old api is to easy detect obsolete usage by tool.

So, I personally hope task->cpus_allowed convert to cpumask_var_t
as cpuset->cpus_allowed. For some years, storage guys repeatedly alert
stack overflow chance is increasing as time goes.

However, yes, extra memory dereference is also bad. If scheduler folks
dislike cpumask_var_t, I can drop to think convert task->cpus_allowed.

But, if we can't convert cpus_allowed, I'd like to move it into last of
task_struct. because usually cpu_allowed is only accessed first byte
(as you described).

Thanks.