Re: [PATCH 2/2] cpuhotplug: make get_online_cpus() scalability by using percpu counter

From: Lai Jiangshan
Date: Mon Apr 12 2010 - 08:31:11 EST


Peter Zijlstra wrote:
> On Mon, 2010-04-12 at 17:24 +0800, Lai Jiangshan wrote:
>> Oleg Nesterov wrote:
>>> On 04/07, Oleg Nesterov wrote:
>>>> On 04/07, Lai Jiangshan wrote:
>>>>> The old get_online_cpus() is read-preference; I think the goal of this
>>>>> property is to allow get_online_cpus()/put_online_cpus() to be nested.
>>>> Sure, I understand why you added task_struct->get_online_cpus_nest.
>>>>
>>>>> and use a per-task counter to allow get_online_cpus()/put_online_cpus()
>>>>> to be nested. I think this trade is absolutely worth it.
>>>> As I said, I am not going to argue. I can't justify this tradeoff.
>>> But, I must admit, I'd like to avoid adding the new member to task_struct.
>>>
>>> What do you think about the code below?
>>>
>>> I didn't even try to compile it, just to explain what I mean.
>>>
>>> In short: we have the per-cpu fast counters, plus the slow counter
>>> which is only used when cpu_hotplug_begin() is in progress.
>>>
>>> Oleg.
>>>
>> get_online_cpus() in your code is still read-preference.
>> I wish we would drop this property of get_online_cpus().
>
> Why?

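Because a read-preference rwlock causes writer-side starvation.

sched_setaffinity() takes get_online_cpus(), so a user who runs the
following code can starve cpu hotplug: with 100 processes calling
sched_setaffinity() in a loop, the read sides keep overlapping and the
writer never sees the reader count drop to zero.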

Lai

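#define _GNU_SOURCE
#include <sched.h>
#include <sys/types.h>
#include <unistd.h>
#include <stdlib.h>

#define NCPU 4
#define NPROCESS 100

/* Global, so zero-initialized: the mask starts out empty. */
cpu_set_t set;
pid_t target;

/* Loop forever, pinning 'target' to one randomly chosen cpu at a time. */
void stress_test(void)
{
	int cpu;

	srand((int)target);
	for (;;) {
		cpu = rand() % NCPU;
		CPU_SET(cpu, &set);
		/* Return value deliberately ignored; we only want the load. */
		sched_setaffinity(target, sizeof(set), &set);
		CPU_CLR(cpu, &set);
	}
}

int main(int argc, char *argv[])
{
	pid_t ret;
	int i;

	/*
	 * Fork NPROCESS - 1 children. Each child hammers the pid that was
	 * created just before it; the parent hammers the last child.
	 */
	target = getpid();
	for (i = 1; i < NPROCESS; i++) {
		ret = fork();
		if (ret < 0)
			break;
		else if (ret)
			target = ret;
		else
			stress_test();
	}

	stress_test();
	return 0;	/* not reached */
}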
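
To show the starvation argument in isolation, here is a minimal
user-space model. It is only a sketch, not kernel code: toy_rwlock,
toy_read_trylock(), toy_write_trylock() and toy_simulate() are
hypothetical names invented for this illustration, and the model is a
deterministic single-threaded schedule, not a real lock. Two readers
overlap their critical sections at every step; the writer can enter only
when the reader count is zero.

```c
#include <stdbool.h>

/* Hypothetical toy lock state; not the kernel's cpu_hotplug structure. */
struct toy_rwlock {
	int readers;		/* readers currently inside the read side */
	bool writer_waiting;	/* a writer has announced itself */
	bool writer_prefer;	/* policy: waiting writer blocks new readers */
};

/* Returns true if the reader was admitted. */
static bool toy_read_trylock(struct toy_rwlock *l)
{
	if (l->writer_prefer && l->writer_waiting)
		return false;
	l->readers++;
	return true;
}

static void toy_read_unlock(struct toy_rwlock *l)
{
	l->readers--;
}

/* The writer announces itself and succeeds only if no reader is inside. */
static bool toy_write_trylock(struct toy_rwlock *l)
{
	l->writer_waiting = true;
	return l->readers == 0;
}

/*
 * Two readers with overlapping critical sections: at each step the
 * writer tries first, then one reader tries to enter, then the other
 * reader leaves. Returns the step at which the writer got the lock,
 * or -1 if it never did within 'steps'.
 */
static int toy_simulate(bool writer_prefer, int steps)
{
	struct toy_rwlock l = { 0, false, writer_prefer };
	bool inside[2] = { false, false };
	int step;

	/* Reader 0 is already inside when the writer shows up. */
	inside[0] = toy_read_trylock(&l);

	for (step = 0; step < steps; step++) {
		int in = (step + 1) % 2;	/* reader that tries to enter */
		int out = step % 2;		/* reader that then leaves */

		if (toy_write_trylock(&l))
			return step;

		if (!inside[in])
			inside[in] = toy_read_trylock(&l);
		if (inside[out]) {
			toy_read_unlock(&l);
			inside[out] = false;
		}
	}
	return -1;
}
```

With the read-preference policy, toy_simulate(false, n) never admits the
writer for any n, because a new reader always enters before the previous
one leaves. With toy_simulate(true, n), new readers are held back once
the writer is waiting, and the writer gets in as soon as the reader that
was already inside has left.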


