Re: [PATCH] fix race caused by hyperthreads when online an offline cpu

From: Stephane Eranian
Date: Mon Jan 16 2017 - 13:36:25 EST


On Mon, Jan 16, 2017 at 1:53 AM, zhouchengming
<zhouchengming1@xxxxxxxxxx> wrote:
> On 2017/1/16 17:05, Thomas Gleixner wrote:
>>
>> On Mon, 16 Jan 2017, Zhou Chengming wrote:
>>
>> Can you please stop sending the same patch over and over every other day?
>>
>> Granted, things get forgotten, but sending a polite reminder after a week
>> is definitely enough.
>>
>> Maintainers are not machines responding within a split second on every mail
>> they get. And that patch is not so substantial that it justifies that kind
>> of spam.
>>
>
> Very sorry for the noise. We are just not sure this is the right fix because
> it's hard to reproduce.
>
I believe this is the right fix. I tried it and instrumented the code to
verify the thread_id assignment. The problem is easy to reproduce:

$ echo 0 >/sys/devices/system/cpu/cpu2/online
$ echo 1 >/sys/devices/system/cpu/cpu2/online

Normally, on a Haswell desktop part, CPU2 gets thread_id 0 at boot and CPU6
gets thread_id 1. If you offline CPU2 and bring it back online, it gets
thread_id 1, and thus both siblings point to the same exclusive state. The
fix is, indeed, to check whether the sibling is already assigned 1, and if
so to keep 0 for the CPU being onlined.
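
To make the assignment rule concrete, here is a small userspace model of it.
This is only a sketch, not the kernel code or the patch itself: the names
NO_SIBLING, thread_id_unfixed(), thread_id_fixed() and replay_hotplug() are
made up for illustration, and it assumes CPU2 comes up before CPU6 at boot.

```c
#include <assert.h>

/* Hypothetical marker: the sibling has not been brought online yet. */
#define NO_SIBLING (-1)

typedef int (*assign_fn)(int sibling_thread_id);

/*
 * Model of the behaviour described above before the fix: a CPU coming
 * online takes thread_id 1 whenever its sibling is already online.
 */
int thread_id_unfixed(int sibling_thread_id)
{
	assert(sibling_thread_id >= NO_SIBLING && sibling_thread_id <= 1);
	return sibling_thread_id == NO_SIBLING ? 0 : 1;
}

/*
 * Model of the fixed behaviour: take thread_id 1 only if the sibling
 * does not already hold it, otherwise keep 0.
 */
int thread_id_fixed(int sibling_thread_id)
{
	assert(sibling_thread_id >= NO_SIBLING && sibling_thread_id <= 1);
	if (sibling_thread_id == NO_SIBLING)
		return 0;
	return sibling_thread_id == 1 ? 0 : 1;
}

/*
 * Replay boot followed by offline/online of CPU2.
 * ids[0] models CPU2's thread_id, ids[1] models CPU6's thread_id.
 */
void replay_hotplug(assign_fn assign, int ids[2])
{
	ids[0] = assign(NO_SIBLING);	/* boot: CPU2 comes up first (assumed) */
	ids[1] = assign(ids[0]);	/* boot: CPU6 finds CPU2 already online */
	ids[0] = assign(ids[1]);	/* CPU2 offlined, then onlined again */
}
```

Replaying the sequence with thread_id_unfixed() leaves both siblings with
thread_id 1, which is the collision seen after the two echo commands.
With thread_id_fixed(), CPU2 keeps 0 and CPU6 keeps 1.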