Re: [PATCH] mm/hugetlb: Fix a race between hugetlb sysctl handlers

From: Mike Kravetz
Date: Mon Aug 24 2020 - 17:21:31 EST


On 8/24/20 1:59 PM, Andrew Morton wrote:
> On Sat, 22 Aug 2020 17:53:28 +0800 Muchun Song <songmuchun@xxxxxxxxxxxxx> wrote:
>
>> There is a race between the assignment of `table->data` and write value
>> to the pointer of `table->data` in the __do_proc_doulongvec_minmax().
>
> Where does __do_proc_doulongvec_minmax() write to table->data?
>
> I think you're saying that there is a race between the assignment of
> ctl_table->table in hugetlb_sysctl_handler_common() and the assignment
> of the same ctl_table->table in hugetlb_overcommit_handler()?
>
> Or not, maybe I'm being thick. Can you please describe the race more
> carefully and completely?
>

I too am looking at this now and do not completely understand the race.
It could be that:

hugetlb_sysctl_handler_common
...
table->data = &tmp;

and, do_proc_doulongvec_minmax()
...
return __do_proc_doulongvec_minmax(table->data, table, write, ...
with __do_proc_doulongvec_minmax(void *data, struct ctl_table *table, ...
...
i = (unsigned long *) data;
...
*i = val;

So, __do_proc_doulongvec_minmax can be dereferencing and writing to the pointer
in one thread when hugetlb_sysctl_handler_common is setting it in another?

Another confusing part of the message is the stack trace which includes
...
? set_max_huge_pages+0x3da/0x4f0
? alloc_pool_huge_page+0x150/0x150

which are 'downstream' from these routines. I don't understand why these
are in the trace.

If the race is with the pointer set and dereference/write, then this type of
fix is OK. However, if you really have two 'sysadmin type' global operations
racing then one or both are not going to get what they expected. Instead of
changing the code to 'handle the race', I think it might be acceptable to just
put a big semaphore around it.
--
Mike Kravetz