Re: [PATCH v2] brd: fix sleeping function called from invalid context in brd_insert_page()
From: Jens Axboe
Date: Tue Jul 01 2025 - 10:11:34 EST
On 7/1/25 1:38 AM, Yu Kuai wrote:
> Hi,
>
> 在 2025/07/01 11:00, Jens Axboe 写道:
>> On 6/30/25 7:28 PM, Yu Kuai wrote:
>>> Hi,
>>>
>>> ? 2025/06/30 23:28, Jens Axboe ??:
>>>> On 6/30/25 9:24 AM, Jens Axboe wrote:
>>>>> On 6/30/25 5:28 AM, Yu Kuai wrote:
>>>>>> From: Yu Kuai <yukuai3@xxxxxxxxxx>
>>>>>>
>>>>>> __xa_cmpxchg() is called with rcu_read_lock(), and it will allocate
>>>>>> memory if necessary.
>>>>>>
>>>>>> Fix the problem by moving rcu_read_lock() after __xa_cmpxchg(), meanwhile,
>>>>>> it still should be held before xa_unlock(), prevent returned page to be
>>>>>> freed by concurrent discard.
>>>>>
>>>>> The rcu locking in there is a bit of a mess, imho. What _exactly_ is the
>>>>> rcu read side locking protecting? Is it only needed around the lookup
>>>>> and insert? We even hold it over the kmap and copy, which seems very
>>>>> heavy handed.
>>>>
>>>> Gah it's holding the page alive too. Can't we just grab a ref to the
>>>> page when inserting it, and drop that at free time? It would be a lot
>>>> better to have only the lookup be RCU protected, having the full
>>>> copies under it seems kind of crazy.
>>>
>>> In this case, we must grab a ref to the page for each read/write as
>>> well, I choose RCU because I think it has less performance overhead than
>>> page ref, which is atomic. BTW, I thought copy at most one page is
>>> lightweight, if this is not true, I agree page ref is better.
>>
>> Right, you'd need to grab a ref. I do think that is (by far) the better
>> solution. Yes if you microbenchmark I'm sure the current approach will
>> look fine, but it's a heavy section inside an rcu read lock and will
>> hold off the grace period.
>>
>> So yeah, I do think it'd be a lot better to do proper page references on
>> lookup+free, and have just the lookup be behind rcu.
>>
>
> Ok, and just to be sure, since the rcu is introduced before the fixed
> tag, do you think it's better to do cleanups after this patch, I prefer
> this way, or fix this problem directly by page ref?
Yeah probably best to do the simple fix, and then base the further work
on that.
--
Jens Axboe