Re: [PATCH 12/12] Use down_read_unfair() for /sys/<pid>/exe and /sys/<pid>/maps files

From: Michel Lespinasse
Date: Wed May 12 2010 - 19:36:04 EST


On Wed, May 12, 2010 at 3:53 PM, KOSAKI Motohiro
<kosaki.motohiro@xxxxxxxxxxxxxx> wrote:
>> This helps in the following situation:
>> - Thread A takes a page fault while reading or writing memory.
>>   do_page_fault() acquires the mmap_sem for read and blocks on disk
>>   (either reading the page from file, or hitting swap) for a long time.
>> - Thread B does an mmap call and blocks trying to acquire the mmap_sem
>>   for write
>> - Thread C is a monitoring process trying to read every /proc/pid/maps
>>   in the system. This requires acquiring the mmap_sem for read. Thread C
>>   blocks behind B, waiting for A to release the rwsem.  If thread C
>>   could be allowed to run in parallel with A, it would probably get done
>>   long before thread A's disk access completes, thus not actually slowing
>>   down thread B.
>>
>> Test results with down_read_unfair_test (10 seconds):
>>
>> 2.6.33.3:
>> threadA completes ~600 faults
>> threadB completes ~300 mmap/munmap cycles
>> threadC completes ~600 /proc/pid/maps reads
>>
>> 2.6.33.3 + down_read_unfair:
>> threadA completes ~600 faults
>> threadB completes ~300 mmap/munmap cycles
>> threadC completes ~160000 /proc/pid/maps reads
>>
>> Signed-off-by: Michel Lespinasse <walken@xxxxxxxxxx>
>
> Is it good idea?
> So I think /proc shouldn't use unfair thing as backdoor.
> It doesn't only makes performance improvement, but also
> DoS chance is there.

I am not entirely surprised that there is some level of opposition to
this change (which is in part why it went last in the series).

Besides keeping it internal, would there be ways to make it acceptable
to the community? For example, would it be fine if the unfair behavior
were used only when the calling thread runs with root privilege?

In my opinion the optimal behavior would be if the rwsem could be
allowed to be grabbed unfairly only as long as there are still fair
readers on it. However, I don't see how to achieve this given that we
don't want to slow down the regular, fair code paths.
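
To make that policy concrete, here is a minimal user-space sketch of it.
This is a toy state model only, not the kernel rwsem: the struct and all
toy_* names are hypothetical, there is no blocking and no atomics, and
each function just says whether the acquire would succeed right now.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model; NOT the kernel's struct rw_semaphore. */
struct toy_rwsem {
	int fair_readers;	/* readers admitted through the fair path */
	int unfair_readers;	/* readers that bypassed a waiting writer */
	int writers_waiting;	/* writers queued for the lock */
	bool writer_active;	/* a writer currently holds the lock */
};

/* Fair read: refuses to pass a queued or active writer. */
static bool toy_try_read_fair(struct toy_rwsem *s)
{
	if (s->writer_active || s->writers_waiting > 0)
		return false;
	s->fair_readers++;
	return true;
}

/*
 * Restricted unfair read: may pass queued writers, but only while at
 * least one fair reader still holds the lock.
 */
static bool toy_try_read_unfair(struct toy_rwsem *s)
{
	if (toy_try_read_fair(s))
		return true;	/* nobody to bypass, counted as fair */
	if (s->writer_active || s->fair_readers == 0)
		return false;
	s->unfair_readers++;
	return true;
}

static void toy_up_read_fair(struct toy_rwsem *s)
{
	assert(s->fair_readers > 0);
	s->fair_readers--;
}

static void toy_up_read_unfair(struct toy_rwsem *s)
{
	assert(s->unfair_readers > 0);
	s->unfair_readers--;
}

/* A writer joins the wait queue. */
static void toy_writer_wait(struct toy_rwsem *s)
{
	s->writers_waiting++;
}

/* A queued writer gets the lock once every reader has left. */
static bool toy_try_write(struct toy_rwsem *s)
{
	if (s->writer_active || s->writers_waiting == 0)
		return false;
	if (s->fair_readers > 0 || s->unfair_readers > 0)
		return false;
	s->writers_waiting--;
	s->writer_active = true;
	return true;
}
```

With thread A holding a fair read and thread B queued as a writer,
toy_try_read_fair() fails for thread C but toy_try_read_unfair()
succeeds; once A has released, further unfair attempts fail and B gets
the lock. Note that the model needs a separate fair_readers count that
every fair acquire and release must update, which is exactly the kind of
extra cost on the fair path I would like to avoid.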

--
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.