Re: [PATCH v2] staging: tidspbridge: protect dmm_map properly

From: Felipe Contreras
Date: Tue Dec 28 2010 - 07:12:58 EST


On Tue, Dec 28, 2010 at 12:56 PM, Ohad Ben-Cohen <ohad@xxxxxxxxxx> wrote:
> On Tue, Dec 28, 2010 at 12:36 PM, Felipe Contreras
> <felipe.contreras@xxxxxxxxx> wrote:
>>> You can never really tell who is using the kernel (or will be using
>>> this kernel version), how and under which workload.
>>
>> No, but it's better to address real issues rather than hypothetical.
>
> Right. and both patches do that. one adds locks, and one doesn't.
>
>> However, as I sad, everybody is using proc_map() and proc_un_map()
>> which take a lock, and there are no complaints.
>
> I didn't complain about that; I didn't like you adding locks to the DMA API.
>
>> Ok, can I get your Ack?
>
> Frankly, I don't like the locks you are adding. But as I said, I
> wouldn't resist them as long as it's temporary.
>
>> Then why did you add that check for is_map_obj_used(), and then return
>> -EBUSY? If that can happen, then it can happen when the application is
>> crashing; user-space crashes while kernel-space is in the middle of a
>> proc_*_dma() operation.
>
> I still don't know how exactly you triggered the bug: is gst-dsp
> multithreaded ? and one of its threads invoked proc_un_map() while
> another thread called proc_begin_dma() ?

I haven't investigated why that happens, but kernel-space should not
panic regardless of what user-space does.

> Anyhow, a thread that is calling proc_*_dma() will both increase the
> reference count and decrease it back before going back to user space.
> Otherwise your patch would be problematic as well - who will unlock
> the mutex you take in proc_*_dma() ?

I'm saying that user-space might crash *before* proc_*_dma() finishes,
before the reference count has been decreased.

In my patch there would be no issue because proc_un_map() would wait
until proc_*_dma() has released the lock.

>>>> Sure, but I see this as a broader effort to have finer locking, part of
>>>> this should be to remove the already existing proc_lock.
>>>
>>> Having bad locking is not an excuse for adding more.
>>
>> No, but not being a permanent solution is not an excuse for not fixing
>> a kernel panic right away.
>
> Right. But we have a fix that doesn't add any additional locking... I
> don't see why it can't be taken now, but as I said, I wouldn't resist
> staging it for the next cycle.

Because:
1) Your patch changes 114 lines, mine 18
2) It hasn't been reviewed, nor tested by other people
3) At least I see a potential issue
4) 2.6.37 is imminent

IMO one patch has chances going into 2.6.37, the other one doesn't. I
don't see the problem of pushing my patch to 2.6.37, and once your
patch has been properly reviewed, and tested, put it for the 2.6.38
merge window. Anyway, it's looking more and more that this patch would
not be ack'ed in time, so I guess we would have to wait to see what
other people (Omar?) say, which would probably be 2.6.38 timeline.

Cheers.

--
Felipe Contreras
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/