Re: [RESEND 1/3] venus: core: handle race condititon for core ops

From: Stanimir Varbanov
Date: Fri Aug 21 2020 - 06:59:27 EST


Hi Mansur,

On 8/10/20 12:50 PM, Stanimir Varbanov wrote:
> Hi Mansur,
>
> Thanks for the patches!
>
> On 8/7/20 9:24 AM, Mansur Alisha Shaik wrote:
>> For core ops we are having only write protect but there
>> is no read protect, because of this in multthreading
>> and concurrency, one CPU core is reading without wait
>> which is causing the NULL pointer dereferece crash.
>>
>> one such scenario is as show below, where in one
>> core core->ops becoming NULL and in another core
>> calling core->ops->session_init().
>>
>> CPU: core-7:
>> Call trace:
>> hfi_session_init+0x180/0x1dc [venus_core]
>> vdec_queue_setup+0x9c/0x364 [venus_dec]
>> vb2_core_reqbufs+0x1e4/0x368 [videobuf2_common]
>> vb2_reqbufs+0x4c/0x64 [videobuf2_v4l2]
>> v4l2_m2m_reqbufs+0x50/0x84 [v4l2_mem2mem]
>> v4l2_m2m_ioctl_reqbufs+0x2c/0x38 [v4l2_mem2mem]
>> v4l_reqbufs+0x4c/0x5c
>> __video_do_ioctl+0x2b0/0x39c
>>
>> CPU: core-0:
>> Call trace:
>> venus_shutdown+0x98/0xfc [venus_core]
>> venus_sys_error_handler+0x64/0x148 [venus_core]
>> process_one_work+0x210/0x3d0
>> worker_thread+0x248/0x3f4
>> kthread+0x11c/0x12c
>>
>> Signed-off-by: Mansur Alisha Shaik <mansur@xxxxxxxxxxxxxx>
>> ---
>> drivers/media/platform/qcom/venus/core.c | 2 +-
>> drivers/media/platform/qcom/venus/hfi.c | 5 ++++-
>> 2 files changed, 5 insertions(+), 2 deletions(-)

See below comment, otherwise:

Acked-by: Stanimir Varbanov <stanimir.varbanov@xxxxxxxxxx>

>>
>> diff --git a/drivers/media/platform/qcom/venus/core.c b/drivers/media/platform/qcom/venus/core.c
>> index 203c653..fe99c83 100644
>> --- a/drivers/media/platform/qcom/venus/core.c
>> +++ b/drivers/media/platform/qcom/venus/core.c
>> @@ -64,8 +64,8 @@ static void venus_sys_error_handler(struct work_struct *work)
>> pm_runtime_get_sync(core->dev);
>>
>> hfi_core_deinit(core, true);
>> - hfi_destroy(core);
>> mutex_lock(&core->lock);
>> + hfi_destroy(core);
>
> As my recovery fixes [1] touches this part also, could you please apply
> them on top of yours and re-test?

I'll drop above chunk from the patch because it is already taken into
account in my recovery fixes series and queue up the patch for v5.10.

>
> Otherwise this patch looks good to me.
>
> [1] https://www.spinics.net/lists/linux-arm-msm/msg70092.html
>
>> venus_shutdown(core);
>>
>> pm_runtime_put_sync(core->dev);
>> diff --git a/drivers/media/platform/qcom/venus/hfi.c b/drivers/media/platform/qcom/venus/hfi.c
>> index a211eb9..2eeb31f 100644
>> --- a/drivers/media/platform/qcom/venus/hfi.c
>> +++ b/drivers/media/platform/qcom/venus/hfi.c
>> @@ -195,7 +195,7 @@ EXPORT_SYMBOL_GPL(hfi_session_create);
>> int hfi_session_init(struct venus_inst *inst, u32 pixfmt)
>> {
>> struct venus_core *core = inst->core;
>> - const struct hfi_ops *ops = core->ops;
>> + const struct hfi_ops *ops;
>> int ret;
>>
>> if (inst->state != INST_UNINIT)
>> @@ -204,10 +204,13 @@ int hfi_session_init(struct venus_inst *inst, u32 pixfmt)
>> inst->hfi_codec = to_codec_type(pixfmt);
>> reinit_completion(&inst->done);
>>
>> + mutex_lock(&core->lock);
>> + ops = core->ops;
>> ret = ops->session_init(inst, inst->session_type, inst->hfi_codec);
>> if (ret)
>> return ret;
>>
>> + mutex_unlock(&core->lock);
>> ret = wait_session_msg(inst);
>> if (ret)
>> return ret;
>>
>

--
regards,
Stan