Re: [PATCH net v2] NFC: hci: fix sleep in atomic context bugs in nfc_hci_hcp_message_tx

From: duoming
Date: Tue May 17 2022 - 11:28:00 EST


Hello,

On Tue, 17 May 2022 13:42:41 +0200 Krzysztof wrote:

> On 17/05/2022 12:55, Duoming Zhou wrote:
> > There are sleep in atomic context bugs when the request to secure
> > element of st21nfca is timeout. The root cause is that kzalloc and
> > alloc_skb with GFP_KERNEL parameter and mutex_lock are called in
> > st21nfca_se_wt_timeout which is a timer handler. The call tree shows
> > the execution paths that could lead to bugs:
> >
> > (Interrupt context)
> > st21nfca_se_wt_timeout
> > nfc_hci_send_event
> > nfc_hci_hcp_message_tx
> > kzalloc(..., GFP_KERNEL) //may sleep
> > alloc_skb(..., GFP_KERNEL) //may sleep
> > mutex_lock() //may sleep
> >
> > This patch changes allocation mode of kzalloc and alloc_skb from
> > GFP_KERNEL to GFP_ATOMIC and changes mutex_lock to spin_lock in
> > order to prevent atomic context from sleeping.
> >
> > Fixes: 2130fb97fecf ("NFC: st21nfca: Adding support for secure element")
> > Signed-off-by: Duoming Zhou <duoming@xxxxxxxxxx>
> > ---
> > Changes in v2:
> > - Change mutex_lock to spin_lock.
> >
> > include/net/nfc/hci.h | 3 ++-
> > net/nfc/hci/core.c | 18 +++++++++---------
> > net/nfc/hci/hcp.c | 10 +++++-----
> > 3 files changed, 16 insertions(+), 15 deletions(-)
> >
> > diff --git a/include/net/nfc/hci.h b/include/net/nfc/hci.h
> > index 756c11084f6..8f66e6e6b91 100644
> > --- a/include/net/nfc/hci.h
> > +++ b/include/net/nfc/hci.h
> > @@ -103,7 +103,8 @@ struct nfc_hci_dev {
> >
> > bool shutting_down;
> >
> > - struct mutex msg_tx_mutex;
> > + /* The spinlock is used to protect resources related with hci message TX */
> > + spinlock_t msg_tx_spin;
> >
> > struct list_head msg_tx_queue;
> >
> > diff --git a/net/nfc/hci/core.c b/net/nfc/hci/core.c
> > index ceb87db57cd..fa22f9fe5fc 100644
> > --- a/net/nfc/hci/core.c
> > +++ b/net/nfc/hci/core.c
> > @@ -68,7 +68,7 @@ static void nfc_hci_msg_tx_work(struct work_struct *work)
> > struct sk_buff *skb;
> > int r = 0;
> >
> > - mutex_lock(&hdev->msg_tx_mutex);
> > + spin_lock(&hdev->msg_tx_spin);
> > if (hdev->shutting_down)
> > goto exit;
>
> How did you test your patch?
>
> Did you check, really check, that this can be an atomic (non-sleeping)
> section?
>
> I have doubts because I found at least one path leading to device_lock
> (which is a mutex) called within your new code.

The nfc_hci_hcp_message_tx() is called by both process context(hci_dev_up and so on)
and interrupt context(st21nfca_se_wt_timeout()). The process context(hci_dev_up and so on)
calls device_lock, but I think calling spin_lock() within device_lock() is ok. There is
no device_lock() called within spin_lock().

The spinlock could also improve the performance of the program, because processing the
hci messages should be finished in a short time.

> Before sending a new version, please wait for discussion to reach some
> consensus. The quality of these fixes is really poor. :(

Ok, I will wait for discussion to reach consensus.

Best regards,
Duoming Zhou