Re: [PATCH 1/2] bluetooth: don't include local processing of HCI commands in the command timeout

From: Marcel Holtmann
Date: Sat May 31 2014 - 01:28:16 EST


Hi Alexander,

> I assume the timeout for processing HCI commands was originally intended to
> detect hung bluetooth devices and should not include the time needed locally
> to handle the response to an HCI command. That is important because the time
> needed locally (by the kernel or even userland) to process responses to HCI
> commands varies a lot between systems and HCI commands. That's even more true
> since many actions to HCI command responses are handled inside works which
> might be delayed quite some time, depending on the actual system load.
>
> So stop the timeout as soon as a response to an HCI command was received.
>
> This fixes various problems which resulted in HCI command timeouts and an
> afterwards non-working bluetooth stack, especially on slower systems like
> some ARM devices.
>
> Drawback is that in-kernel problems like deadlocks aren't detected by HCI
> command timeouts anymore, but such problems should be detected and handled
> by other means and not by a timeout where it is hard to specify a value
> reasonable for all possible systems (-configurations, -loads).
>
> Furthermore, if the timeout includes local processing of HCI command
> responses, in-kernel errors like hung tasks might be masked by the
> timeout, because the hung task would be killed by the timeout before
> the hung task would be detected (by other means).
>
> Signed-off-by: Alexander Holler <holler@xxxxxxxxxxxxx>
> ---
> net/bluetooth/hci_event.c | 12 ++++++------
> 1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c
> index 15010a2..94c2dc0 100644
> --- a/net/bluetooth/hci_event.c
> +++ b/net/bluetooth/hci_event.c
> @@ -2338,6 +2338,9 @@ static void hci_cmd_complete_evt(struct hci_dev *hdev, struct sk_buff *skb)
>
> opcode = __le16_to_cpu(ev->opcode);
>
> + if (opcode != HCI_OP_NOP)
> + del_timer(&hdev->cmd_timer);
> +

So I actually wonder if we should move away from a timer and switch to a delayed work item to handle the timeout, and whether that would actually fix this issue.
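
To make the timing question concrete, here is a rough userspace model, not kernel code. It only compares when the command timeout gets cancelled: on receipt of the response (as the patch does) or after local processing has finished. Everything in it is an assumption made for illustration: the struct and function names are hypothetical, time is a logical millisecond counter, and the 2000 ms value is assumed to mirror HCI_CMD_TIMEOUT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed timeout in logical milliseconds (meant to mirror HCI_CMD_TIMEOUT). */
#define MODEL_CMD_TIMEOUT_MS 2000UL

/* Hypothetical stand-in for a cancellable, deferred timeout item. */
struct model_delayed_item {
	bool pending;            /* armed and neither run nor cancelled */
	unsigned long expires;   /* logical time at which it may run */
	void (*fn)(void *data);  /* handler to call on expiry */
	void *data;
};

/* Hypothetical device state: just the timeout item and a result flag. */
struct model_dev {
	struct model_delayed_item cmd_timeout;
	bool timed_out;
};

static void model_arm(struct model_delayed_item *it,
		      unsigned long now, unsigned long delay)
{
	assert(it->fn != NULL);
	it->expires = now + delay;
	it->pending = true;
}

/* Returns true if the item was still pending when cancelled. */
static bool model_cancel(struct model_delayed_item *it)
{
	bool was_pending = it->pending;

	it->pending = false;
	return was_pending;
}

/* Run the handler if the item is armed and its deadline has passed. */
static void model_run_due(struct model_delayed_item *it, unsigned long now)
{
	if (it->pending && now >= it->expires) {
		it->pending = false;
		it->fn(it->data);
	}
}

static void model_cmd_timeout(void *data)
{
	struct model_dev *dev = data;

	dev->timed_out = true;
}

/*
 * The command is sent at t=0, the controller answers at response_at_ms,
 * and local handling of the answer takes processing_ms.
 * cancel_on_receipt selects where the timeout is stopped.
 * Returns true if the command timeout fired.
 */
bool cmd_times_out(unsigned long response_at_ms, unsigned long processing_ms,
		   bool cancel_on_receipt)
{
	struct model_dev dev = { .timed_out = false };
	unsigned long done_at_ms = response_at_ms + processing_ms;

	dev.cmd_timeout.fn = model_cmd_timeout;
	dev.cmd_timeout.data = &dev;
	model_arm(&dev.cmd_timeout, 0, MODEL_CMD_TIMEOUT_MS);

	if (cancel_on_receipt) {
		/* A late controller still trips the timeout before receipt. */
		model_run_due(&dev.cmd_timeout, response_at_ms);
		model_cancel(&dev.cmd_timeout);
	}

	/* Otherwise the timeout stays armed during local processing. */
	model_run_due(&dev.cmd_timeout, done_at_ms);
	model_cancel(&dev.cmd_timeout);

	return dev.timed_out;
}
```

In this model, a response at 100 ms followed by 2500 ms of local processing times out only when the timeout is cancelled after processing, while a controller that answers at 5000 ms times out either way. It says nothing about whether a delayed work item by itself would avoid the problem in the kernel.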

Regards

Marcel
