Re: [PATCH net-next v4 4/4] net: tun: track dropped skb via kfree_skb_reason()

From: Dongli Zhang
Date: Wed Mar 02 2022 - 17:22:50 EST


Hi Jakub,

On 3/2/22 11:17 AM, Jakub Kicinski wrote:
> On Wed, 2 Mar 2022 10:19:30 -0800 Dongli Zhang wrote:
>> On 3/1/22 6:50 PM, Jakub Kicinski wrote:
>>> On Sat, 26 Feb 2022 00:49:29 -0800 Dongli Zhang wrote:
>>>> + SKB_DROP_REASON_SKB_PULL, /* failed to pull sk_buff data */
>>>> + SKB_DROP_REASON_SKB_TRIM, /* failed to trim sk_buff data */
>>>
>>> IDK if these are not too low level and therefore lacking meaning.
>>>
>>> What are your thoughts David?
>>>
>>> Would it be better to up level the names a little bit and call SKB_PULL
>>> something like "HDR_TRUNC" or "HDR_INV" or "HDR_ERR" etc or maybe
>>> "L2_HDR_ERR" since in this case we seem to be pulling off ETH_HLEN?
>>
>> This is for device drivers, and I think in most cases people who understand the
>> source code will be involved. I think SKB_PULL is more meaningful to people
>> who understand the source code.
>>
>> The functions that pull data into an skb are commonly used with the same
>> pattern, and not only for ETH_HLEN. E.g., I randomly found the following in
>> the kernel source code.
>>
>> static rx_handler_result_t macsec_handle_frame(struct sk_buff **pskb)
>> {
>> ... ...
>>         pulled_sci = pskb_may_pull(skb, macsec_extra_len(true));
>>         if (!pulled_sci) {
>>                 if (!pskb_may_pull(skb, macsec_extra_len(false)))
>>                         goto drop_direct;
>>         }
>> ... ...
>> drop_direct:
>>         kfree_skb(skb);
>>         *pskb = NULL;
>>         return RX_HANDLER_CONSUMED;
>>
>>
>> About 'L2_HDR_ERR', I am curious what the user/administrator would do as the
>> next step, whereas 'SKB_PULL' makes it very clear to developers which kernel
>> operation (e.g., pulling some protocol/hdr data into the sk_buff data) has the issue.
>>
>> I may use 'L2_HDR_ERR' if you prefer.
>
> We don't have to break it out per layer if you prefer. Let's call it
> HDR_TRUNC.
>
> I don't like SKB_PULL, people using these trace points are as likely
> to be BPF developers as kernel developers and skb_pull will not be
> meaningful to them. Besides the code can check if header is not
> truncated in other ways than pskb_may_pull(). And calling things
> by the name of the helper that failed is bad precedent.

I will switch to SKB_DROP_REASON_HDR_TRUNC.
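
To make sure we mean the same thing by HDR_TRUNC, here is a rough user-space
sketch of the semantics I have in mind. It is not the patch itself: the enum,
MOCK_ETH_HLEN and check_l2_header() are hypothetical stand-ins, and the real
code would still rely on the existing skb helpers to detect the short header.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel definitions. */
enum drop_reason {
	DROP_REASON_NONE,	/* packet is not dropped */
	DROP_REASON_HDR_TRUNC,	/* packet is too short to hold the header */
};

#define MOCK_ETH_HLEN 14	/* Ethernet header length in bytes */

/* Report HDR_TRUNC when 'len' bytes cannot contain a full L2 header.
 * The reason names what is wrong with the packet, not which helper
 * happened to detect it.
 */
enum drop_reason check_l2_header(size_t len)
{
	if (len < MOCK_ETH_HLEN)
		return DROP_REASON_HDR_TRUNC;
	return DROP_REASON_NONE;
}
```

The same reason could then be reused by any path that finds a header shorter
than expected, whichever helper performs the check.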

>
>>> For SKB_TRIM the error comes from allocation failures, there may be
>>> a whole bunch of skb helpers which will fail only under mem pressure,
>>> would it be better to identify them and return some ENOMEM related
>>> reason, since, most likely, those will be noise to whoever is tracking
>>> real errors?
>>
>> The reasons I want to use SKB_TRIM:
>>
>> 1. To have SKB_PULL and SKB_TRIM (perhaps more SKB_XXX in the future in the same
>> set).
>>
>> 2. Although SKB_TRIM is currently always caused by ENOMEM, if pskb_trim() gains
>> new return values in the future, an ENOMEM-based reason will no longer be valid.
>>
>>
>> I may use SKB_DROP_REASON_NOMEM if you prefer.
>>
>> Another concern is that many functions may return -ENOMEM. If there are two
>> "goto drop" paths that both report -ENOMEM, we will not be able to tell
>> which function caused the sk_buff to be dropped, e.g.,
>>
>> if (function_A()) {
>>         reason = -ENOMEM;
>>         goto drop;
>> }
>>
>> if (function_B()) {
>>         reason = -ENOMEM;
>>         goto drop;
>> }
>
> Are you saying that you're intending to break out skb drop reasons
> by what entity failed to allocate memory? I'd think "skb was dropped

Yes.

> because of OOM" is what should be reported. What we were trying to
> allocate is not very relevant (and can be gotten from the stack trace
> if needed).

I think OOM is not enough. Although it may not be the case in this patchset,
sometimes an allocation fails because we are requesting a large chunk of
physically contiguous pages (kmalloc vs. vmalloc) even though plenty of free
pages are still available.

As a kernel developer, it is very important for me to identify the specific
line/function and the specific data structure that caused the error. E.g., the
bug filer may be trying to find out which line is causing trouble.

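A driver function is less likely to call pskb_trim() more than once than it is
to have several paths that fail with ENOMEM, so SKB_TRIM narrows down the
failing call site better than a generic ENOMEM reason.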
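
To illustrate the point about call sites, here is a rough user-space sketch,
not code from the patchset. The enum values and both functions are
hypothetical: process_shared() reports one generic reason for every failure,
while process_split() gives the trim failure its own reason.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical reasons for illustration; not the kernel enum. */
enum drop_reason {
	DROP_NONE,
	DROP_NOMEM,	/* generic out-of-memory reason */
	DROP_SKB_TRIM,	/* trimming the buffer failed */
};

/* Both failure sites share one reason, so the reported value
 * cannot tell the caller which step failed.
 */
enum drop_reason process_shared(bool trim_fails, bool other_fails)
{
	if (trim_fails)
		return DROP_NOMEM;
	if (other_fails)
		return DROP_NOMEM;
	return DROP_NONE;
}

/* The trim failure has its own reason, so the reported value
 * identifies the failing step without a stack trace.
 */
enum drop_reason process_split(bool trim_fails, bool other_fails)
{
	if (trim_fails)
		return DROP_SKB_TRIM;
	if (other_fails)
		return DROP_NOMEM;
	return DROP_NONE;
}
```

With the shared reason, a tracer sees the same value for both failures and has
to fall back to the stack trace; with the split reason, the value alone is
enough.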

I am a user of this patchset myself, and I would like to make my future work easier :)

>
>>>> SKB_DROP_REASON_DEV_HDR, /* there is something wrong with
>>>> * device driver specific header
>>>> */
>>>> + SKB_DROP_REASON_DEV_READY, /* device is not ready */
>>>
>>> What is ready? link is not up? peer not connected? can we expand?
>>
>> In this patchset, it is for either:
>>
>> - tun->tfiles[txq] is not set, or
>>
>> - !(tun->dev->flags & IFF_UP)
>>
>> I want to make it very generic so that the sk_buff dropped due to any device
>> level data structure that is not up/ready/initialized/allocated will use this
>> reason in the future.
>
> Let's expand the documentation so someone reading thru the enum can
> feel confident if they are using this reason correctly.
>
> Side note - you may want to switch to inline comments to make writing
> more verbose documentation easier, I mean:
>
> /* This is the explanation of reason one which explains what
> * reason one means, and how it should be used. We can make
> * use of full line width this way.
> */
> SKB_DROP_REASON_ONE,
> /* And this is an explanation for reason two. */
> SKB_DROP_REASON_TWO,
>

I will expand the comments.
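
As a starting point, the expanded comment could look roughly like the sketch
below. It is a user-space mock only: struct mock_dev, MOCK_IFF_UP and
dev_ready_check() are hypothetical names, the queue pointer stands in for
tun->tfiles[txq], and the final wording in the enum may differ.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_IFF_UP 0x1	/* stand-in for the kernel's IFF_UP flag */

/* Hypothetical reasons for illustration; not the kernel enum. */
enum drop_reason {
	DROP_NONE,
	/* The device is not ready to handle the packet: the interface
	 * is not up (IFF_UP is clear), or a device-level data structure
	 * the packet needs (e.g. the queue it maps to) is not set up.
	 */
	DROP_DEV_READY,
};

struct mock_dev {
	unsigned int flags;
	void *queue;	/* stands in for tun->tfiles[txq] */
};

/* Mirror the two conditions from this patchset that map to DEV_READY. */
enum drop_reason dev_ready_check(const struct mock_dev *dev)
{
	if (!dev->queue)
		return DROP_DEV_READY;
	if (!(dev->flags & MOCK_IFF_UP))
		return DROP_DEV_READY;
	return DROP_NONE;
}
```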

Thank you very much!

Dongli Zhang