[098/251] xen-netfront: pull on receive skb may need to happen earlier

From: Steven Rostedt
Date: Thu Sep 12 2013 - 01:40:55 EST


3.6.11.9-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: Jan Beulich <JBeulich@xxxxxxxx>

[ Upstream commit 093b9c71b6e450e375f4646ba86faed0195ec7df ]

Due to commit 3683243b ("xen-netfront: use __pskb_pull_tail to ensure
linear area is big enough on RX") xennet_fill_frags() may end up
filling MAX_SKB_FRAGS + 1 fragments in a receive skb, and only reduce
the fragment count subsequently via __pskb_pull_tail(). That's a
result of xennet_get_responses() allowing a maximum of one more slot to
be consumed (and intermediately transformed into a fragment) if the
head slot has a size less than or equal to RX_COPY_THRESHOLD.

Hence we need to adjust xennet_fill_frags() to pull earlier if we
reached the maximum fragment count - due to the described behavior of
xennet_get_responses() this guarantees that at least the first fragment
will get completely consumed, and hence the fragment count reduced.

In order to not needlessly call __pskb_pull_tail() twice, make the
original call conditional upon the pull target not having been reached
yet, and defer the newly added one as much as possible (an alternative
would have been to always call the function right before the call to
xennet_fill_frags(), but that would imply more frequent cases of
needing to call it twice).

Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
Acked-by: Wei Liu <wei.liu2@xxxxxxxxxx>
Cc: Ian Campbell <ian.campbell@xxxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx (3.6 onwards)
Acked-by: Ian Campbell <ian.campbell@xxxxxxxxxx>
Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx>
Signed-off-by: Steven Rostedt <rostedt@xxxxxxxxxxx>
---
drivers/net/xen-netfront.c | 50 ++++++++++++--------------------------------
1 file changed, 13 insertions(+), 37 deletions(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index ecac034..110c26c 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -275,8 +275,7 @@ no_skb:
break;
}

- __skb_fill_page_desc(skb, 0, page, 0, 0);
- skb_shinfo(skb)->nr_frags = 1;
+ skb_add_rx_frag(skb, 0, page, 0, 0, PAGE_SIZE);
__skb_queue_tail(&np->rx_batch, skb);
}

@@ -771,7 +770,6 @@ static RING_IDX xennet_fill_frags(struct netfront_info *np,
struct sk_buff_head *list)
{
struct skb_shared_info *shinfo = skb_shinfo(skb);
- int nr_frags = shinfo->nr_frags;
RING_IDX cons = np->rx.rsp_cons;
struct sk_buff *nskb;

@@ -780,19 +778,21 @@ static RING_IDX xennet_fill_frags(struct netfront_info *np,
RING_GET_RESPONSE(&np->rx, ++cons);
skb_frag_t *nfrag = &skb_shinfo(nskb)->frags[0];

- __skb_fill_page_desc(skb, nr_frags,
- skb_frag_page(nfrag),
- rx->offset, rx->status);
+ if (shinfo->nr_frags == MAX_SKB_FRAGS) {
+ unsigned int pull_to = NETFRONT_SKB_CB(skb)->pull_to;

- skb->data_len += rx->status;
+ BUG_ON(pull_to <= skb_headlen(skb));
+ __pskb_pull_tail(skb, pull_to - skb_headlen(skb));
+ }
+ BUG_ON(shinfo->nr_frags >= MAX_SKB_FRAGS);
+
+ skb_add_rx_frag(skb, shinfo->nr_frags, skb_frag_page(nfrag),
+ rx->offset, rx->status, PAGE_SIZE);

skb_shinfo(nskb)->nr_frags = 0;
kfree_skb(nskb);
-
- nr_frags++;
}

- shinfo->nr_frags = nr_frags;
return cons;
}

@@ -878,7 +878,8 @@ static int handle_incoming_queue(struct net_device *dev,
while ((skb = __skb_dequeue(rxq)) != NULL) {
int pull_to = NETFRONT_SKB_CB(skb)->pull_to;

- __pskb_pull_tail(skb, pull_to - skb_headlen(skb));
+ if (pull_to > skb_headlen(skb))
+ __pskb_pull_tail(skb, pull_to - skb_headlen(skb));

/* Ethernet work: Delayed to here as it peeks the header. */
skb->protocol = eth_type_trans(skb, dev);
@@ -964,35 +965,10 @@ err:
skb_shinfo(skb)->frags[0].page_offset = rx->offset;
skb_frag_size_set(&skb_shinfo(skb)->frags[0], rx->status);
skb->data_len = rx->status;
+ skb->len += rx->status;

i = xennet_fill_frags(np, skb, &tmpq);

- /*
- * Truesize approximates the size of true data plus
- * any supervisor overheads. Adding hypervisor
- * overheads has been shown to significantly reduce
- * achievable bandwidth with the default receive
- * buffer size. It is therefore not wise to account
- * for it here.
- *
- * After alloc_skb(RX_COPY_THRESHOLD), truesize is set
- * to RX_COPY_THRESHOLD + the supervisor
- * overheads. Here, we add the size of the data pulled
- * in xennet_fill_frags().
- *
- * We also adjust for any unused space in the main
- * data area by subtracting (RX_COPY_THRESHOLD -
- * len). This is especially important with drivers
- * which split incoming packets into header and data,
- * using only 66 bytes of the main data area (see the
- * e1000 driver for example.) On such systems,
- * without this last adjustement, our achievable
- * receive throughout using the standard receive
- * buffer size was cut by 25%(!!!).
- */
- skb->truesize += skb->data_len - RX_COPY_THRESHOLD;
- skb->len += skb->data_len;
-
if (rx->flags & XEN_NETRXF_csum_blank)
skb->ip_summed = CHECKSUM_PARTIAL;
else if (rx->flags & XEN_NETRXF_data_validated)
--
1.7.10.4


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/