[ 093/108] sctp: Dont charge for data in sndbuf again when transmitting packet

From: Ben Hutchings
Date: Sun Oct 07 2012 - 19:34:36 EST


3.2-stable review patch. If anyone has any objections, please let me know.

------------------

From: Thomas Graf <tgraf@xxxxxxx>

[ Upstream commit 4c3a5bdae293f75cdf729c6c00124e8489af2276 ]

SCTP charges wmem_alloc via sctp_set_owner_w() in sctp_sendmsg() and via
skb_set_owner_w() in sctp_packet_transmit(). If a sender runs out of
sndbuf it will sleep in sctp_wait_for_sndbuf() and expects to be waken up
by __sctp_write_space().

Buffer space charged via sctp_set_owner_w() is released in sctp_wfree()
which calls __sctp_write_space() directly.

Buffer space charged via skb_set_owner_w() is released via sock_wfree()
which calls sk->sk_write_space() _if_ SOCK_USE_WRITE_QUEUE is not set.
sctp_endpoint_init() sets SOCK_USE_WRITE_QUEUE on all sockets.

Therefore if sctp_packet_transmit() manages to queue up more than sndbuf
bytes, sctp_wait_for_sndbuf() will never be woken up again unless it is
interrupted by a signal.

This could be fixed by clearing the SOCK_USE_WRITE_QUEUE flag but ...

Charging for the data twice does not make sense in the first place, it
leads to overcharging sndbuf by a factor 2. Therefore this patch only
charges a single byte in wmem_alloc when transmitting an SCTP packet to
ensure that the socket stays alive until the packet has been released.

This means that control chunks are no longer accounted for in wmem_alloc
which I believe is not a problem as skb->truesize will typically lead
to overcharging anyway and thus compensates for any control overhead.

Signed-off-by: Thomas Graf <tgraf@xxxxxxx>
CC: Vlad Yasevich <vyasevic@xxxxxxxxxx>
CC: Neil Horman <nhorman@xxxxxxxxxxxxx>
CC: David Miller <davem@xxxxxxxxxxxxx>
Acked-by: Vlad Yasevich <vyasevich@xxxxxxxxx>
Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx>
Signed-off-by: Ben Hutchings <ben@xxxxxxxxxxxxxxx>
---
net/sctp/output.c | 21 ++++++++++++++++++++-
1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/net/sctp/output.c b/net/sctp/output.c
index 8fc4dcd..32ba8d0 100644
--- a/net/sctp/output.c
+++ b/net/sctp/output.c
@@ -334,6 +334,25 @@ finish:
return retval;
}

+static void sctp_packet_release_owner(struct sk_buff *skb)
+{
+ sk_free(skb->sk);
+}
+
+static void sctp_packet_set_owner_w(struct sk_buff *skb, struct sock *sk)
+{
+ skb_orphan(skb);
+ skb->sk = sk;
+ skb->destructor = sctp_packet_release_owner;
+
+ /*
+ * The data chunks have already been accounted for in sctp_sendmsg(),
+ * therefore only reserve a single byte to keep socket around until
+ * the packet has been transmitted.
+ */
+ atomic_inc(&sk->sk_wmem_alloc);
+}
+
/* All packets are sent to the network through this function from
* sctp_outq_tail().
*
@@ -375,7 +394,7 @@ int sctp_packet_transmit(struct sctp_packet *packet)
/* Set the owning socket so that we know where to get the
* destination IP address.
*/
- skb_set_owner_w(nskb, sk);
+ sctp_packet_set_owner_w(nskb, sk);

if (!sctp_transport_dst_check(tp)) {
sctp_transport_route(tp, NULL, sctp_sk(sk));


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/