Re: [PATCH] net/tcp: introduce TRACE_EVENT for TCP/IPv4 state transition

From: Alexei Starovoitov
Date: Thu Nov 09 2017 - 01:43:52 EST


On Thu, Nov 09, 2017 at 02:01:38PM +0800, Yafang Shao wrote:
> With this newly introduced TRACE_EVENT, it will be very easy to monitor
> TCP/IPv4 state transitions.
>
> A new TRACE_SYSTEM named tcp is added, in which we can trace other TCP
> events as well.
>
> Two helpers are added,
> static inline void __tcp_set_state(struct sock *sk, int state)
> static inline void __sk_state_store(struct sock *sk, int newstate)
>
> When performing a TCP/IPv4 state transition, we should use these two
> helpers or tcp_set_state() instead of assigning a value to sk_state directly.
>
> Signed-off-by: Yafang Shao <laoar.shao@xxxxxxxxx>

When you submit a patch, please make it clear which tree the patch is targeting.
In this case it should have been net-next,
but the patch clearly conflicts with it.
Make sure to rebase.

> +/*
> + * To trace TCP state transition.
> + */
> +static inline void __tcp_set_state(struct sock *sk, int state)
> +{
> + trace_tcp_set_state(sk, sk->sk_state, state);
> + sk->sk_state = state;
> +}
> +
> +static inline void __sk_state_store(struct sock *sk, int newstate)
> +{
> + trace_tcp_set_state(sk, sk->sk_state, newstate);
> + sk_state_store(sk, newstate);
> +}
> +
> void tcp_done(struct sock *sk);
>
> int tcp_abort(struct sock *sk, int err);
> diff --git a/include/trace/events/tcp.h b/include/trace/events/tcp.h
> new file mode 100644
> index 0000000..abf65af
> --- /dev/null
> +++ b/include/trace/events/tcp.h
> @@ -0,0 +1,58 @@
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM tcp
> +
> +#if !defined(_TRACE_TCP_H) || defined(TRACE_HEADER_MULTI_READ)
> +#define _TRACE_TCP_H
> +
> +#include <linux/tracepoint.h>
> +#include <net/sock.h>
> +#include <net/inet_timewait_sock.h>
> +#include <net/request_sock.h>
> +#include <net/inet_sock.h>
> +#include <net/tcp_states.h>
> +
> +TRACE_EVENT(tcp_set_state,
> + TP_PROTO(struct sock *sk, int oldstate, int newstate),
> + TP_ARGS(sk, oldstate, newstate),
> +
> + TP_STRUCT__entry(
> + __field(__be32, dst)
> + __field(__be32, src)
> + __field(__u16, dport)
> + __field(__u16, sport)
> + __field(int, oldstate)
> + __field(int, newstate)
> + ),
> +
> + TP_fast_assign(
> + if (oldstate == TCP_TIME_WAIT) {
> + __entry->dst = inet_twsk(sk)->tw_daddr;
> + __entry->src = inet_twsk(sk)->tw_rcv_saddr;
> + __entry->dport = ntohs(inet_twsk(sk)->tw_dport);
> + __entry->sport = ntohs(inet_twsk(sk)->tw_sport);
> + } else if (oldstate == TCP_NEW_SYN_RECV) {
> + __entry->dst = inet_rsk(inet_reqsk(sk))->ir_rmt_addr;
> + __entry->src = inet_rsk(inet_reqsk(sk))->ir_loc_addr;
> + __entry->dport =
> + ntohs(inet_rsk(inet_reqsk(sk))->ir_rmt_port);
> + __entry->sport = inet_rsk(inet_reqsk(sk))->ir_num;
> + } else {
> + __entry->dst = inet_sk(sk)->inet_daddr;
> + __entry->src = inet_sk(sk)->inet_rcv_saddr;
> + __entry->dport = ntohs(inet_sk(sk)->inet_dport);
> + __entry->sport = ntohs(inet_sk(sk)->inet_sport);
> + }
> +
> + __entry->oldstate = oldstate;
> + __entry->newstate = newstate;
> + ),
> +
> + TP_printk("%08X:%04X %08X:%04X, %02x %02x",
> + __entry->src, __entry->sport, __entry->dst, __entry->dport,
> + __entry->oldstate, __entry->newstate)

Printing the state directly with %x is not allowed.
This has to use show_tcp_state_name(), as is done in trace_tcp_set_state.
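
To illustrate the idea of printing a symbolic name instead of a raw
number, here is a minimal userspace sketch. It is not the kernel helper:
the sketch_* names are hypothetical, the numeric values are assumed to
mirror include/net/tcp_states.h, and returning "UNKNOWN" for an unmatched
value is a choice made for this sketch only. In the tracepoint itself the
mapping would be done inside TP_printk by the existing helper.

```c
#include <stddef.h>

/* Pairs a numeric TCP state with its printable name. */
struct sketch_state_name {
	int value;
	const char *name;
};

/*
 * Assumption: these values mirror the kernel's include/net/tcp_states.h.
 * The table and the function below are illustrative, not kernel code.
 */
static const struct sketch_state_name sketch_tcp_state_names[] = {
	{  1, "TCP_ESTABLISHED" },
	{  2, "TCP_SYN_SENT" },
	{  3, "TCP_SYN_RECV" },
	{  4, "TCP_FIN_WAIT1" },
	{  5, "TCP_FIN_WAIT2" },
	{  6, "TCP_TIME_WAIT" },
	{  7, "TCP_CLOSE" },
	{  8, "TCP_CLOSE_WAIT" },
	{  9, "TCP_LAST_ACK" },
	{ 10, "TCP_LISTEN" },
	{ 11, "TCP_CLOSING" },
	{ 12, "TCP_NEW_SYN_RECV" },
};

/* Linear lookup; returns a fixed placeholder when no entry matches. */
static const char *sketch_tcp_state_name(int state)
{
	size_t i;
	size_t n = sizeof(sketch_tcp_state_names) /
		   sizeof(sketch_tcp_state_names[0]);

	for (i = 0; i < n; i++) {
		if (sketch_tcp_state_names[i].value == state)
			return sketch_tcp_state_names[i].name;
	}
	return "UNKNOWN";
}
```

With such a mapping, the output format would use %s for oldstate and
newstate rather than %02x, so the trace reads as state names.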

Also, I don't see the reason to introduce another tracepoint
that looks just like trace_tcp_set_state.