Re: [PATCH v6 2/2] lib: checksum: Use aligned accesses for ip_fast_csum and csum_ipv6_magic tests

From: Guenter Roeck
Date: Mon Feb 12 2024 - 12:18:24 EST


On Mon, Feb 12, 2024 at 06:26:14AM +0000, Al Viro wrote:
> On Wed, Feb 07, 2024 at 04:22:51PM -0800, Charlie Jenkins wrote:
> > +	struct csum_ipv6_magic_data {
> > +		const struct in6_addr saddr;
> > +		const struct in6_addr daddr;
> > +		unsigned int len;
> > +		__wsum csum;
> > +		unsigned char proto;
> > +	} data, *data_ptr;
>
> Huh?
>
> > +	int num_tests = MAX_LEN / WORD_ALIGNMENT - sizeof(struct csum_ipv6_magic_data);
> > +
> > +	for (int i = 0; i < num_tests; i++) {
> > +		data_ptr = (struct csum_ipv6_magic_data *)(random_buf + (i * WORD_ALIGNMENT));
> > +
> > +		cpu_to_be32_array((__be32 *)&data.saddr, (const u32 *)&data_ptr->saddr,
> > +				  sizeof_field(struct csum_ipv6_magic_data, saddr) / 4);
> > +		cpu_to_be32_array((__be32 *)&data.daddr, (const u32 *)&data_ptr->daddr,
> > +				  sizeof_field(struct csum_ipv6_magic_data, daddr) / 4);
> > +		data.len = data_ptr->len;
> > +		data.csum = (__force __wsum)htonl((__force u32)data_ptr->csum);
>
> What are those cpu_to_be32() about? Checksum calculations *DO* *NOT* involve
> any endianness conversions. At any point.
>
> Replace those assignments with memcpy() and be done with that - that will take
> care of unaligned accesses.
>
> Result will have host-independent memory representation. The only place where you
> might want to play with byteswaps (only 16-bit ones) is if you initialized the
> array of expected results with u16 constants. That will have opposite memory
> representations on l-e and b-e, so you'll need to byteswap to compare with
> what you get from function. Alternatively, make it an array of bytes and
> do
> 	sum16 = csum_ipv6_magic(saddr, daddr, len, proto, csum);
> 	if (memcmp(sum16, expected_csum_ipv6_magic + i * 2, 2))
> 		complain
>
> That's it.

Almost. Turns out the csum parameter of csum_ipv6_magic() needs to be in
network byte order, and the length parameter needs to be in host byte order.
So instead of
	data.len = data_ptr->len;
	data.csum = (__force __wsum)htonl((__force u32)data_ptr->csum);
it needs to be something like
	data.len = ntohl(data_ptr->len);
	data.csum = data_ptr->csum;

Also, as you mentioned, either the returned checksum or the expected
checksum needs to be converted for the comparison because one is in
network byte order and the other in host byte order.

Address conversions are indeed not needed.
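
To illustrate the conventions above, here is a minimal userspace sketch. It
is not the kernel code: model_csum_ipv6_magic(), ref_csum_bytes() and
fold_and_invert() are hypothetical names, and the 64-bit accumulator is a
simplification I am assuming is acceptable for a model. The length is passed
in host byte order and converted once inside the function, the csum seed and
the addresses are used as they sit in memory, and the result is compared
bytewise against a reference computed from big-endian bytes.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>	/* htonl() */

/* Fold a wide one's-complement accumulator to 16 bits and invert it. */
static uint16_t fold_and_invert(uint64_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/*
 * Hypothetical model of csum_ipv6_magic(), for illustration only.
 * len is in host byte order; csum and the addresses are summed exactly
 * as they are laid out in memory, with no byteswaps.
 */
static uint16_t model_csum_ipv6_magic(const uint8_t saddr[16],
				      const uint8_t daddr[16],
				      uint32_t len, uint8_t proto,
				      uint32_t csum)
{
	uint64_t sum = csum;	/* seed used as-is */
	uint32_t word;
	int i;

	for (i = 0; i < 16; i += 4) {
		/* memcpy() copes with unaligned source buffers */
		memcpy(&word, saddr + i, 4);
		sum += word;
		memcpy(&word, daddr + i, 4);
		sum += word;
	}
	sum += htonl(len);	/* the only place the length is converted */
	sum += htonl(proto);
	return fold_and_invert(sum);
}

/*
 * Reference: build the pseudo-header as bytes (with the seed's memory
 * bytes appended), sum it as big-endian 16-bit words, and store the
 * result as two bytes in network order.
 */
static void ref_csum_bytes(const uint8_t saddr[16], const uint8_t daddr[16],
			   uint32_t len, uint8_t proto,
			   const uint8_t seed[4], uint8_t out[2])
{
	uint8_t hdr[44];
	uint64_t sum = 0;
	uint16_t res;
	int i;

	memcpy(hdr, saddr, 16);
	memcpy(hdr + 16, daddr, 16);
	hdr[32] = (uint8_t)(len >> 24);
	hdr[33] = (uint8_t)(len >> 16);
	hdr[34] = (uint8_t)(len >> 8);
	hdr[35] = (uint8_t)len;
	hdr[36] = hdr[37] = hdr[38] = 0;
	hdr[39] = proto;
	memcpy(hdr + 40, seed, 4);

	for (i = 0; i < 44; i += 2)
		sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];

	res = fold_and_invert(sum);
	out[0] = (uint8_t)(res >> 8);
	out[1] = (uint8_t)(res & 0xff);
}
```

With this model, memcmp(&result, expected_bytes, 2) should behave the same on
little-endian and big-endian hosts, which is the bytewise comparison
suggested above. If the length were byteswapped before the call, a
little-endian host would sum a different value and the comparison would fail.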

Thanks,
Guenter