Re: [PATCH v2] scsi: 3w-9xxx: Fix endianness issues found by sparse

From: Arnd Bergmann
Date: Wed Aug 05 2020 - 03:18:03 EST


On Wed, Aug 5, 2020 at 3:44 AM Samuel Holland <samuel@xxxxxxxxxxxx> wrote:
> On 8/3/20 9:02 AM, Arnd Bergmann wrote:
> > On Mon, Aug 3, 2020 at 5:42 AM Samuel Holland <samuel@xxxxxxxxxxxx> wrote:
> >> All of the command structures are packed, due to the "#pragma pack(1)" earlier
> >> in the file. So alignment is not an issue. This dma_addr_t member _is_ the
> >> explicit padding to make sizeof(TW_Command) -
> >> sizeof(TW_Command.byte8_offset.{io,param}.sgl) equal TW_COMMAND_SIZE * 4. And
> >> indeed the structure is expected to be a different size depending on
> >> sizeof(dma_addr_t).
> >
> > Ah, so only the first few members are accessed by hardware and the
> > last union is only accessed by the OS then? In that case I suppose it is
> > all fine, but I would also suggest removing the "#pragma packed"
> > to get somewhat more efficient access on systems that have problems
> > with misaligned accesses.
>
> I don't know what part the hardware accesses; everything I know about the
> hardware comes from reading the driver.

I see now from your explanation below that this is a hardware-defined
structure. I was confused by how it can be either 32 or 64 bits wide but
found the

tw_initconnect->features |= sizeof(dma_addr_t) > 4 ? 1 : 0;

line now that tells the hardware about which format is used.

> The problem with removing the "#pragma pack(1)" is that the structure is
> inherently misaligned: byte8_offset.io.sgl starts at offset 12, but it may begin
> with a __le64.

I think a fairly clean way to handle this would be to remove the pragma
and instead define a local type like

#if IS_ENABLED(CONFIG_ARCH_DMA_ADDR_T_64BIT)
typedef __le64 twa_address_t __packed;
#else
typedef __le32 twa_addr_t;
#endif

The problem with marking the entire structure as packed, rather than
just individual members is that you end up with very inefficient bytewise
access on some architectures (especially those without cache-coherent
DMA or hardware unaligned access in the CPU), so I would recommend
avoiding that in portable driver code.

Arnd