Re: [PATCH] net: nixge: Add __packed attribute to DMA descriptor struct

From: Florian Fainelli
Date: Tue Jun 19 2018 - 13:41:52 EST


On 06/19/2018 10:31 AM, Moritz Fischer wrote:
> Hi Florian,
>
> On Tue, Jun 19, 2018 at 10:13 AM, Florian Fainelli <f.fainelli@xxxxxxxxx> wrote:
>> On 06/19/2018 09:54 AM, Moritz Fischer wrote:
>>> Add __packed attribute to DMA descriptor structure in order to
>>> make sure that the DMA engine's alignemnt requirements are met.
>>>
>>> Fixes commit 492caffa8a1a ("net: ethernet: nixge: Add support for
>>> National Instruments XGE netdev")
>>> Signed-off-by: Moritz Fischer <mdf@xxxxxxxxxx>
>>> ---
>>>
>>> Hi David,
>>>
>>> this addresses an issue where padding occured breaking the alignment
>>> in the array the descriptors are allocated in coherent memory.
>>> This was discovered when we tried to bring up the driver via a PCIe
>>> bridge on x86.
>>
>> How could padding be inserted given than all of the structure members
>> are naturally aligned (all u32 type). Compiler bug?
>
> I have no good answer to this, all I can tell you is that it wouldn't work
> otherwise. This was part of a bunch of changes that I made in order
> to make this work with 64bit DMA. I made sure to remove the padding/
> reserved fields accordingly such that the net difference would be zero.
>
> I might've messed that up? The descriptors looked something like this:

Ah ah! This is not the layout in the upstream tree I am looking at, but
your layout below will definitive introduce padding, yes.

>
> struct nixge_hw_dma_bd {
> #ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
> u64 next;
> u64 phys;
> #else
> u32 next;
> u32 reserved1;
> u32 phys;
> u32 reserved2;
> #endif
> u32 reserved3;
> u32 reserved4;
> u32 cntrl;
> u32 status;
> u32 app0;
> u32 app1;
> u32 app2;
> u32 app3;
> u32 app4;
> #ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
> u64 sw_id_offset;
> #else
> u32 sw_id_offset;
> u32 reserved5;
> #endif
> u32 reserved6;
> } __packed;
>
>
> I'll have some follow up patches to add 64bit support together with a
> wrapper driver for the PCIe bridge once the architecture solidifies here.

Why not have the structure updated like this:

struct nixge_hw_dma_bd {
u32 next_lo; /* lower 32-bit address part */
u32 next_hi; /* upper 32-bit address part */
u32 phys_lo;
u32 phys_hi;
u32 reserved3;
u32 reserved4;
u32 cntrl;
u32 status;
u32 app0;
u32 app1;
u32 app2;
u32 app3;
u32 app4;
u32 sw_id_offset_low;
u32 sw_id_offset_hi;
u32 reserved6;
};

That assumes I got the upper/lower address part correct, if not, swapthe
members. Then in your code, if you want to be efficient, you can
populate the fields only if CONFIG_ARCH_DMA_ADDR_T_64BIT is defined. I
did that in bcmgenet.c for instance because the register accesses are
slow, so if we can save 200ns per packet transmit/receive, that's a win.

This should not change anything because the structure size is the same
in both cases, but how you are managing is different, and that would in
turn influence what the HW sees.

Does that make sense?

>
> Thanks for the feedback,
>
> Moritz
>


--
Florian