Re: [PATCH 6/6] tilegx network driver: initial support

From: Arnd Bergmann
Date: Tue Apr 10 2012 - 06:42:44 EST


On Monday 09 April 2012, Chris Metcalf wrote:
> On 4/9/2012 9:49 AM, Arnd Bergmann wrote:
> > On Friday 06 April 2012, Chris Metcalf wrote:
> >> This change adds support for the tilegx network driver based on the
> >> GXIO IORPC support in the tilegx software stack, using the on-chip
> >> mPIPE packet processing engine.
> >>
> >> Signed-off-by: Chris Metcalf <cmetcalf@xxxxxxxxxx>
> >> ---
> >> drivers/net/ethernet/tile/Kconfig | 1 +
> >> drivers/net/ethernet/tile/Makefile | 4 +-
> >> drivers/net/ethernet/tile/tilegx.c | 2045 ++++++++++++++++++++++++++++++++++++
> >> 3 files changed, 2048 insertions(+), 2 deletions(-)
> >> create mode 100644 drivers/net/ethernet/tile/tilegx.c
> > I think the directory name should be the company, not the architecture here, so make
> > it drivers/net/ethernet/tilera/tilegx.c instead.
>
> This path was picked back when Jeff Kirsher did the initial move into
> drivers/net/ethernet/ for the tilepro driver. I don't have too strong an
> opinion on this; at this point I'm mostly just concerned that it seems like
> potentially not worth the churn to move the files for 3.2, then again for
> 3.5. But if folks agree we should do it, it's fine with me.

Ah, I didn't realize that the directory already exists. It's probably better
not to move it then.

> The actual author would rather not publish his name (I just double-checked
> with him).

Hmm, it doesn't look all that bad actually, the comments I had are just for
small details.

> >> +/* The actual devices. */
> >> +static struct net_device *tile_net_devs[TILE_NET_DEVS];
> >> +
> >> +/* The device for a given channel. HACK: We use "32", not
> >> + * TILE_NET_CHANNELS, because it is fairly subtle that the 5 bit
> >> + * "idesc.channel" field never exceeds TILE_NET_CHANNELS.
> >> + */
> >> +static struct net_device *tile_net_devs_for_channel[32];
> > When you need to keep a list or array of device structures in a driver, you're
> > usually doing something very wrong. The convention is to just pass the pointer
> > around to where you need it.
>
> We need "tile_net_devs_for_channel" because we share a single hardware
> queue for all devices, and each packet's metadata contains a "channel"
> value which indicates the device.

Ok, but please remove tile_net_devs then.

I think a better abstraction for tile_net_devs_for_channel would be
some interface that lets you add private data to a channel so when
you get data from a channel, you can extract that pointer from the driver
using the channel.

Don't you already have a per-channel data structure?
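
To illustrate the kind of interface I mean, here is a minimal user-space
sketch. The names chan_info, chan_set_priv() and chan_get_priv() are made up
for illustration and are not part of the driver or of the gxio API; in the
driver, the private pointer would be the struct net_device.

```c
#include <assert.h>
#include <stddef.h>

#define CHAN_MAX 32	/* a 5-bit channel field has at most 32 values */

/* Hypothetical per-channel state; 'priv' would hold the net_device. */
struct chan_info {
	void *priv;
};

static struct chan_info chan_table[CHAN_MAX];

/* Attach driver-private data to a channel, e.g. when the device is opened. */
static void chan_set_priv(unsigned int channel, void *priv)
{
	assert(channel < CHAN_MAX);
	chan_table[channel].priv = priv;
}

/* Look up the private data for the channel found in a packet descriptor. */
static void *chan_get_priv(unsigned int channel)
{
	assert(channel < CHAN_MAX);
	return chan_table[channel].priv;
}
```

The receive path would then call chan_get_priv() with the channel from the
packet descriptor instead of indexing a global array of devices directly.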

>
> /*
> * The on-chip I/O hardware on tilegx is configured with VA=PA for the
> * kernel's PA range. The low-level APIs and field names use "va" and
> * "void *" nomenclature, to be consistent with the general notion
> * that the addresses in question are virtualizable, but in the kernel
> * context we are actually manipulating PA values. To allow readers
> * of the code to understand what's happening, we direct their
> * attention to this comment by using the following two no-op functions.
> */
> static inline unsigned long pa_to_tile_io_addr(phys_addr_t pa)
> {
> BUILD_BUG_ON(sizeof(phys_addr_t) != sizeof(unsigned long));
> return pa;
> }
> static inline phys_addr_t tile_io_addr_to_pa(unsigned long tile_io_addr)
> {
> return tile_io_addr;
> }
>
> Then the individual uses in the network driver are just things like
> "edesc_head.va = pa_to_tile_io_addr(__pa(va))" or "va =
> __va(tile_io_addr_to_pa((unsigned long)gxio_mpipe_idesc_get_va(idesc)))"
> which I think is a little clearer.

Yes, although I would probably add a typedef for tile_io_addr and pass
the virtual address in and out of these helper functions.

For added clarity, you could make the interface look like dma_map_single(),
which requires adding an empty unmap() function as well -- that would
make it obvious where that data is actually used. Why do you require
the reverse map anyway? Normally you only need to pass a bus address to
the device but don't need to translate that back into a virtual address
because you already had that in the beginning.
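
Something along these lines is what I have in mind. This is an untested
user-space sketch: tile_io_addr_t, tile_io_map_single() and
tile_io_unmap_single() are hypothetical names, and mock_pa() only stands in
for __pa() so that the example compiles outside the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type for addresses handed to the I/O hardware. */
typedef uintptr_t tile_io_addr_t;

/* Stand-in for __pa(); kernel code would call __pa() itself. */
static uintptr_t mock_pa(const void *va)
{
	return (uintptr_t)va;
}

/*
 * Map a buffer for the device. The hardware is set up with VA=PA,
 * so the "mapping" is only a type conversion of the physical address.
 */
static tile_io_addr_t tile_io_map_single(void *va, size_t len)
{
	(void)len;	/* unused: nothing needs to be set up per mapping */
	return (tile_io_addr_t)mock_pa(va);
}

/* Intentionally empty; it marks where the device is done with the buffer. */
static void tile_io_unmap_single(tile_io_addr_t addr, size_t len)
{
	(void)addr;
	(void)len;
}
```

The caller keeps its own virtual address around, so no reverse translation
is needed with this shape of interface.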

> >> +/* Allocate and push a buffer. */
> >> +static bool tile_net_provide_buffer(bool small)
> >> +{
> >> [...]
> >> +
> >> + /* Save a back-pointer to 'skb'. */
> >> + *(struct sk_buff **)(skb->data - sizeof(struct sk_buff **)) = skb;
> > This looks very wrong: why would you put the pointer to the skb into the
> > skb itself?
>
> Because we create skbuffs, and then feed the raw underlying buffer storage
> to our hardware, and later, we get back this raw pointer from hardware,
> from which we need to be able to extract the actual skbuff.

Hmm, this sounds very unusual, but I don't really have a better suggestion
here.
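
For anyone following along, my understanding of the scheme is roughly the
following. This is a user-space sketch only: struct mock_skb and the
mock_skb_*() helpers are invented stand-ins for the skb and its allocation,
not the real sk_buff API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct sk_buff. */
struct mock_skb {
	unsigned char *head;	/* start of the raw allocation */
	unsigned char *data;	/* pointer that is handed to the hardware */
};

/* Allocate a buffer with pointer-sized headroom and store the back-pointer. */
static struct mock_skb *mock_skb_alloc(size_t len)
{
	struct mock_skb *skb = malloc(sizeof(*skb));

	if (!skb)
		return NULL;
	skb->head = malloc(sizeof(skb) + len);
	if (!skb->head) {
		free(skb);
		return NULL;
	}
	skb->data = skb->head + sizeof(skb);
	/* Save a back-pointer to 'skb' in the headroom just before 'data'. */
	memcpy(skb->data - sizeof(skb), &skb, sizeof(skb));
	return skb;
}

/* Recover the owning object from the raw pointer the hardware returns. */
static struct mock_skb *mock_skb_from_data(unsigned char *data)
{
	struct mock_skb *skb;

	memcpy(&skb, data - sizeof(skb), sizeof(skb));
	return skb;
}

static void mock_skb_free(struct mock_skb *skb)
{
	free(skb->head);
	free(skb);
}
```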

> >> + /* Compute the "ip checksum". */
> >> + jsum = isum_hack + htons(s_len - eh_len) + htons(id);
> >> + jsum = __insn_v2sadu(jsum, 0);
> >> + jsum = __insn_v2sadu(jsum, 0);
> >> + jsum = (0xFFFF ^ jsum);
> >> + jh->check = jsum;
> >> +
> >> + /* Update the tcp "seq". */
> >> + uh->seq = htonl(seq);
> >> +
> >> + /* Update some flags. */
> >> + if (!final)
> >> + uh->fin = uh->psh = 0;
> >> +
> >> + /* Compute the tcp pseudo-header checksum. */
> >> + usum = tsum_hack + htons(s_len);
> >> + usum = __insn_v2sadu(usum, 0);
> >> + usum = __insn_v2sadu(usum, 0);
> >> + uh->check = usum;
> > Why do you open-code the ip checksum functions here? Normally the stack takes
> > care of this by calling the functions you already provide in
> > arch/tile/lib/checksum.c
>
> If there is a way to do TSO without this, we'd be happy to hear it, but
> it's not clear how it would be possible. We are only computing a PARTIAL
> checksum here, and letting the hardware compute the "full" checksum.

Sounds like you're looking for csum_partial() ;-)
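
To spell out what I mean by a partial checksum: the sum can be accumulated
in pieces and folded once at the end. The sketch below is a portable
user-space illustration of that idea, not the kernel implementation;
mock_csum_partial() and mock_csum_fold() are made-up names that only mimic
the roles of csum_partial() and csum_fold().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Add the 16-bit big-endian words of 'buf' to the running sum 'sum'.
 * The result is an unfolded 32-bit partial sum that can be passed back
 * in as 'sum' for the next piece (pieces must start at even offsets).
 */
static uint32_t mock_csum_partial(const uint8_t *buf, size_t len, uint32_t sum)
{
	uint64_t acc = sum;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		acc += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (i < len)			/* odd trailing byte, padded with zero */
		acc += (uint32_t)buf[i] << 8;
	while (acc >> 32)		/* wrap carries back into 32 bits */
		acc = (acc & 0xFFFFFFFFu) + (acc >> 32);
	return (uint32_t)acc;
}

/* Fold a 32-bit partial sum to 16 bits and take the ones' complement. */
static uint16_t mock_csum_fold(uint32_t sum)
{
	sum = (sum & 0xFFFF) + (sum >> 16);
	sum = (sum & 0xFFFF) + (sum >> 16);
	return (uint16_t)~sum;
}
```

Summing a header in one call or in two consecutive calls gives the same
folded result, which is the property the TSO path relies on when it only
fills in part of the sum and leaves the rest to the hardware.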

Arnd