Re: [PATCH net 1/2] r8152: fix the sw rx checksum is unavailable

From: Mark Lord
Date: Wed Nov 23 2016 - 14:29:53 EST


On 16-11-23 10:12 AM, Hayes Wang wrote:
Mark Lord [mlord@xxxxxxxxx]
[...]
What does this code do:

static void r8153_set_rx_early_size(struct r8152 *tp)
{
u32 mtu = tp->netdev->mtu;
u32 ocp_data = (agg_buf_sz - mtu - VLAN_ETH_HLEN - VLAN_HLEN) / 4;

ocp_write_word(tp, MCU_TYPE_USB, USB_RX_EARLY_SIZE, ocp_data);
}

This only works for RTL8153. However, what you use is RTL8152.
It is like delay completion. It is used to reduce the loading of CPU
by letting a transfer contain more data to reduce the number of
transfers.

How is ocp_data used by the hardware?
Shouldn't the calculation also include sizeof(rx_desc) in there somewhere?

The algorithm is from our hw engineers, and it should be

(agg_buf_sz - packet size) / 8

You could refer to commit a59e6d815226 ("r8152: correct the rx early size").

Thanks.

Right now I am working quite hard trying to narrow things down exactly.
You are correct that the driver does appear to be careful about accesses
beyond the filled portion of a URB buffer -- for some reason I thought
the original driver had issues there, but looking again it does not seem to.

One idea that is now looking more likely:
Things could be suffering from speculative CPU accesses to RAM
(the system here has non-coherent d-cache/RAM).
This could incorrectly pre-load data from adjacent URB buffers
into the d-cache, creating coherency issues. I am testing now
with cacheline-sized guard zones between the buffers to see if
that is the issue or not.

Worth repeating: other dongles we have tried, eg. those using the asix driver,
do not cause us any troubles here. Only the r8152 dongles do.

The other drivers do not use hardware checksums, so even if they did
incur similar bad packets, whatever the reason, those bad packets
would be detected/rejected by the Linux network stack (software checksums).
So everything appears to behave fine with them, as it does with
the r8152 driver when hardware checksums are disabled.

Still trying to understand exactly how these errors are happening.
It takes a very long time to do a conclusive test of anything here,
and I only have the hardware for a day or two a week.
So my apologies if I am slow in getting back to you on stuff.

Cheers