Re: sendfile() broken with 2.6.26 + Apache 2 ?

From: Holger Hoffstaette
Date: Wed Jul 16 2008 - 11:00:22 EST


On Wed, 16 Jul 2008 11:08:04 +0200, Eric Dumazet wrote:

> Ian Jeffray wrote:
>> Hi Eric,
>>
>> Thanks for directing me to a better list.
>>
>> Further responses below:
>>
>> Eric Dumazet wrote:
>>> CC to netdev where this report might find better answers
>>>
>>> Ian Jeffray wrote:
>>>> All,
>>>>
>>>> I moved from kernel 2.6.25.4 to 2.6.26 yesterday and observed that
>>>> large files sent via Apache2 are partially corrupt.
>>>>
>>>> This appears to be linked to sendfile() -- disabling the use of
>>>> sendfile in the apache config (EnableSendfile Off) allows it to
>>>> function as normal.
>>>>
>>>> My system is a simple Core2Duo running Debian lenny/sid; nothing
>>>> special, and I have never observed problems like this before.
>>>>
>>>> The problem seems clearly related to sendfile(), since the data reads
>>>> correctly from disc in other programs, and via CIFS etc.
>>>>
>>>> The corruption happens part-way in to the file... I've no exact figure
>>>> but it would seem like maybe 32KB -- I'm seeing broken PNGs served
>>>> from Apache, where the top few dozen lines decode correctly, and the
>>>> rest is garbage.
>>>>
>>>> I've made basically no configuration changes between 2.6.25.4 and
>>>> 2.6.26 and have explicitly tried both enabling and disabling the new
>>>> PAT support to no effect.
>>>>
>>>> This is completely repeatable and reproducible.
>>>>
>>>> Is anyone else seeing this broken behaviour?
>>>>
>>>>
>>>
>>> What kind of network adapter are you using? (lspci | grep -i ether)
>>
>> 02:00.0 Ethernet controller: Attansic Technology Corp. L1 Gigabit
>> Ethernet Adapter (rev b0)
>>
>>
>>> If you disable TCP segmentation offload on this NIC (ethtool -K eth0
>>> tso off), is this problem still present?
>>
>> Wow. That 'solves' the problem! Great.
>>
>> Does this therefore point to an attansic driver issue?
>
> Yes, maybe related to commit 9d90fb1ac9d97da86e24d9ea947bf2a2f333829a
>
> In this patch, Jay Cliburn enabled TSO by default for atl1 driver.
>
> This might be a driver problem, or a generic sendfile() problem, I don't
> know...

Maybe related to http://lkml.org/lkml/2007/12/6/229 ?
I switched to a different server with an e1000 NIC in the meantime, so I
cannot test whether this is still a problem in 2.6.26, but it appears to
be. For me the combination e1000/e1000e + sendfile works reliably.

Holger

