Re: [PATCH] fix PHY polling system blocking

From: Stefani Seibold
Date: Sun Mar 21 2010 - 17:54:32 EST


I had now analyzed the PHY handling in most of the network drivers. Most
of the PHY communication will be handled in a polling/blocking way,
write a command word and then wait for the results. Due the nature of
the PHY attachment, this will take some time.

Some of the network drivers do this polling/blocking also in atomic code
paths, like interrupts or timer. So activities on the PHY can cause huge
latency jitters.

On the other side, most of the network driver handle the PHY without
using or only partially using the phylib.

The phylib has also a drawback, because it polls the PHY despite if it
has interrupt support for it or not. I can't see a reason for this
behavior.

So the problem of huge latencies by polling the PHY occurs in most of
the network drivers. For example have a look at the e100 network driver
in the file drivers/net/e100.c, function mdio_ctrl_hw(): This function
will poll for max. of 4000 us or 4 ms.

To fix this latency jitter problem with the PHY polling there are the
following steps to do:

- disable polling in driver/net/phy.c if an interrupt for the PHY is
available
- create an own single or per cpu workqueue for the phylib, so that the
PHY specific code can temporary schedule or block
- prevent all current user of the phylib to access the PHY in a atomic
code path
- modify all current users of the phylib from using cpu_relax() to
cond_resched() and replace the counters against inquiring a timeout
- modify all other network drivers to use the phylib

What do you think?

Stefani


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/