On Sat, 25 Jul 2020 12:38:00 +0200 Sebastian Gottschall wrote:
> i agree. i just can say that i tested this patch recently due to this
> discussion here. and it can be changed by sysfs. but it doesnt work for
> you may consider this
> https://www.mail-archive.com/linux-kernel@xxxxxxxxxxxxxxx/msg1142611.html
> years ago someone already wanted to bring this feature upstream, but it
> was denied. i already tested this patch the last 2 days and it worked so
> far (with some little modifications)
> so such a solution existed already and may be considered here

Thanks very much for your link.

I don't see an outstanding difference in principle from Paolo's work in
2016, except for the use of kthread_create() and friends, because kworker
made use of them even before 2016. This one is simpler, as shown by the
diff stat in his cover letter.

Paolo, feel free to correct me if I misread anything.

Finally, I don't see the need to add a sysfs attribute, given
CONFIG_THREADED_NAPI in this work.

BTW, let us know if anyone has plans to pick up the 2016 RFC.

Hillf
> Paolo Abeni (2):
>   net: implement threaded-able napi poll loop support
>   net: add sysfs attribute to control napi threaded mode
>
>  include/linux/netdevice.h |   4 ++
>  net/core/dev.c            | 113 ++++++++++++++++++++++++++++++++++++++++++++++
>  net/core/net-sysfs.c      | 102 +++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 219 insertions(+)
>
> Sebastian
On 25.07.2020 at 10:16, Hillf Danton wrote:
On Wed, 22 Jul 2020 09:12:42 +0000 David Laight wrote:
> On 21 July 2020 18:25 Andrew Lunn wrote:
>> On Tue, Jul 21, 2020 at 10:44:19PM +0530, Rakesh Pillai wrote:
>>> NAPI gets scheduled on the CPU core which got the
>>> interrupt. The linux scheduler cannot move it to a
>>> different core, even if the CPU on which NAPI is running
>>> is heavily loaded. This can lead to degraded wifi
>>> performance when running traffic at peak data rates.
>>> A thread on the other hand can be moved to different
>>> CPU cores, if the one on which it's running is heavily
>>> loaded. During high incoming data traffic, this gives
>>> better performance, since the thread can be moved to a
>>> less loaded or sometimes even a more powerful CPU core
>>> to account for the required CPU performance in order
>>> to process the incoming packets.
>>> This patch series adds the support to use a high priority
>>> thread to process the incoming packets, as opposed to
>>> everything being done in NAPI context.
>> I don't see why this problem is limited to the ath10k driver. I expect
>> it applies to all drivers using NAPI. So shouldn't you be solving this
>> in the NAPI core? Allow a driver to request the NAPI core uses a
>> thread?
>
> It's not just NAPI, the problem is with the softint processing.
> I suspect a lot of systems would work better if it ran as
> a (highish priority) kernel thread.

Hi folks,

Below is a minimum poc implementation I can imagine on top of workqueue
to make napi threaded. Thoughts are appreciated.
> I've had to remove the main locks from a multi-threaded application
> and replace them with atomic counters.
> Consider what happens when the threads remove items from a shared
> work list.
> The code looks like:
>     mutex_enter();
>     remove_item_from_list();
>     mutex_exit();
> The mutex is only held for a few instructions, so while you'd expect
> the cache line to be 'hot' you wouldn't get real contention.
> However the following scenarios happen:
> 1) An ethernet interrupt happens while the mutex is held.
>    This stops the other threads until all the softint processing
>    has finished.
> 2) An ethernet interrupt (and softint) runs on a thread that is
>    waiting for the mutex.
>    (Or on the cpu that the thread's processor affinity ties it to.)
>    In this case the 'fair' (ticket) mutex code won't let any other
>    thread acquire the mutex.
>    So again everything stops until the softints all complete.
> The second one is also a problem when trying to wake up all
> the threads (eg after adding a lot of items to the list).
> The ticket locks force them to wake in order, but
> sometimes the 'thundering herd' would work better.
> IIRC this is actually worse for processes running under the RT
> scheduler (without CONFIG_PREEMPT) because they are almost
> always scheduled on the same cpu they ran on last.
> If it is busy, but cannot be pre-empted, they are not moved
> to an idle cpu.
> To confound things there is a very broken workaround for broken
> hardware in the driver for the e1000 interface on (at least)
> Ivy Bridge cpus that can cause the driver to spin for a very
> long time (IIRC milliseconds) whenever it has to write to a
> MAC register (ie on every transmit setup).
>
> 	David
To make napi threaded, if either the irq or the softirq thread is entirely
ruled out, add napi::work that will be queued on a highpri workqueue. It is
actually an unbound one, so the scheduler can spread napi load onto idle
CPU cores. All users need to do to go threaded is
s/netif_napi_add/netif_threaded_napi_add/, nothing more.
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -338,6 +338,9 @@ struct napi_struct {
 	struct list_head	dev_list;
 	struct hlist_node	napi_hash_node;
 	unsigned int		napi_id;
+#ifdef CONFIG_THREADED_NAPI
+	struct work_struct	work;
+#endif
 };
 
 enum {
@@ -2234,6 +2237,19 @@ static inline void *netdev_priv(const st
 void netif_napi_add(struct net_device *dev, struct napi_struct *napi,
 		    int (*poll)(struct napi_struct *, int), int weight);
 
+#ifdef CONFIG_THREADED_NAPI
+void netif_threaded_napi_add(struct net_device *dev, struct napi_struct *napi,
+			     int (*poll)(struct napi_struct *, int), int weight);
+#else
+static inline void netif_threaded_napi_add(struct net_device *dev,
+					   struct napi_struct *napi,
+					   int (*poll)(struct napi_struct *, int),
+					   int weight)
+{
+	netif_napi_add(dev, napi, poll, weight);
+}
+#endif
+
 /**
  *	netif_tx_napi_add - initialize a NAPI context
  *	@dev:  network device
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6277,6 +6277,61 @@ static int process_backlog(struct napi_s
 	return work;
 }
 
+#ifdef CONFIG_THREADED_NAPI
+/* unbound highpri workqueue for threaded napi */
+static struct workqueue_struct *napi_workq;
+
+static void napi_workfn(struct work_struct *work)
+{
+	struct napi_struct *n = container_of(work, struct napi_struct, work);
+
+	for (;;) {
+		if (!test_bit(NAPI_STATE_SCHED, &n->state))
+			return;
+
+		if (n->poll(n, n->weight) < n->weight)
+			return;
+
+		if (need_resched()) {
+			/*
+			 * have to pay for the latency of task switch even if
+			 * napi is scheduled
+			 */
+			if (test_bit(NAPI_STATE_SCHED, &n->state))
+				queue_work(napi_workq, work);
+			return;
+		}
+	}
+}
+
+void netif_threaded_napi_add(struct net_device *dev,
+			     struct napi_struct *napi,
+			     int (*poll)(struct napi_struct *, int),
+			     int weight)
+{
+	netif_napi_add(dev, napi, poll, weight);
+	INIT_WORK(&napi->work, napi_workfn);
+}
+
+static inline bool is_threaded_napi(struct napi_struct *n)
+{
+	return n->work.func == napi_workfn;
+}
+
+static inline void threaded_napi_sched(struct napi_struct *n)
+{
+	if (is_threaded_napi(n))
+		queue_work(napi_workq, &n->work);
+	else
+		____napi_schedule(this_cpu_ptr(&softnet_data), n);
+}
+#else
+static inline void threaded_napi_sched(struct napi_struct *n)
+{
+	____napi_schedule(this_cpu_ptr(&softnet_data), n);
+}
+#endif
+
 /**
  * __napi_schedule - schedule for receive
  * @n: entry to schedule
@@ -6289,7 +6344,7 @@ void __napi_schedule(struct napi_struct
 	unsigned long flags;
 
 	local_irq_save(flags);
-	____napi_schedule(this_cpu_ptr(&softnet_data), n);
+	threaded_napi_sched(n);
 	local_irq_restore(flags);
 }
 EXPORT_SYMBOL(__napi_schedule);
@@ -6335,7 +6390,7 @@ EXPORT_SYMBOL(napi_schedule_prep);
  */
 void __napi_schedule_irqoff(struct napi_struct *n)
 {
-	____napi_schedule(this_cpu_ptr(&softnet_data), n);
+	threaded_napi_sched(n);
 }
 EXPORT_SYMBOL(__napi_schedule_irqoff);
 
@@ -10685,6 +10740,10 @@ static int __init net_dev_init(void)
 		sd->backlog.weight = weight_p;
 	}
 
+#ifdef CONFIG_THREADED_NAPI
+	napi_workq = alloc_workqueue("napi_workq", WQ_UNBOUND | WQ_HIGHPRI,
+				     WQ_UNBOUND_MAX_ACTIVE);
+#endif
 	dev_boot_phase = 0;
 
 	/* The loopback device is special if any other network devices
_______________________________________________
ath10k mailing list
ath10k@xxxxxxxxxxxxxxxxxxx
http://lists.infradead.org/mailman/listinfo/ath10k