On Tue, Jun 17, 2025 at 05:04:55PM -0700, Tantilov, Emil S wrote:
On 5/16/2025 7:58 AM, Larysa Zaremba wrote:
From: Pavan Kumar Linga <pavan.kumar.linga@xxxxxxxxx>
Support to initialize and configure controlqs, and manage their
transactions was introduced in libie. As part of it, most of the existing
controlq structures are renamed and modified. Use those APIs in idpf and
make all the necessary changes.
Previously for the send and receive virtchnl messages, there used to be a
memcpy involved in controlq code to copy the buffer info passed by the send
function into the controlq specific buffers. There was no restriction to
use automatic memory in that case. The new implementation in libie removed
copying of the send buffer info and introduced DMA mapping of the send
buffer itself. To accommodate it, use dynamic memory for the larger send
buffers. For smaller ones (<= 128 bytes) libie still can copy them into the
pre-allocated message memory.
In case of receive, idpf receives a page pool buffer allocated by the libie
and care should be taken to release it after use in the idpf.
The changes are fairly trivial and localized, with a notable exception
being the consolidation of idpf_vc_xn_shutdown and idpf_deinit_dflt_mbx
under the latter name. This has some additional consequences that are
addressed in the following patches.
There is an issue with this approach that impacts the ability of the driver
to force a reset. See below ...
This refactoring introduces roughly additional 40KB of module storage used
for systems that only run idpf, so idpf + libie_cp + libie_pci takes about
7% more storage than just idpf before refactoring.
We now pre-allocate small TX buffers, so that does increase the memory
usage, but reduces the need to allocate. This results in additional 256 *
128B of memory permanently used, increasing the worst-case memory usage by
32KB but our ctlq RX buffers need to be of size 4096B anyway (not changed
by the patchset), so this is hardly noticeable.
As for the timings, the fact that we are mostly limited by the HW response
time which is far from instant, is not changed by this refactor.
Reviewed-by: Michal Kubiak <michal.kubiak@xxxxxxxxx>
Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@xxxxxxxxx>
Co-developed-by: Larysa Zaremba <larysa.zaremba@xxxxxxxxx>
Signed-off-by: Larysa Zaremba <larysa.zaremba@xxxxxxxxx>
---
drivers/net/ethernet/intel/idpf/Kconfig | 2 +-
drivers/net/ethernet/intel/idpf/Makefile | 2 -
drivers/net/ethernet/intel/idpf/idpf.h | 27 +-
.../net/ethernet/intel/idpf/idpf_controlq.c | 624 -------
.../net/ethernet/intel/idpf/idpf_controlq.h | 130 --
.../ethernet/intel/idpf/idpf_controlq_api.h | 177 --
.../ethernet/intel/idpf/idpf_controlq_setup.c | 171 --
drivers/net/ethernet/intel/idpf/idpf_dev.c | 54 +-
.../net/ethernet/intel/idpf/idpf_ethtool.c | 37 +-
drivers/net/ethernet/intel/idpf/idpf_lib.c | 44 +-
drivers/net/ethernet/intel/idpf/idpf_main.c | 4 -
drivers/net/ethernet/intel/idpf/idpf_mem.h | 20 -
drivers/net/ethernet/intel/idpf/idpf_txrx.h | 2 +-
drivers/net/ethernet/intel/idpf/idpf_vf_dev.c | 60 +-
.../net/ethernet/intel/idpf/idpf_virtchnl.c | 1617 ++++++-----------
.../net/ethernet/intel/idpf/idpf_virtchnl.h | 90 +-
.../ethernet/intel/idpf/idpf_virtchnl_ptp.c | 204 +--
17 files changed, 765 insertions(+), 2500 deletions(-)
delete mode 100644 drivers/net/ethernet/intel/idpf/idpf_controlq.c
delete mode 100644 drivers/net/ethernet/intel/idpf/idpf_controlq.h
delete mode 100644 drivers/net/ethernet/intel/idpf/idpf_controlq_api.h
delete mode 100644 drivers/net/ethernet/intel/idpf/idpf_controlq_setup.c
delete mode 100644 drivers/net/ethernet/intel/idpf/idpf_mem.h
<snip>
diff --git a/drivers/net/ethernet/intel/idpf/idpf_lib.c b/drivers/net/ethernet/intel/idpf/idpf_lib.c
index 68330b884967..500bff1091d9 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_lib.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_lib.c
@@ -1190,6 +1190,7 @@ void idpf_statistics_task(struct work_struct *work)
*/
void idpf_mbx_task(struct work_struct *work)
{
+ struct libie_ctlq_xn_recv_params xn_params = {};
struct idpf_adapter *adapter;
adapter = container_of(work, struct idpf_adapter, mbx_task.work);
@@ -1200,7 +1201,11 @@ void idpf_mbx_task(struct work_struct *work)
queue_delayed_work(adapter->mbx_wq, &adapter->mbx_task,
msecs_to_jiffies(300));
- idpf_recv_mb_msg(adapter, adapter->hw.arq);
+ xn_params.xnm = adapter->xn_init_params.xnm;
+ xn_params.ctlq = adapter->arq;
+ xn_params.ctlq_msg_handler = idpf_recv_event_msg;
+
+ libie_ctlq_xn_recv(&xn_params);
}
/**
@@ -1757,7 +1762,6 @@ static int idpf_init_hard_reset(struct idpf_adapter *adapter)
idpf_vc_core_deinit(adapter);
if (!is_reset)
Since one of the checks in idpf_is_reset_detected() is !adapter->arq, this
will never be possible through the event task. I think we may be able to
remove this check altogether, but as-is this patch introduces large delays
in the Tx hang recovery and depending on the cause may not recover at all.
reg_ops->trigger_reset(adapter, IDPF_HR_FUNC_RESET);
- idpf_deinit_dflt_mbx(adapter);
} else {
dev_err(dev, "Unhandled hard reset cause\n");
err = -EBADRQC;
@@ -1825,7 +1829,7 @@ void idpf_vc_event_task(struct work_struct *work)
return;
func_reset:
- idpf_vc_xn_shutdown(adapter->vcxn_mngr);
+ idpf_deinit_dflt_mbx(adapter);
This is not a straightforward swap, whereas previously we just discard
messages knowing that we cannot communicate with the CP in a reset, this
goes much further as it dismantles the MBX resources, and as a result the
check `if (!is_reset)` in idpf_init_hard_reset() will never be true.
Thanks for finding this!
Given the problem seems to only relate to idpf_vc_event_task() in case of
IDPF_HR_FUNC_RESET and the following call sequence:
idpf_deinit_dflt_mbx(adapter);
set_bit(IDPF_HR_RESET_IN_PROG, adapter->flags);
idpf_init_hard_reset(adapter);
netif_carrier_off();
netif_tx_disable();
is_reset = idpf_is_reset_detected(adapter);
idpf_set_vport_state(adapter);
idpf_vc_core_deinit(adapter);
idpf_deinit_dflt_mbx(adapter);
...
...
I think, it is safe to remove idpf_deinit_dflt_mbx() from idpf_vc_event_task(),
given no mailbox communication is attempted in between it and
idpf_vc_core_deinit(). Anything going on in parallel also should not suffer from
having mailbox available just a little bit longer.
What do you think?
<snip>
Thanks,
Emil