RE: [EXT] Re: [v2,net-next, 1/2] enetc: Configure the Time-Aware Scheduler via tc-taprio offload

From: Po Liu
Date: Tue Nov 12 2019 - 22:45:25 EST


Hi Ivan,

> -----Original Message-----
> From: Ivan Khoronzhuk <ivan.khoronzhuk@xxxxxxxxxx>
> Sent: November 13, 2019 5:10
> To: Po Liu <po.liu@xxxxxxx>
> Cc: Claudiu Manoil <claudiu.manoil@xxxxxxx>; davem@xxxxxxxxxxxxx; linux-
> kernel@xxxxxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx; vinicius.gomes@xxxxxxxxx;
> Vladimir Oltean <vladimir.oltean@xxxxxxx>; Alexandru Marginean
> <alexandru.marginean@xxxxxxx>; Xiaoliang Yang
> <xiaoliang.yang_1@xxxxxxx>; Roy Zang <roy.zang@xxxxxxx>; Mingkai Hu
> <mingkai.hu@xxxxxxx>; Jerry Huang <jerry.huang@xxxxxxx>; Leo Li
> <leoyang.li@xxxxxxx>
> Subject: [EXT] Re: [v2,net-next, 1/2] enetc: Configure the Time-Aware Scheduler
> via tc-taprio offload
>
> Hello,
>
> On Tue, Nov 12, 2019 at 08:42:49AM +0000, Po Liu wrote:
> >ENETC supports in hardware for time-based egress shaping according to
> >IEEE 802.1Qbv. This patch implement the Qbv enablement by the hardware
> >offload method qdisc tc-taprio method.
> >Also update cbdr writeback to up level since control bd ring may
> >writeback data to control bd ring.
> >
> >Signed-off-by: Po Liu <Po.Liu@xxxxxxx>
> >Signed-off-by: Vladimir Oltean <vladimir.oltean@xxxxxxx>
> >Signed-off-by: Claudiu Manoil <claudiu.manoil@xxxxxxx>
> >---
> >changes:
> >- introduce a local define CONFIG_FSL_ENETC_QOS to fix the various
> > configurations will result in link errors.
> > Since the CONFIG_NET_SCH_TAPRIO depends on many Qos configs. Not
> > to use it directly in driver. Add it to CONFIG_FSL_ENETC_QOS depends
> > on list, so only CONFIG_NET_SCH_TAPRIO enabled, user can enable this
> > tsn feature, or else, return not support.
> >
> > drivers/net/ethernet/freescale/enetc/Kconfig | 10 ++
> > drivers/net/ethernet/freescale/enetc/Makefile | 1 +
> > drivers/net/ethernet/freescale/enetc/enetc.c | 19 ++-
> > drivers/net/ethernet/freescale/enetc/enetc.h | 7 +
> > .../net/ethernet/freescale/enetc/enetc_cbdr.c | 5 +-
> > .../net/ethernet/freescale/enetc/enetc_hw.h | 150 ++++++++++++++++--
> > .../net/ethernet/freescale/enetc/enetc_qos.c | 130 +++++++++++++++
> > 7 files changed, 300 insertions(+), 22 deletions(-) create mode 100644
> > drivers/net/ethernet/freescale/enetc/enetc_qos.c
> >
>
> [...]
>
> >
> >@@ -1483,6 +1479,19 @@ int enetc_setup_tc(struct net_device *ndev, enum
> tc_setup_type type,
> > return 0;
> > }
> >
> >+int enetc_setup_tc(struct net_device *ndev, enum tc_setup_type type,
> >+ void *type_data)
> >+{
> >+ switch (type) {
> >+ case TC_SETUP_QDISC_MQPRIO:
> >+ return enetc_setup_tc_mqprio(ndev, type_data);
> Sorry didn't see v2, so i duplicate my question here:
>
> This patch is for taprio offload, I see that mqprio is related and is part of taprio
> offload configuration. But taprio offload has own mqprio settings.
> The taprio mqprio part is not offloaded with TC_SETUP_QDISC_MQPRIO.
>
> So, a combination of mqprio and tario qdiscs used.
> Could you please share the commands were used for your setup?
>

Example command:
tc qdisc replace dev eth0 parent root handle 100 taprio num_tc 8 map 0 1 2 3 4 5 6 7 queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 base-time 01 sched-entry S 02 300000 flags 0x2

> And couple interesting questions about all of this:
> - The taprio qdisc has to have mqprio settings, but if it's done with mqprio then
> it just skipped (by reading tc class num).
> - If no separate mqprio qdisc configuration then mqprio conf from taprio is set,
> who should restore tc mappings when taprio qdisc is unloaded?
> Maybe there is reason to implement TC_SETUP_QDISC_MQPRIO offload in
> taprio since it's required feature?

Offloading mqprio with TC_SETUP_QDISC_MQPRIO would be good; even just passing num_tc along would remove some of the hack. This has been discussed with Vinicius Gomes as a fix for a future patch.
I know the problem is that the mqprio settings are applied after the offload function is called, but the offload function may need some of them to program the hardware.

> Would be better to move changes for mqprio in separate patch with
> explanation.
>
> >+ case TC_SETUP_QDISC_TAPRIO:
> >+ return enetc_setup_tc_taprio(ndev, type_data);
> >+ default:
> >+ return -EOPNOTSUPP;
> >+ }
> >+}
> >+
>
> [...]
>
> >diff --git a/drivers/net/ethernet/freescale/enetc/enetc_qos.c
> >b/drivers/net/ethernet/freescale/enetc/enetc_qos.c
> >new file mode 100644
> >index 000000000000..036bb39c7a0b
> >--- /dev/null
> >+++ b/drivers/net/ethernet/freescale/enetc/enetc_qos.c
>
> [...]
>
> >+static int enetc_setup_taprio(struct net_device *ndev,
> >+ struct tc_taprio_qopt_offload *admin_conf)
> >+{
> >+ struct enetc_ndev_priv *priv = netdev_priv(ndev);
> >+ struct enetc_cbd cbd = {.cmd = 0};
> >+ struct tgs_gcl_conf *gcl_config;
> >+ struct tgs_gcl_data *gcl_data;
> >+ struct gce *gce;
> >+ dma_addr_t dma;
> >+ u16 data_size;
> >+ u16 gcl_len;
> >+ u32 temp;
> >+ int i;
> >+
> >+ gcl_len = admin_conf->num_entries;
> >+ if (gcl_len > enetc_get_max_gcl_len(&priv->si->hw))
> >+ return -EINVAL;
> >+
> >+ if (admin_conf->enable) {
> >+ enetc_wr(&priv->si->hw,
> >+ ENETC_QBV_PTGCR_OFFSET,
> >+ temp & (~ENETC_QBV_TGE));
> >+ usleep_range(10, 20);
> >+ enetc_wr(&priv->si->hw,
> >+ ENETC_QBV_PTGCR_OFFSET,
> >+ temp | ENETC_QBV_TGE);
> >+ } else {
> >+ enetc_wr(&priv->si->hw,
> >+ ENETC_QBV_PTGCR_OFFSET,
> >+ temp & (~ENETC_QBV_TGE));
> >+ return 0;
> >+ }
>
> Better do the upper qbv enable/disable procedure closer to enetc_send_cmd()
> or at least after kzalloc that can fail, no need to restore then.

Once the 'if' is dropped as suggested by Simon Horman, the enable part can move closer to enetc_send_cmd().
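
A rough sketch of that ordering, written as a user-space mock rather than driver code: every step that can fail runs first, and the gate-enable bit is only touched right before the command would be sent. The names fake_ptgcr, FAKE_TGE and setup_gate_list() are made up for illustration, and the bit position used for FAKE_TGE is an assumption.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define FAKE_TGE (1u << 31)	/* stand-in for ENETC_QBV_TGE, bit position assumed */

/* Stand-in for the PTGCR register; a driver would use enetc_rd()/enetc_wr(). */
static uint32_t fake_ptgcr;

/* Pass a non-zero force_alloc_failure to simulate kzalloc() failing. */
static int setup_gate_list(size_t data_size, int force_alloc_failure)
{
	void *gcl_data;

	/* 1. Do everything that can fail before touching the hardware. */
	gcl_data = force_alloc_failure ? NULL : calloc(1, data_size);
	if (!gcl_data)
		return -ENOMEM;	/* register untouched, nothing to restore */

	/* 2. Only now enable the gate, right before the command is sent. */
	fake_ptgcr |= FAKE_TGE;

	/* 3. The command would be sent here. */

	free(gcl_data);
	return 0;
}
```

With this order, a failed allocation leaves the register exactly as it was, so no restore path is needed for that case.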

>
> >+
> >+ /* Configure the (administrative) gate control list using the
> >+ * control BD descriptor.
> >+ */
> >+ gcl_config = &cbd.gcl_conf;
> >+
> >+ data_size = sizeof(struct tgs_gcl_data) + gcl_len *
> >+ sizeof(struct gce);
> >+
> >+ gcl_data = kzalloc(data_size, __GFP_DMA | GFP_KERNEL);
> >+ if (!gcl_data)
> >+ return -ENOMEM;
> >+
> >+ gce = (struct gce *)(gcl_data + 1);
> >+
> >+ /* Since no initial state config in taprio, set gates open as default.
> >+ */
> tc-taprio and IEEE Qbv allows to change configuration in flight, so that oper
> state is active till new admin start time. So, here comment says it does initial
> state config, if in-flight feature is not supported then error has to be returned
> instead of silently rewriting configuration. But if it can be implemented then
> state should be remembered/verified in order to not brake oper configuration?

I think this is OK per the standard. Also see this comment in
net/sched/sch_taprio.c:

	/* Until the schedule starts, all the queues are open */

I will change the comment.

> >+ gcl_config->atc = 0xff;
> >+ gcl_config->acl_len = cpu_to_le16(gcl_len);
>
> Ok, this is maximum number of schedules.
> According to tc-taprio it's possible to set cycle period more then schedules
> actually can consume. If cycle time is more, then last gate's state can be kept
> till the end of cycle. But if last schedule has it's own interval set then gates
> should be closed till the end of cycle or no? if it has to be closed, then one more
> endl schedule should be present closing gates at the end of list for the rest cycle
> time. Can be implemented in h/w but just to be sure, how it's done in h/w?
>
The list length is already checked in the code above:

	if (admin_conf->num_entries > enetc_get_max_gcl_len(&priv->si->hw))
		return -EINVAL;
	gcl_len = admin_conf->num_entries;

> >+
> >+ if (!admin_conf->base_time) {
> >+ gcl_data->btl =
> >+ cpu_to_le32(enetc_rd(&priv->si->hw, ENETC_SICTR0));
> >+ gcl_data->bth =
> >+ cpu_to_le32(enetc_rd(&priv->si->hw, ENETC_SICTR1));
> >+ } else {
> >+ gcl_data->btl =
> >+ cpu_to_le32(lower_32_bits(admin_conf->base_time));
> >+ gcl_data->bth =
> >+ cpu_to_le32(upper_32_bits(admin_conf->base_time));
> >+ }
> >+
> >+ gcl_data->ct = cpu_to_le32(admin_conf->cycle_time);
> >+ gcl_data->cte = cpu_to_le32(admin_conf->cycle_time_extension);
>
> Not sure it's good idea to write values w/o verification, The cycle time and time
> extension is 64 values, so it's supposed them to be more then 4,3 seconds, it's
> probably not a case, but better return error if it's more.
>

I can add a check for the cycle time and the extension, since the types do not match.
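
Such a check could look roughly like the sketch below, assuming both values arrive as 64-bit integers and the hardware fields are 32 bits wide. The helper names check_fits_u32() and check_cycle_params() are hypothetical and not part of the posted patch.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical helper: reject a value that would be truncated when
 * stored in a 32-bit hardware field.
 */
static int check_fits_u32(uint64_t value)
{
	return value > UINT32_MAX ? -EINVAL : 0;
}

/* Validate both taprio timing values before they are written out. */
static int check_cycle_params(uint64_t cycle_time,
			      uint64_t cycle_time_extension)
{
	int err;

	err = check_fits_u32(cycle_time);
	if (err)
		return err;

	return check_fits_u32(cycle_time_extension);
}
```

Returning an error here keeps an oversized cycle time from being silently cut down to its lower 32 bits.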

> >+
> >+ for (i = 0; i < gcl_len; i++) {
> >+ struct tc_taprio_sched_entry *temp_entry;
> >+ struct gce *temp_gce = gce + i;
> >+
> >+ temp_entry = &admin_conf->entries[i];
> >+
> >+ temp_gce->gate = cpu_to_le32(temp_entry->gate_mask);
>
> So, gate_mask can have up to 32 traffic classes? :-|.

gate_mask is defined as u32. Simon has pointed out an endianness issue here; I will change it in the next version of the patch.

>
> >+ temp_gce->period = cpu_to_le32(temp_entry->interval);
>
> So, the interval can be up to 4.3 seconds for one schedule?
> That is, one cycle can be one schedule.
> great.
>
> >+ }
>
> There is no schedule cmd set, so only SetGateStates is supported?
> But anyway it's Ok.
>
> >+
> >+ cbd.length = cpu_to_le16(data_size);
> >+ cbd.status_flags = 0;
> >+
> >+ dma = dma_map_single(&priv->si->pdev->dev, gcl_data,
> >+ data_size, DMA_TO_DEVICE);
> >+ if (dma_mapping_error(&priv->si->pdev->dev, dma)) {
> >+ netdev_err(priv->si->ndev, "DMA mapping failed!\n");
> >+ kfree(gcl_data);
> >+ return -ENOMEM;
> >+ }
> >+
> >+ cbd.addr[0] = lower_32_bits(dma);
> >+ cbd.addr[1] = upper_32_bits(dma);
> >+ cbd.cls = BDCR_CMD_PORT_GCL;
> >+
> >+ /* Updated by ENETC on completion of the configuration
> >+ * command. A zero value indicates success.
> >+ */
> >+ cbd.status_flags = 0;
>
> It's updated on completion by setting 0 on success, then why it's here set to 0
> but not 1 and not verified afterwards?
>

This byte is written back by the hardware after enetc_send_cmd(). The hardware requires status_flags in the cbd to be 0 before the descriptor is sent.
So a value larger than 0 cannot be used before sending to the hardware.
But you remind me that the return value of enetc_send_cmd() needs to be checked.
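
A minimal sketch of propagating that return value, again as a user-space mock: fake_send_cmd() is a hypothetical stand-in for enetc_send_cmd(), and the assumption is that the real function returns 0 on success or a negative errno. Cleanup runs on both paths and the error is handed back to the caller instead of being dropped.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for enetc_send_cmd(): 0 or a negative errno. */
static int fake_send_cmd(int simulate_failure)
{
	return simulate_failure ? -EBUSY : 0;
}

static int buffers_in_use;	/* lets the sketch show that nothing leaks */

static int send_gate_list(size_t data_size, int simulate_failure)
{
	void *gcl_data;
	int err;

	gcl_data = calloc(1, data_size);
	if (!gcl_data)
		return -ENOMEM;
	buffers_in_use++;

	err = fake_send_cmd(simulate_failure);

	/* Release the buffer on success and on failure alike. */
	free(gcl_data);
	buffers_in_use--;

	/* Report the command result rather than an unconditional 0. */
	return err;
}
```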

> >+
> >+ enetc_send_cmd(priv->si, &cbd);
> >+
> >+ dma_unmap_single(&priv->si->pdev->dev, dma, data_size,
> DMA_TO_DEVICE);
> >+ kfree(gcl_data);
> >+
> >+ return 0;
> >+}
> >+
> >+int enetc_setup_tc_taprio(struct net_device *ndev, void *type_data) {
> >+ struct tc_taprio_qopt_offload *taprio = type_data;
> >+ struct enetc_ndev_priv *priv = netdev_priv(ndev);
> >+ int i;
> >+
> >+ for (i = 0; i < priv->num_tx_rings; i++)
> >+ enetc_set_bdr_prio(&priv->si->hw,
> >+ priv->tx_ring[i]->index,
> >+ taprio->enable ? i : 0);
>
> then why enable/disable at the beginning for whole qbv scheduler, maybe
> better do it together? Or better say, what if setup_taprio failed, who restore
> configuration?

You remind me that the return value of enetc_send_cmd() needs to be checked.

> >+
> >+ return enetc_setup_taprio(ndev, taprio); }
> >--
> >2.17.1
> >
>
> --
> Regards,
> Ivan Khoronzhuk