Re: [RFC net-next] bonding: netlink error message support for options

From: Jonathan Toppins
Date: Tue May 17 2022 - 23:38:09 EST


On 5/17/22 19:54, Jakub Kicinski wrote:
> On Tue, 17 May 2022 15:44:19 -0700 Stephen Hemminger wrote:
>> On Tue, 17 May 2022 16:31:19 -0400
>> Jonathan Toppins <jtoppins@xxxxxxxxxx> wrote:
>>
>>> This is an RFC because the current NL_SET_ERR_MSG() macros do not support
>>> printf like semantics so I rolled my own buffer setting in __bond_opt_set().
>>> The issue is I could not quite figure out the life-cycle of the buffer, if
>>> rtnl lock is held until after the text buffer is copied into the packet
>>> then we are ok, otherwise, some other type of buffer management scheme will
>>> be needed as this could result in corrupted error messages when modifying
>>> multiple bonds.
>>
>> Might be better for others in long term if NL_SET_ERR_MSG() had printf like
>> semantics. Surely this isn't going to be first or last case.
>>
>> Then internally, it could print right to the netlink message.
>
> Dunno. I think pointing at the bad attr + exposing per-attr netlink
> parsing policy + a string for a human worked pretty well so far.
> IMHO printf() is just a knee jerk reaction, especially when converting
> from netdev_err().

For some subsystems it is not a conversion from netdev_err(), it is an AND: the code needs netdev_err() and the extack message. In this RFC there are instances where changing the message from netdev_err() to the macro was trivial:

@@ -240,12 +243,14 @@ static int bond_changelink(struct net_device *bond_dev, struct nlattr *tb[],
 		int arp_interval = nla_get_u32(data[IFLA_BOND_ARP_INTERVAL]);

 		if (arp_interval && miimon) {
-			netdev_err(bond->dev, "ARP monitoring cannot be used with MII monitoring\n");
+			NL_SET_ERR_MSG(extack,
+				       "ARP monitoring cannot be used with MII monitoring");
 			return -EINVAL;
 		}

These are trivial because the path does not have to care about sysfs or some other legacy configuration interface. These macros become rather annoying to use once a system needs to support multiple configuration paths and is trying to utilize as much common configuration code[0] as possible so that all interfaces largely operate the same way.
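
To make the annoyance concrete, here is a minimal userspace sketch of one validation routine shared by a netlink-style caller and a sysfs-style caller. Everything in it is a hypothetical stand-in: `struct mock_extack`, `mock_log` and `mock_bond_check_monitors()` are illustrative names and not kernel API. It models only the shape of the problem, which is that the shared code has to decide where the message goes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for an extended ack: it stores only a pointer. */
struct mock_extack {
	const char *msg;
};

/* Hypothetical stand-in for the kernel log that netdev_err() would write to. */
static char mock_log[128];

/*
 * One check shared by both configuration paths.
 * The netlink-style caller passes an extack, the sysfs-style caller passes NULL.
 */
static int mock_bond_check_monitors(int arp_interval, int miimon,
				    struct mock_extack *extack)
{
	/* Static storage, so the pointer stays valid after this function returns. */
	static const char msg[] =
		"ARP monitoring cannot be used with MII monitoring";

	if (arp_interval && miimon) {
		if (extack)
			extack->msg = msg;	/* netlink-style path */
		else
			snprintf(mock_log, sizeof(mock_log), "%s", msg); /* sysfs-style path */
		return -EINVAL;
	}
	return 0;
}
```

Calling `mock_bond_check_monitors(1000, 100, &extack)` leaves the message in `extack.msg`, while the same call with `NULL` leaves it in `mock_log`. Every shared check ends up carrying this branch.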


> Augmenting structured information is much, much better long term.
>
> To me the never ending stream of efforts to improve printk() is a
> proof that once we let people printf() at will, efforts to contain
> it will be futile.

At least for bonding I was trying to reuse as much as possible of the code that has to deal with both sysfs and netlink. And I don't think it is a good idea to split the code paths, so if I am supposed to use statically allocated strings to support netlink errors, that basically means anything that has to support multiple interfaces gets to sprinkle `if (extack)` everywhere[0]. Not great.

The ownership model of the error buffer seems odd to me with the current macros. I am supposed to set a pointer in a structure that subsystem X didn't allocate and whose lifetime it has no control over. Then netlink takes this pointer and does whatever with it. And somehow subsystem X is supposed to guarantee that whatever the pointer refers to lives forever, making a `static const char[]` buffer the only option. I don't understand why netlink doesn't provide the buffer and a subsystem just populates it. Whether that uses memcpy or snprintf doesn't matter; to me it's a lifetime issue that makes the API not great to work with when you have to handle cases other than netlink.
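
As a rough illustration of the "netlink provides the buffer" idea, the following userspace sketch has the ack structure own its message storage, so the caller formats into it and has no lifetime to guarantee. The names `struct mock_extack_buf`, `MOCK_EXTACK_MSG_LEN`, `mock_extack_printf()` and `mock_set_option()` are assumptions made up for this example. They are not existing kernel interfaces, and the buffer size is arbitrary.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MOCK_EXTACK_MSG_LEN 80	/* arbitrary size chosen for the sketch */

/* Hypothetical ack structure that owns the storage for its message. */
struct mock_extack_buf {
	char msg[MOCK_EXTACK_MSG_LEN];
};

/*
 * printf-like setter: formats into the buffer owned by the ack structure.
 * A NULL extack (for example a sysfs-style caller) is silently ignored.
 * Output longer than the buffer is truncated by vsnprintf().
 */
static void mock_extack_printf(struct mock_extack_buf *extack,
			       const char *fmt, ...)
{
	va_list ap;

	if (!extack)
		return;
	va_start(ap, fmt);
	vsnprintf(extack->msg, sizeof(extack->msg), fmt, ap);
	va_end(ap);
}

/* Hypothetical option setter shared by every configuration path. */
static int mock_set_option(const char *name, int val, int max,
			   struct mock_extack_buf *extack)
{
	if (val > max) {
		mock_extack_printf(extack,
				   "option %s: value %d exceeds maximum %d",
				   name, val, max);
		return -EINVAL;
	}
	return 0;
}
```

With this shape the caller needs no `if (extack)` and no static string: `mock_set_option("miimon", 500, 100, &extack)` fills `extack.msg`, and the same call with `NULL` just returns the error. The cost is a fixed-size buffer per ack and possible truncation of long messages.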

Also, as Joe Perches points out in this thread[1,2], the way the macros are written bloats the kernel, because the error messages get duplicated for subsystems that need to support multiple configuration interfaces.

-Jon

[0] https://lore.kernel.org/netdev/e6b78ce8f5904a5411a809cf4205d745f8af98cb.1628650079.git.jtoppins@xxxxxxxxxx/
[1] https://lore.kernel.org/netdev/cover.1628306392.git.jtoppins@xxxxxxxxxx/
[2] https://lore.kernel.org/netdev/c8b69905c995ab887633ef11862705ee66c60aad.camel@xxxxxxxxxxx/