Re: [PATCH] Net device error logging, revised

From: Jim Keniston
Date: Wed Sep 17 2003 - 13:00:42 EST

Next message: john stultz: "[PATCH] linux-2.6.0-test5_hangcheck-compile-fix_A0"
Previous message: Grover, Andrew: "[PATCH] ACPI maintainer change"
In reply to: Feldman, Scott: "RE: [PATCH] Net device error logging, revised"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

"Feldman, Scott" wrote:
>
> > 3. A new macro, netdev_fatal, is included. Given the call
> > netdev_fatal(dev, HW, "NIC fried!\n");
> > the indicated message is always logged: the msglevel arg (HW, in this
> > case) is NOT consulted. In fact, the msglevel arg to netdev_fatal
> > is ignored in this implementation. (As previously discussed, in some
> > future implementation, the msglevel could be logged to help indicate
> > the circumstances under which the event was logged.)
>
> I couldn't find the previous discussion on netdev_fatal, so sorry if
> this has already been worked out. It uses KERN_ERR; did you mean
> something stronger? If not, why not just use netdev_err(dev, ALL,
> "...")? What is the situation in the driver where we'd want to use
> _fatal? How do I know when to use _fatal and when to use _err?
>
> -scott

Good question. There was a discussion thread where somebody pared down
the cc list (no LKML, no netdev, an apparently no Scott). I saved most
of the messages, and can send them along to you if you want.

Anyway, in support of passing the NETIF_MSG_* msglevel as an arg to the
netdev_* macros, I said that it...
> also opens the door for logging more clearly what part of the driver (PROBE,
> TX, RX, etc.) the message comes from.

David Brownell replied:
> This is a different issue. Why wouldn't it be enough
> to have unique messages?

I replied:
> First, I don't want acceptance of the netdev_* idea to hinge on this
> minor issue. Everything below is blue(r) sky...
>
> That said, it could be useful for an error-analysis utility to know that
> the message is (say) a WARNING categorized as an RX_ERR, especially if the
> utility hasn't been configured to recognized that specific message string.
> That's probably info that wouldn't come to you via printk (e.g., since
> syslog discards the severity level), but could come via another logging
> or monitoring system that plugs in (along with printk) via netdev_*.
>
> One problem with this idea is that, as I read Becker's document
> (URL above), "fatal" errors should always be logged, and therefore
> not tagged as PROBE, TX_ERR, etc. If we wanted to pursue this idea,
> we could implement
> netdev_fatal(dev, PROBE, "...")
> which logs the message unconditionally (KERN_ERR) and categorizes it as
> a PROBE error.

Jeff Garzik replied:
> printk'ing probe errors is definitely something I want to encourage.
> Probe errors are the most visible indication of what went wrong, if the
> driver fails to load. These are the messages that help developers
> figure out the problem. So I like the direction you suggest, in "If we
> wanted to pursue[...]"

In summary, the idea for netdev_fatal() is based on the idea that the msglevel
arg can not only be used for verbosity control, but can be logged as yet
another useful piece of info. For printk-based logging, it could easily be
added to the message prefix:
eth2: PROBE: Invalid Ethernet address
If you log this message using
netdev_err(dev, PROBE, "Invalid Ethernet address\n")
it will be logged only if the PROBE flag is set in the msg_enable bitmap.
If you log it using
netdev_fatal(dev, PROBE, "Invalid Ethernet address\n")
it will always be logged.

As mentioned above, I don't think the netdev_fatal idea is central to the
whole netdev_printk idea.

By the way, regarding using KERN_ERR vs. something stronger: It's pretty
unusual for a driver to log anything as KERN_CRIT or worse -- presumably
due to a desire not to overreact if the nth of n redundant devices happens
to fail. So unless we can agree on a definition of KERN_CRIT (e.g., it
applies to any failure that makes a device unavailable pending human
intervention), I'd prefer to steer clear of that.

Jim
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Next message: john stultz: "[PATCH] linux-2.6.0-test5_hangcheck-compile-fix_A0"
Previous message: Grover, Andrew: "[PATCH] ACPI maintainer change"
In reply to: Feldman, Scott: "RE: [PATCH] Net device error logging, revised"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]