Re: [PATCH] kernel/kthread.c: need spin_lock_irq() for 'worker' before main looping, since it can "WARN_ON(worker->task)".

From: Chen Gang
Date: Thu Jun 20 2013 - 03:38:54 EST


On 06/20/2013 03:02 PM, Thomas Gleixner wrote:
> On Thu, 20 Jun 2013, Chen Gang wrote:
>
>> > On 06/19/2013 11:52 PM, Tejun Heo wrote:
>>> > > On Wed, Jun 19, 2013 at 06:17:36PM +0800, Chen Gang wrote:
>>>>> > >> > Hmm... can 'worker->task' has chance to be not NULL before set 'current'
>>>>> > >> > to it ?
>>> > > Yes, if the caller screws up and try to attach more than one workers
>>> > > to the kthread_worker, which has some possibility of happening as
>>> > > kthread_worker allows both attaching and detaching a worker.
>>> > >
>> >
>> > If we detect the bugs, and still want to use WARN_ON() to report warning
>> > and continue running, we need be sure of keeping the related things no
>> > touch (at least not lead to worse).
>> >
>> > If we can not be sure of keeping the related things no touch:
>> > if it is a kernel bug, better use BUG_ON() instead of,
>> > if it is a user mode bug, better to return failure with error code and
>> > print related information.
> Wrong. BUG_ON() is only for cases where the kernel CANNOT continue at
> all. WARN_ON() prints the very same information, but allows to
> continue.
>

In fact, BUG_ON() and WARN_ON() have various implementations on different
architectures, and can also be configured by the user.

Some 'crazy users' (e.g. randconfig builds) can even make BUG_ON() and
WARN_ON() 'empty' (see include/asm-generic/bug.h).
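
To make the 'empty' case concrete, here is a minimal userspace sketch of a
configurable warning macro. It is only an illustration: SKETCH_CONFIG_BUG,
SKETCH_WARN_ON(), sketch_warn() and sketch_checked_div() are names I made up
for this mail, and the code is not a copy of what include/asm-generic/bug.h
contains.

```c
#include <stdio.h>

/* Hypothetical build switch standing in for a kernel config option. */
#ifndef SKETCH_CONFIG_BUG
#define SKETCH_CONFIG_BUG 1
#endif

static int sketch_warn_count;	/* number of warnings reported so far */

#if SKETCH_CONFIG_BUG
/* Report the failed check, then hand the condition back to the caller. */
static int sketch_warn(int cond, const char *expr, const char *file, int line)
{
	if (cond) {
		sketch_warn_count++;
		fprintf(stderr, "WARNING: %s at %s:%d\n", expr, file, line);
	}
	return cond;
}
#define SKETCH_WARN_ON(cond) sketch_warn(!!(cond), #cond, __FILE__, __LINE__)
#else
/* 'Empty' variant: nothing is reported, the condition is only passed through. */
#define SKETCH_WARN_ON(cond) (!!(cond))
#endif

/* The caller decides how to continue after the warning. */
static int sketch_checked_div(int a, int b, int *out)
{
	if (SKETCH_WARN_ON(b == 0))
		return -1;	/* leave *out untouched */
	*out = a / b;
	return 0;
}
```

With SKETCH_CONFIG_BUG set to 0 the report disappears completely, so in this
sketch the only protection left is whatever the caller does with the returned
condition.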

In my experience (mainly with servers), when a kernel bug is found, the
kernel stops and reports it, which makes coredump analysis (or a KDB trap)
much easier.


>> > BUG_ON() will stop current working flow and report kernel bug in details.
> There is no reason to crash the machine completely. The kernel can
> continue and the WARN_ON reports the bug with the same details.

If so (we still prefer to use WARN_ON()), we had better put it under lock
protection.

At least, when we still have to continue, we should try not to make things
worse.

That will help a lot with coredump analysis (or a KDB trap).
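
What I mean by 'lock protected' can be sketched in userspace as follows. This
is not the patch itself: struct sketch_worker, sketch_lock(), sketch_unlock()
and sketch_attach() are hypothetical stand-ins, a C11 atomic_flag plays the
role of the spinlock, and there is no userspace equivalent of disabling
interrupts. The sketch also assumes that a second attach should be refused
rather than allowed to overwrite the owner.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a worker: a lock plus the attached task. */
struct sketch_worker {
	atomic_flag lock;	/* toy spinlock */
	const void *task;	/* NULL while nothing is attached */
};

#define SKETCH_WORKER_INIT { ATOMIC_FLAG_INIT, NULL }

static void sketch_lock(struct sketch_worker *w)
{
	while (atomic_flag_test_and_set_explicit(&w->lock, memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

static void sketch_unlock(struct sketch_worker *w)
{
	atomic_flag_clear_explicit(&w->lock, memory_order_release);
}

/*
 * Check and assignment happen under the same lock, so nobody else can
 * change w->task between the two. Returns 0 on success, -1 if a task
 * is already attached (w->task is then left untouched).
 */
static int sketch_attach(struct sketch_worker *w, const void *task)
{
	int busy;

	sketch_lock(w);
	busy = (w->task != NULL);
	if (!busy)
		w->task = task;
	sketch_unlock(w);

	if (busy)
		fprintf(stderr, "WARNING: worker %p already has a task\n",
			(void *)w);
	return busy ? -1 : 0;
}
```

If the second attach is detected, the first owner is still recorded in the
worker, so whoever looks at the state afterwards sees what it was when the
bug was detected.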


In fact, for every real-world coredump, the analysers have to assume that
the system had already continued blindly for some time before it died.


Thanks.
--
Chen Gang

Asianux Corporation