Re: [PATCH] xfs: fix possible NULL dereference

From: Geyslan GregÃrio Bem
Date: Tue Oct 22 2013 - 06:12:57 EST


2013/10/21 Dave Chinner <david@xxxxxxxxxxxxx>:
> On Mon, Oct 21, 2013 at 07:00:59PM -0500, Eric Sandeen wrote:
>> On 10/21/13 6:56 PM, Dave Chinner wrote:
>> > On Mon, Oct 21, 2013 at 06:18:49PM -0500, Ben Myers wrote:
>> >> Hey,
>> >>
>> >> On Mon, Oct 21, 2013 at 06:12:18PM -0500, Eric Sandeen wrote:
>> >>> On 10/21/13 5:44 PM, Dave Chinner wrote:
>> >>>> On Mon, Oct 21, 2013 at 03:58:23PM -0500, Eric Sandeen wrote:
>> >>>>> On 10/21/13 1:32 PM, Geyslan G. Bem wrote:
>> >>>>>> This patch puts a 'break' in the true branch, avoiding the 'icptr->ic_next'
>> >>>>>> dereferencing.
>> >>>>>
>> >>>>> Reviewed-by: Eric Sandeen <sandeen@xxxxxxxxxx>
>> >>>>
>> >>>> Actually, NACK.
>> >>>
>> >>> I felt that one coming ;)
>> >>>
>> >>>>> Hm, yeah - cmn_err(CE_PANIC, " " ); used to BUG_ON, but the newer
>> >>>>> xfs_emerg() doesn't.
>> >>>>>
>> >>>>> Dave, was that intentional?
>> >>>>
>> >>>> Of course it was. ;) xfs_emerg() is only called from the debug code
>> >>>> in xlog_verify_iclog(), xlog_verify_tail_lsn and assfail().
>> >>>>
>> >>>> In the case of assfail(), it has it's own BUG() call, so it does
>> >>>> everything just fine.
>> >>>>
>> >>>> In the case of xlog_verify_iclog() when icptr is NULL, it will
>> >>>> panic immediately after the message is printed, just like the old
>> >>>> code. i.e. this patch isn't fixing anything we need fixed.
>> >>>
>> >>> A BUG() is probably warranted, then.
>> >>
>> >> I tend to agree with Eric on this point. If we want to crash, I'd rather our
>> >> intent be very clear, rather than just see a null ptr deref. ;)
>> >
>> > Sure. ASSERT() would be better and more consistent with the rest of
>> > the code. i.e:
>> >
>> > for (i = 0; i < log->l_iclog_bufs; i++, icptr = icptr->ic_next)
>> > ASSERT(icptr);
>> >
>> > <Devil's Advocate>
>> >
>> > However, I keep coming back to the fact that what we are checking is
>> > that the list is correctly circular and that and adding an
>> > ASSERT(icptr) to panic if a pointer chase finds a null pointer is
>> > kinda redundant, especially as:
>> >
>> > - there's already 2 comments for the function indicating
>> > that it is checking the validity of the pointers and that
>> > they are circular....
>> > - we have repeatedly, over many years, justified the removal
>> > of ASSERT(ptr) from code like:
>> >
>> > ASSERT(ptr);
>> > foo = ptr->foo;
>> >
>> > as it is redundant because production code will always
>> > panic the machine in that situation via the dereference.
>> > ASSERT() is for documenting assumptions and constraints
>> > that are not obvious from the code context.
>> >
>> > IOWs, in this case the presence or absence of the ASSERT inside
>> > *debug-only code* doesn't add any addition value to debugging such
>> > problems, nor does it add any value in terms of documentation
>> > because it's clear from the comments in the debug code that it
>> > should not be NULL to begin with.
>> >
>> > </Devil's Advocate>
>>
>> I guess what's left as unclear is why we would prefer to panic
>> vs. handling the error, even if it's in debug code. The caller can
>> handle errors, so blowing up here sure doesn't look intentional.
>
> If the iclog list is corrupt and not circular, then we will simply
> panic the next time it is iterated. Lots of log code iterates the
> iclog list, such as log IO completion, switching iclogs, log forces,
> etc.
>
>> Maybe the answer is it's debug code
>> and we want to drop to the debugger or generate a vmcore at that
>> point, but that's just been demonstrated as quite unclear to the
>> casual reader. :)
>
> Yes, but to continue the Devil's Advocate argument, the purpose of
> debug code isn't to enlightent the casual reader or drive-by
> patchers - it's to make life easier for people who actually spend
> time debugging the code. And the people who need the debug code
> are expected to understand why an ASSERT is not necessary. :)
>
Dave, Eric and Ben,

This was catched by coverity (CID 102348).

Well, I got it now that was intentional and changed it in Coverity Triage.

But I must assume that it is not the standard debugging, then I
suggest you put at least a comment explaining why it does dereference
after NULL check.

Cheers.

> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/