Re: [PATCH 10/10] fault injection: inject faults in new/rare callchains

From: Akinobu Mita
Date: Mon Aug 08 2016 - 12:24:32 EST


2016-08-08 23:07 GMT+09:00 Vegard Nossum <vegard.nossum@xxxxxxxxxx>:
> Hi,
>
> On 08/08/2016 03:54 PM, Akinobu Mita wrote:
>>
>> 2016-08-04 0:05 GMT+09:00 Vegard Nossum <vegard.nossum@xxxxxxxxxx>:
>>>
>>> Before this patch, fault injection uses a combination of randomness and
>>> frequency to determine where to inject faults. The problem with this is
>>> that code paths which are executed very rarely get a proportionally
>>> small number of faults injected.
>>>
>>> A better heuristic is to look at the actual callchain leading up to the
>>> possible failure point; if we see a callchain that we've never seen up
>>> until this point, chances are it's a rare one and we should definitely
>>> inject a fault here (since we might not get the chance again later).
>>>
>>> This uses a probabilistic set structure (similar to a bloom filter) to
>>> determine whether we have seen a particular callchain before by hashing
>>> the stack trace and atomically testing/setting a bit corresponding to
>>> the current callchain.
>
> [...]
>
>>> +config FAULT_INJECTION_AT_NEW_CALLSITES
>>> + bool "Inject fault the first time at a new callsite"
>>
>>
>> Isn't it better to make a run time configurable option instead of the
>> build option?
>
>
> I prefer a build option personally since it keeps the code simple (you
> don't have to dynamically allocate the bitmap of known callchains, for
> example). I figured most people using fault injection would enable the
> new option while still allowing others to keep the current behaviour
> if they really want to.
>
> If you prefer a run-time option I can submit a new version.
>

I prefer a run-time tunable like
"/sys/kernel/debug/fail*/inject-at-new-callsites" so that this feature
can be turned on or off for each fault injection type. I don't think
this adds much complexity if the bitmap can be put into
struct fault_attr.
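
To make the suggestion concrete, here is a rough user-space sketch of a
per-attribute bitmap with a run-time switch. It is not the code from the
patch: struct fault_attr_sketch, CALLCHAIN_BITS, hash_callchain() and
callchain_is_new() are all hypothetical names, the hash is an FNV-1a-style
mix picked only for illustration, and C11 atomics stand in for the
kernel's atomic bit operations.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CALLCHAIN_BITS 4096u	/* hypothetical size of the "seen" set */
#define WORD_BITS 64u

static_assert(CALLCHAIN_BITS % WORD_BITS == 0, "bitmap must fill whole words");

/* Hypothetical stand-in for the per-type state kept in struct fault_attr. */
struct fault_attr_sketch {
	atomic_bool inject_at_new_callsites;	/* the run-time tunable */
	_Atomic uint64_t seen[CALLCHAIN_BITS / WORD_BITS];
};

/* FNV-1a-style mix over the return addresses (illustrative choice only). */
static uint64_t hash_callchain(const uintptr_t *pcs, size_t n)
{
	uint64_t h = 0xcbf29ce484222325ULL;

	for (size_t i = 0; i < n; i++) {
		h ^= (uint64_t)pcs[i];
		h *= 0x100000001b3ULL;
	}
	return h;
}

/*
 * Returns true if the feature is enabled and the bit for this callchain
 * was not set before. The bit is tested and set in one atomic step.
 * Hash collisions can make a new callchain look already seen, as with
 * any probabilistic set.
 */
static bool callchain_is_new(struct fault_attr_sketch *attr,
			     const uintptr_t *pcs, size_t n)
{
	if (!atomic_load(&attr->inject_at_new_callsites))
		return false;

	uint64_t bit = hash_callchain(pcs, n) % CALLCHAIN_BITS;
	uint64_t mask = (uint64_t)1 << (bit % WORD_BITS);
	uint64_t old = atomic_fetch_or(&attr->seen[bit / WORD_BITS], mask);

	return (old & mask) == 0;
}
```

With the bitmap embedded in the structure there is nothing to allocate
dynamically, and each fault injection type keeps its own set of seen
callchains and its own on/off switch.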