Re: [PATCH v3 bpf-next 2/3] bpf: introduce helper bpf_get_branch_snapshot

From: Song Liu
Date: Tue Aug 31 2021 - 12:41:25 EST




> On Aug 31, 2021, at 8:32 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Mon, Aug 30, 2021 at 02:41:05PM -0700, Song Liu wrote:
>
>> @@ -564,6 +565,18 @@ static void notrace inc_misses_counter(struct bpf_prog *prog)
>> u64 notrace __bpf_prog_enter(struct bpf_prog *prog)
>> __acquires(RCU)
>> {
> preempt_disable_notrace();
>
>> +#ifdef CONFIG_PERF_EVENTS
>> + /* Calling migrate_disable costs two entries in the LBR. To save
>> + * some entries, we call perf_snapshot_branch_stack before
>> + * migrate_disable to save some entries. This is OK because we
>> + * care about the branch trace before entering the BPF program.
>> + * If migrate happens exactly here, there isn't much we can do to
>> + * preserve the data.
>> + */
>> + if (prog->call_get_branch)
>> + static_call(perf_snapshot_branch_stack)(
>> + this_cpu_ptr(&bpf_perf_branch_snapshot));
>
> Here the comment is accurate, but if you recall the calling context
> requirements of perf_snapshot_branch_stack from the last patch, you'll
> see it requires you have at the very least preemption disabled, which
> you just violated.

>
> I think you'll find that (on x86 at least) the suggested
> preempt_disable_notrace() incurs no additional branches.
>
> Still, there is the next point to consider...
>
>> +#endif
>> rcu_read_lock();
>> migrate_disable();
>
> preempt_enable_notrace();

Do we want preempt_enable_notrace() after migrate_disable()? It feels a
little weird to me.

>
>> if (unlikely(__this_cpu_inc_return(*(prog->active)) != 1)) {
>
>> @@ -1863,9 +1892,23 @@ void bpf_put_raw_tracepoint(struct bpf_raw_event_map *btp)
>> preempt_enable();
>> }
>>
>> +DEFINE_PER_CPU(struct perf_branch_snapshot, bpf_perf_branch_snapshot);
>> +
>> static __always_inline
>> void __bpf_trace_run(struct bpf_prog *prog, u64 *args)
>> {
>> +#ifdef CONFIG_PERF_EVENTS
>> + /* Calling migrate_disable costs two entries in the LBR. To save
>> + * some entries, we call perf_snapshot_branch_stack before
>> + * migrate_disable to save some entries. This is OK because we
>> + * care about the branch trace before entering the BPF program.
>> + * If migrate happens exactly here, there isn't much we can do to
>> + * preserve the data.
>> + */
>> + if (prog->call_get_branch)
>> + static_call(perf_snapshot_branch_stack)(
>> + this_cpu_ptr(&bpf_perf_branch_snapshot));
>> +#endif
>> cant_sleep();
>
> In the face of ^^^^^^ the comment makes no sense. Still, what are the
> nesting rules for __bpf_trace_run() and __bpf_prog_enter() ? I'm
> thinking the trace one can nest inside an occurence of prog, at which
> point you have pieces.

I think broken LBR records is something we cannot really avoid in case
of nesting. OTOH, these should be rare cases and will not hurt the results
in most the use cases.

I should probably tighten the rules in verifier to only apply it for
__bpf_prog_enter (where we have the primary use case). We can enable it
for other program types when there are other use cases.

Thanks,
Song