Re: [PATCH v7 1/4] spinlock: A new lockref structure for locklessupdate of refcount

From: Sedat Dilek
Date: Sun Sep 01 2013 - 11:45:38 EST


On Sun, Sep 1, 2013 at 5:32 PM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Sun, Sep 1, 2013 at 3:01 AM, Sedat Dilek <sedat.dilek@xxxxxxxxx> wrote:
>>
>> Looks like this is now 10x faster: ~2.66Mloops (debug) VS.
>> ~26.60Mloops (no-debug).
>
> Ok, that's getting to be in the right ballpark.
>
> But your profile is still odd.
>
>> Samples: 159K of event 'cycles:pp', Event count (approx.): 76968896763
>> 12,79% t_lockref_from- [kernel.kallsyms] [k] irq_return
>> 4,36% t_lockref_from- [kernel.kallsyms] [k] __ticket_spin_lock
>
> If you do the profile with "-g", what are the top callers of this? You
> shouldn't see any spinlock load from the path lookup, but you have all
> these other things going on..
>

$ sudo ~/src/linux-kernel/linux/tools/perf/perf record -g -e cycles:pp
./scripts/t_lockref_from-linus
Total loops: 26205085
[ perf record: Woken up 77 times to write data ]
[ perf record: Captured and wrote 19.778 MB perf.data (~864092 samples) ]


$ sudo ~/src/linux-kernel/linux/tools/perf/perf report <--- I used
here with -f option, the last one, dropped here.

Samples: 160K of event 'cycles:pp', Event count (approx.): 77003901089
+ 12,46% t_lockref_from- [kernel.kallsyms] [k] irq_return
+ 4,86% t_lockref_from- [kernel.kallsyms] [k] lockref_get_or_lock
+ 4,42% t_lockref_from- [kernel.kallsyms] [k] __ticket_spin_lock
+ 4,28% t_lockref_from- [kernel.kallsyms] [k] __acct_update_integrals
+ 3,97% t_lockref_from- [kernel.kallsyms] [k] user_exit
+ 3,04% t_lockref_from- [kernel.kallsyms] [k] local_clock
+ 2,71% t_lockref_from- [kernel.kallsyms] [k] kmem_cache_alloc
+ 2,50% t_lockref_from- [kernel.kallsyms] [k] link_path_walk
+ 2,46% t_lockref_from- libc-2.15.so [.] __xstat64
+ 2,38% t_lockref_from- [kernel.kallsyms] [k] kmem_cache_free
+ 1,96% t_lockref_from- [kernel.kallsyms] [k] path_lookupat
+ 1,88% t_lockref_from- [kernel.kallsyms] [k] __d_lookup_rcu
+ 1,87% t_lockref_from- [kernel.kallsyms] [k] tracesys
+ 1,84% t_lockref_from- [kernel.kallsyms] [k]
rcu_eqs_exit_common.isra.43
+ 1,81% t_lockref_from- [kernel.kallsyms] [k]
rcu_eqs_enter_common.isra.45
+ 1,80% t_lockref_from- [kernel.kallsyms] [k] user_enter
+ 1,79% t_lockref_from- [kernel.kallsyms] [k] sched_clock_cpu
+ 1,61% t_lockref_from- [kernel.kallsyms] [k] native_read_tsc
+ 1,56% t_lockref_from- [kernel.kallsyms] [k] cp_new_stat
+ 1,52% t_lockref_from- [kernel.kallsyms] [k] lockref_put_or_lock
+ 1,51% t_lockref_from- [kernel.kallsyms] [k] account_system_time
+ 1,46% t_lockref_from- [kernel.kallsyms] [k] path_init
+ 1,46% t_lockref_from- [kernel.kallsyms] [k] copy_user_generic_unrolled
+ 1,42% t_lockref_from- [kernel.kallsyms] [k] syscall_trace_enter
+ 1,38% t_lockref_from- [kernel.kallsyms] [k] jiffies_to_timeval
+ 1,32% t_lockref_from- [kernel.kallsyms] [k] lookup_fast
+ 1,31% t_lockref_from- [kernel.kallsyms] [k] native_sched_clock
+ 1,24% t_lockref_from- [kernel.kallsyms] [k] getname_flags
+ 1,17% t_lockref_from- [kernel.kallsyms] [k] vfs_getattr
+ 1,15% t_lockref_from- [kernel.kallsyms] [k] get_vtime_delta
+ 1,03% t_lockref_from- [kernel.kallsyms] [k] syscall_trace_leave
+ 0,95% t_lockref_from- [kernel.kallsyms] [k] generic_fillattr
+ 0,94% t_lockref_from- [kernel.kallsyms] [k] user_path_at_empty
+ 0,93% t_lockref_from- [kernel.kallsyms] [k] system_call_after_swapgs
+ 0,93% t_lockref_from- [kernel.kallsyms] [k] account_user_time
+ 0,89% t_lockref_from- [kernel.kallsyms] [k] strncpy_from_user
+ 0,86% t_lockref_from- [kernel.kallsyms] [k] complete_walk
+ 0,80% t_lockref_from- [kernel.kallsyms] [k] filename_lookup
+ 0,80% t_lockref_from- [kernel.kallsyms] [k] vfs_fstatat
+ 0,78% t_lockref_from- [kernel.kallsyms] [k] generic_permission
+ 0,77% t_lockref_from- [kernel.kallsyms] [k] __ticket_spin_unlock
+ 0,73% t_lockref_from- [kernel.kallsyms] [k] __inode_permission
+ 0,69% t_lockref_from- [kernel.kallsyms] [k] vtime_account_user
+ 0,66% t_lockref_from- [kernel.kallsyms] [k] d_rcu_to_refcount
+ 0,61% t_lockref_from- [kernel.kallsyms] [k] common_perm
+ 0,60% t_lockref_from- [kernel.kallsyms] [k] rcu_eqs_enter
+ 0,59% t_lockref_from- [kernel.kallsyms] [k] dput
+ 0,54% t_lockref_from- [kernel.kallsyms] [k] vtime_user_enter
+ 0,51% t_lockref_from- [kernel.kallsyms] [k] cpuacct_account_field
+ 0,50% t_lockref_from- [kernel.kallsyms] [k] mntput
+ 0,48% t_lockref_from- [kernel.kallsyms] [k] lg_local_lock
+ 0,48% t_lockref_from- [kernel.kallsyms] [k] apparmor_inode_getattr
+ 0,45% t_lockref_from- t_lockref_from-linus [.] start_routine
+ 0,45% t_lockref_from- [kernel.kallsyms] [k] __vtime_account_system
Press '?' for help on key bindings

- Sedat -

>> 4,36% t_lockref_from- [kernel.kallsyms] [k] __acct_update_integrals
>> 4,07% t_lockref_from- [kernel.kallsyms] [k] user_exit
>> 3,12% t_lockref_from- [kernel.kallsyms] [k] local_clock
>> 2,83% t_lockref_from- [kernel.kallsyms] [k] lockref_get_or_lock
>> 2,73% t_lockref_from- [kernel.kallsyms] [k] kmem_cache_alloc
>> 2,62% t_lockref_from- [kernel.kallsyms] [k] __d_lookup_rcu
>
> You're spending more time on the task stats than on the actual lookup.
> Maybe you should turn off CONFIG_TASKSTATS..But why that whole
> irq_return thing? Odd.
>

Yes, I have CONFIG_TASKSTATS=y.
I can try a -4 build w/o it.

- Sedat -
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/