Re: [PATCH 0/3 v3] dcache: make it more scalable on large system

From: Waiman Long
Date: Wed May 29 2013 - 16:23:18 EST


On 05/29/2013 12:13 PM, Andi Kleen wrote:
>> d_path() is called by perf_event_mmap_event(), which translates a
>> VMA to its file path for memory segments backed by files. As perf is
>> not just for sampling data within the kernel, it can also be used
>> for checking access patterns in user space. As a result, it needs
>> to map VMAs back to the backing files to access their symbol
>> information. If d_path() is not the right function to call for this
>> purpose, what other alternatives do we have?
> In principle it should be only called for new file mappings
> getting mapped. Do you really have that many new file mappings all
> the time? Or is this related to program startup?

The AIM7 benchmark that I used runs a large number of relatively short jobs. I think each time a new job is spawned, the file mappings have to be redone. It is probably not a big problem for long-running processes.
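
As a rough illustration of why short jobs stress this path, the sketch below runs a program repeatedly as a short-lived child process. Each exec sets up fresh file-backed mappings for the binary and any shared libraries it uses, so a profiler that resolves a path per mapping has to redo that work for every job. The helper name `spawn_short_jobs` is hypothetical and is not taken from AIM7 or perf.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from AIM7 or perf): run `prog` n times as
 * short-lived children, one after another, and return how many of
 * them exited with status 0. Returns -1 if fork() fails.
 */
int spawn_short_jobs(const char *prog, int n)
{
    int ok = 0;

    for (int i = 0; i < n; i++) {
        pid_t pid = fork();

        if (pid < 0)
            return -1;

        if (pid == 0) {
            /* Child: exec replaces the address space, so the binary
             * and its libraries are mapped again from scratch. */
            execlp(prog, prog, (char *)NULL);
            _exit(127);            /* reached only if exec failed */
        }

        int status;
        if (waitpid(pid, &status, 0) == pid &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0)
            ok++;
    }
    return ok;
}
```

Calling `spawn_short_jobs("true", 1000)` while a system-wide profiler is recording would be one way to approximate the job-spawning pattern described above; a single long-running process sets up its mappings once and then mostly stops generating new ones.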

>> My patch set consists of 2 different changes. The first one is to
>> avoid taking the d_lock lock when updating the reference count in
>> the dentries. This particular change also benefits some other
>> workloads that are filesystem intensive. One particular example is
>> the short workload in the AIM7 benchmark. One of the job types in the
>> short workload is "misc_rtns_1", which calls security functions like
>> getpwnam(), getpwuid(), getgrgid() a couple of times. These
>> functions open the /etc/passwd or /etc/group files, read their
>> content and close the files. It is the intensive open/read/close
>> sequence from multiple threads that is causing 80%+ contention in
>> the d_lock on a system with a large number of cores. MIT's
>> MOSBench paper also outlined dentry reference counting as a
> The paper was before Nick Piggin's RCU (and our) work on this.
> Modern kernels do not have dcache problems with mosbench, unless
> you run weird security modules like SMACK that effectively
> disable dcache RCU.

I have tried, but have not yet been able to run MOSBench myself. Thanks for letting me know that the dcache problem wrt MOSBench was fixed.
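
For reference, the open/read/close pattern attributed to misc_rtns_1 can be approximated in user space with something like the sketch below. It is not the AIM7 source; `open_read_close` and `run_job` are hypothetical names, and the file is read directly rather than through getpwnam() and friends.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: open `path`, read it to the end, close it.
 * Returns the number of bytes read, or -1 on error. Every call does
 * a fresh path lookup, which is where the dentry reference count
 * gets taken and dropped.
 */
long open_read_close(const char *path)
{
    char buf[4096];
    long total = 0;
    ssize_t n;

    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    while ((n = read(fd, buf, sizeof buf)) > 0)
        total += n;

    close(fd);
    return n < 0 ? -1 : total;
}

/* Repeat the sequence and count how many iterations succeeded. */
int run_job(const char *path, int iterations)
{
    int ok = 0;

    for (int i = 0; i < iterations; i++)
        if (open_read_close(path) >= 0)
            ok++;
    return ok;
}
```

Running `run_job("/etc/passwd", N)` concurrently from many threads or processes on a many-core machine would be the closest analogue of the contended case, since all of them look up the same path.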

> BTW lock elision may fix these problems anyways, in a much
> simpler way.

I certainly hope so. However, there will still be a lot of computers out there running pre-Haswell Intel chips. For them, locking is still a problem that needs to be solved.
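
To make the software-only approach concrete for hardware without lock elision, here is a minimal user-space sketch of updating a reference count without taking the lock that normally guards it. It is not the code from the patch set: the layout (a lock bit and a count packed into one 64-bit word) and the names `counted_ref`, `ref_get_lockless` and `ref_count` are assumptions made for illustration, using C11 atomics.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: top bit means "lock held", the rest is the count. */
#define LOCKED_BIT (1ULL << 63)

struct counted_ref {
    _Atomic uint64_t word;
};

/*
 * Try to take a reference without acquiring the lock. The increment
 * is done with compare-and-swap, and only while the lock bit is
 * clear. Returns false if the lock is held, in which case the caller
 * has to fall back to the ordinary locked path.
 */
bool ref_get_lockless(struct counted_ref *r)
{
    uint64_t old = atomic_load_explicit(&r->word, memory_order_relaxed);

    while (!(old & LOCKED_BIT)) {
        if (atomic_compare_exchange_weak_explicit(&r->word, &old, old + 1,
                                                  memory_order_acquire,
                                                  memory_order_relaxed))
            return true;
        /* A failed CAS reloaded `old`; re-check the lock bit and retry. */
    }
    return false;
}

/* Current count with the lock bit masked off. */
uint64_t ref_count(struct counted_ref *r)
{
    return atomic_load(&r->word) & ~LOCKED_BIT;
}
```

The point of the fallback is that the common case (nobody holds the lock) costs one atomic operation on the word instead of a lock acquire, an update and a release.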

Regards,
Longman