Re: [patch] lockless, scalable get_pid(), for_each_process() elimination, 2.5.35-BK

From: William Lee Irwin III (wli@holomorphy.com)
Date: Tue Sep 17 2002 - 19:22:40 EST


On Wed, Sep 18, 2002 at 01:06:11AM +0200, Ingo Molnar wrote:
> then i found one of wli's older patches for 2.5.23 [grr, it was not
> announced anywhere, i almost started coding the same problem], which
> provides the right (and much harder to implement) approach: it cleans up
> PID-space allocation to provide a generic hash for PIDs, session IDs,
> process group IDs and TGIDs, properly allocated and freed. This approach,
> besides paving the way for a scalable and time-bounded get_pid()
> implementation, also got rid of roughly half of for_each_process()
> (do_each_thread()) iterations done in the kernel, which alone is worth the
> effort. Now we can cleanly iterate through all processes in a session
> group or process group.
> i took the patch, adopted it to the recent ->ptrace_children and threading
> related changes, fixed a couple of bugs and made it work. It really worked
> well, nice work William!

Thank you for taking up the completion of development on and maintenance
of this patch. I have not had the time resources to advance it myself,
though now with your help I would be glad to contribute to the effort.
If you would like to assume ownership, I'd be glad to hand it over, and
send patches removing additional instances of for_each_process() to you
as I find the time.

On Wed, Sep 18, 2002 at 01:06:11AM +0200, Ingo Molnar wrote:
> I also wrote a new alloc_pid()/free_pid() implementation from scratch,
> which provides lockless, time-bounded PID allocation. This new PID
> allocator has a worst-case latency of 10 microseconds on a cache-cold P4,
> the cache-hot worst-case latency is 2 usecs, if pid_max is set to 1
> million.

This is necessary regardless in order to address the larger PID space.
In the brief review of it I've done thus far, the new algorithm appears
to be both sound and efficient.

The issues addressed here are extremely important for the workloads I
must support. The algorithmic breakdown of the prior algorithms with
larger task counts was severe enough to trip the NMI oopser and deceives
users into power cycling when they are not running the NMI oopser. This
issue in fact obstructs my testing and development on other subsystems.

Thanks,
Bill
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Mon Sep 23 2002 - 22:00:21 EST