Re: Timings for optimised poll(2)

Richard Gooch (rgooch@atnf.CSIRO.AU)
Wed, 27 Aug 1997 08:55:02 +1000


Rogier Wolff writes:
> Richard Gooch wrote:
> >
> > As Dean has already pointed out, if you want to check for activity on
> > a few fds which all have large numbers, then select(2) has to scan
> > through a large number of bits. So in this case poll(2) has less work
> > to do.
> >
> > Example timings (Pentium 100):
> > 1021 file descriptors (3-1023), check for activity on 1014-1023
> >
> > Using a select(2) loop like this:
> > memcpy (&i_fds, &input_fds, sizeof i_fds);
> > memcpy (&o_fds, &output_fds, sizeof o_fds);
> > memcpy (&e_fds, &exception_fds, sizeof e_fds);
> > nready = select (max_fd + 1, &i_fds, &o_fds, &e_fds, &tv);
> > takes 362 microseconds.
> >
> > Using a poll(2) loop like this:
> > nready = poll (pollfd_array + start_index, num_to_poll, 0);
> > takes 93 microseconds.
> >
> > Now, if we change the test to do checks on descriptors 924-1023:
> > select(2) takes 1274 microseconds
> > poll(2) takes 1023 microseconds.
> >
> > And now if we check descriptors 24-1023:
> > select(2) takes 3897 microseconds
> > poll(2) takes 4221 microseconds.
>
> Richard,
>
> It could very well be that poll has advantages over select. However
> if you want to introduce a non-standard API into the kernel, you
> should think about it very well before just jumping in and doing it.
> (Well, what you do privately on your computer doesn't really matter).

Hence my internal optimisations are my first priority. I have no
intention of submitting a patch for poll2(2) before the major
bottlenecks are removed. For now I'm just testing poll2(2) for my own
interest, to see how it improves as these other issues are fixed.

> In this case you claimed that "select" showed a bottleneck somewhere.
> You supported your claim by submitting an awfully inefficient piece of
> code.
>
> In general you've already made up your mind before you start
> benchmarking, and it is hard to make the enemy look good by writing
> good code for the "enemy".
>
> Now it seems that you are finally running the right benchmarks, and
> that is a good thing.

That sounds a tad defensive. Pardon me for evolving my benchmarks as I
investigate where all the different bottlenecks are (and remove them).

> Doing the right thing for the wrong reasons is "bad".

My reasons are simple: I want to speed up polling under Linux. My
benchmarks are needed so I can evaluate how my internal changes
improve things, as well as to compare with poll2(2), which I believe
has certain benefits (and if I'm wrong, I'll still post the
benchmarks once they're all done).

> If you really are doing the right benchmarks and those indicate that
> your new approach is promising, then I say you should go for it.

I sent my code: any suggestions for further optimisations? I'll roll
in application code which actually processes the results from each of
the three syscalls (to give a true picture of total load) once I've
finished the internal optimisations.

> P.S. For sparse bitvectors the select code should probably scan the
> in, out and ex set separately, and use "find_first_zero" like calls.

You mean the kernel code, right? The problem is: how do you know that
the bitvectors are sparse? Apply some kind of test (analyse the
fd_sets) to find out? You'd want to be careful that the test itself
doesn't take too much time.

I suppose you could just do "find_first_bit" to skip to the start of
the set bits, and from then on continue the scan as normal (rather
than using "find_next_bit" throughout). That would definitely help in
those cases where you are checking for activity on descriptors
900-1000.

Right now it would be murder to separate the in, out and ex scans, as
it would mean calling the indirect poll functions three times as
often as now. That would make select(2) roughly three times slower.
Once the "poll_events" field is available, this triple scanning won't
be so painful, although you would still end up skipping through
memory three times.

Hm. Perhaps a simple and cheap alternative is to use find_first_bit
on each fd_set, take the minimum index of the three, and start normal
scanning from there. Of course, it won't help in the case where you
are selecting on descriptors 100,200,300,400,500,600,700,800,900 or
some other such sparse set.
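In user-space terms the minimum-index trick looks like this; the
find_first_bit here is a naive stand-in for the kernel's optimised
bit-scanning primitive, and the set size is illustrative:

```c
#define NBITS     1024                      /* one bit per descriptor */
#define WORD_BITS (8 * sizeof(unsigned long))
#define NWORDS    (NBITS / WORD_BITS)

/* Return the index of the first set bit, or nbits if none is set.
 * Naive version: the kernel uses a hand-optimised find_first_bit.
 * Assumes nbits is a multiple of WORD_BITS. */
static unsigned long find_first_bit(const unsigned long *addr,
                                    unsigned long nbits)
{
    for (unsigned long w = 0; w < nbits / WORD_BITS; ++w) {
        if (addr[w]) {
            unsigned long word = addr[w];
            unsigned long bit = 0;
            while (!(word & 1)) {
                word >>= 1;
                ++bit;
            }
            return w * WORD_BITS + bit;
        }
    }
    return nbits;
}

/* The heuristic from the text: take the minimum first-set-bit index of
 * the in, out and ex sets, and start the normal linear scan there. */
static unsigned long scan_start(const unsigned long *in,
                                const unsigned long *out,
                                const unsigned long *ex)
{
    unsigned long s = find_first_bit(in, NBITS);
    unsigned long t = find_first_bit(out, NBITS);
    unsigned long u = find_first_bit(ex, NBITS);

    if (t < s)
        s = t;
    if (u < s)
        s = u;
    return s;  /* first descriptor worth examining */
}
```

For the descriptors-900-1000 case this skips the first ~900 bits of
all three sets at word granularity; for the 100,200,...,900 case it
only skips to bit 100 and the rest of the scan is unchanged.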

Regards,

Richard....