Sixty seconds of poking around answers the problem, I think:
We have one set of calls (the fast inline ones) and two sets of uses of
them - the slow ioctl paths and the fast I/O paths. I suspect a lot of the
problem will take a walk if all the little-used ioctl() paths used a
slow_copy_from_user() etc.
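
As a rough sketch of the split (userspace C so it compiles standalone, not
code from any tree): demo_copy_from_user() stands in for the existing inline
copy and is just a memcpy() here, slow_copy_from_user() is the hypothetical
out-of-line wrapper, and the demo_* ioctl path and struct are invented for
illustration. The noinline attribute assumes GCC or Clang.

```c
#include <errno.h>
#include <string.h>

/*
 * Stand-in for the fast inline copy. Like the kernel call it mimics,
 * it returns the number of bytes NOT copied (always 0 in this demo,
 * because a plain memcpy() cannot fault).
 */
static inline unsigned long demo_copy_from_user(void *to, const void *from,
                                                unsigned long n)
{
    memcpy(to, from, n);
    return 0;
}

/*
 * Hypothetical out-of-line wrapper: the copy code is emitted once here,
 * and every rarely used caller pays only for a function call.
 */
__attribute__((noinline))
unsigned long slow_copy_from_user(void *to, const void *from, unsigned long n)
{
    return demo_copy_from_user(to, from, n);
}

/* Invented parameter block for an imaginary driver. */
struct demo_params {
    int rate;
    int channels;
};

static struct demo_params demo_state;

/* A little-used ioctl path: speed does not matter, so use the slow copy. */
int demo_set_params_ioctl(const void *uarg)
{
    struct demo_params p;

    if (slow_copy_from_user(&p, uarg, sizeof(p)))
        return -EFAULT;
    demo_state = p;
    return 0;
}

int demo_get_rate(void)
{
    return demo_state.rate;
}
```

The fast I/O paths would keep calling the inline version directly; only the
ioctl-style callers move to the wrapper.
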
A lot of the ioctl calls could be tidied up to use some form of generic
ioctl handler (look at BSD and learn for once ;)). Even though we often don't
do size encoding, a short routine to handle the parameter stuff would be
much cleaner for most ioctl calls.
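
One possible shape for such a routine, again as a standalone userspace sketch
under stated assumptions: since the command numbers may not encode a size, a
per-driver table supplies the direction and size instead. Every demo_* name,
the table layout and the 128-byte argument limit are made up for the example,
and demo_copy() is a memcpy() stand-in for the real user copies.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define DEMO_DIR_IN   1u   /* argument is copied in before the handler runs */
#define DEMO_DIR_OUT  2u   /* argument is copied back out on success */
#define DEMO_MAX_ARG  128  /* arbitrary limit chosen for this sketch */

/* One table row per command: the size lives here, not in the number. */
struct demo_ioctl_entry {
    unsigned int cmd;
    unsigned int dir;
    size_t size;
    int (*fn)(void *karg);
};

/* Stand-in for the user copy; returns the number of bytes not copied. */
static unsigned long demo_copy(void *to, const void *from, unsigned long n)
{
    memcpy(to, from, n);
    return 0;
}

/* Generic parameter handling: look up, copy in, dispatch, copy out. */
int demo_generic_ioctl(const struct demo_ioctl_entry *tbl, size_t n,
                       unsigned int cmd, void *uarg)
{
    /* The union keeps the buffer aligned for ordinary structs. */
    union { unsigned char bytes[DEMO_MAX_ARG]; long l; double d; } karg;
    size_t i;
    int ret;

    for (i = 0; i < n; i++) {
        if (tbl[i].cmd != cmd)
            continue;
        if (tbl[i].size > sizeof(karg))
            return -EINVAL;
        memset(&karg, 0, sizeof(karg));
        if ((tbl[i].dir & DEMO_DIR_IN) &&
            demo_copy(&karg, uarg, tbl[i].size))
            return -EFAULT;
        ret = tbl[i].fn(&karg);
        if (ret == 0 && (tbl[i].dir & DEMO_DIR_OUT) &&
            demo_copy(uarg, &karg, tbl[i].size))
            return -EFAULT;
        return ret;
    }
    return -ENOTTY; /* no such command in this table */
}

/* Example driver: the handlers only ever see a kernel-side copy. */
struct demo_speed { int baud; };

static int demo_baud = 9600;

static int demo_set_speed(void *karg)
{
    demo_baud = ((struct demo_speed *)karg)->baud;
    return 0;
}

static int demo_get_speed(void *karg)
{
    ((struct demo_speed *)karg)->baud = demo_baud;
    return 0;
}

const struct demo_ioctl_entry demo_table[] = {
    { 0x5401, DEMO_DIR_IN,  sizeof(struct demo_speed), demo_set_speed },
    { 0x5402, DEMO_DIR_OUT, sizeof(struct demo_speed), demo_get_speed },
};
```

With something like this the individual handlers contain no copy calls at
all, so the copy code exists once in the generic routine rather than inline in
every ioctl path.
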
Alan