Re: [linux-audio-dev] Re: [announce] [patch] Voluntary Kernel Preemption Patch

From: Benno Senoner
Date: Wed Jul 14 2004 - 09:37:10 EST


Takashi Iwai wrote:

>> Is it possible that I am simply pushing my hardware past its limits? Keep in mind this is a 600MHz C3 processor.
>
> I think yes. 32 frames / 44.1kHz = 0.725 ms.


I don't think so. I think it's because the Linux scheduler (and the kernel in general), not being an RTOS,
is being pushed to its limits (but as we see, it can still be optimized).

For example, I used the same VIA box with RTLinux to remote-control servo motors, which need a PWM signal
with a duration of 2msec; the location of the negative flank (the transition from high to low) determines
the position the servo moves to.
For example, if the duration of the pulse is 2msec, then setting the flank at 0msec (the beginning of the cycle)
moves the servo to -45 degrees, 1msec to 0 degrees, and 2msec to +45 degrees.
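To make the mapping concrete, here is a small sketch in Python. It assumes the servo responds linearly between the three points given above, which I have not verified; the names flank_to_angle, PULSE_MS, MIN_DEG and MAX_DEG are made up for illustration.

```python
# Sketch only: linear interpolation between the three points quoted above
# (0 msec -> -45 deg, 1 msec -> 0 deg, 2 msec -> +45 deg).
# Linearity between those points is an assumption.

PULSE_MS = 2.0                   # length of the pulse window in msec
MIN_DEG, MAX_DEG = -45.0, 45.0   # servo positions at both ends of the window


def flank_to_angle(flank_ms: float) -> float:
    """Map the position of the falling flank (msec) to a servo angle (degrees)."""
    if not 0.0 <= flank_ms <= PULSE_MS:
        raise ValueError("flank must lie inside the 2 msec pulse window")
    return MIN_DEG + (MAX_DEG - MIN_DEG) * flank_ms / PULSE_MS


if __name__ == "__main__":
    for t in (0.0, 1.0, 2.0):
        print(f"flank at {t:.1f} msec -> {flank_to_angle(t):+.1f} degrees")
```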

Jitter in the pulse shows up as the servo vibrating a bit around its nominal position.
Of course a very short-lived spike cannot be detected by the servo because of the motor's inertia,
but I put the box under very high load, especially video playback (the VIA box uses a shared bus architecture
that holds the video data in the PC's RAM), HD load etc., and the jitter was minimal, probably 30-40usec, because the
servos vibrate only about 1 degree or so (and only when the box is under very high load).
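As a rough sanity check of that estimate, the sketch below converts timing jitter into angular error. It assumes the 90 degree swing is spread linearly over the 2msec window (my assumption); jitter_to_degrees is a hypothetical helper name.

```python
# Sketch only: convert timing jitter of the flank into servo angle error,
# assuming 90 degrees of travel map linearly onto the 2 msec window.

FULL_SWING_DEG = 90.0      # -45 .. +45 degrees
PULSE_WINDOW_US = 2000.0   # 2 msec expressed in usec


def jitter_to_degrees(jitter_us: float) -> float:
    """Angular error caused by a flank that arrives jitter_us too early or late."""
    return FULL_SWING_DEG * jitter_us / PULSE_WINDOW_US


if __name__ == "__main__":
    for jitter_us in (30.0, 40.0):
        print(f"{jitter_us:.0f} usec jitter -> about {jitter_to_degrees(jitter_us):.2f} degrees")
```

Under that assumption 30-40usec works out to between 1 and 2 degrees, which is in the same ballpark as the vibration I observed.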

This is just to say that the VIA box should easily be able to cope with 32 audio frames (0.6msec buffers) from a hardware
point of view.
Anyway, "worst case" latencies (or rather, latencies under very high load) of <0.5-1msec are completely adequate for
real-time multimedia, because if you shorten your audio processing cycles too much (e.g. 32 frames), the setup
overhead of the DSP routines and the scheduling overhead become large and the efficiency of
the algorithms decreases quite a bit.
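For reference, the period length for a given buffer size is just frames divided by sample rate. A minimal sketch (period_ms is a name I made up; the 48kHz column is added only for comparison):

```python
# Sketch only: duration of one audio period for a given buffer size.

def period_ms(frames: int, sample_rate_hz: int) -> float:
    """Duration of one period of `frames` frames, in milliseconds."""
    return 1000.0 * frames / sample_rate_hz


if __name__ == "__main__":
    for rate in (44100, 48000):
        for frames in (32, 64, 128):
            print(f"{frames:4d} frames @ {rate} Hz = {period_ms(frames, rate):.3f} msec")
```

At 44.1kHz, 32 frames come out at roughly 0.73msec and at 48kHz roughly 0.67msec, so the 0.6msec I use here is a rounded figure.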
Imagine running jack with 10 client applications and 32 frames (0.6msec periods): this means that within 0.6msec you need
to reschedule 10 times, which leaves 60 usec per client. I don't know how long the actual rescheduling of a process takes (AFAIK around 1usec?),
but I guess the main problem is the constant invalidation of the cache, because those audio clients run for a really short time.
Of course, if you have only 1-2 clients running, the efficiency at 32 frames should still be good (but it will certainly provide 10-20% less
performance than using 64 or 128 frames).
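The per-client arithmetic can be sketched as follows. It assumes the clients run one after another and share the period equally, and it takes my unverified guess of about 1usec per reschedule as an input; per_client_budget_us and switch_overhead_fraction are hypothetical names.

```python
# Sketch only: time budget per client when N clients run one after another
# within a single audio period. The 1 usec switch cost is a guess, not a measurement.

def per_client_budget_us(period_us: float, clients: int) -> float:
    """Equal share of the period available to each client, in usec."""
    if clients < 1:
        raise ValueError("need at least one client")
    return period_us / clients


def switch_overhead_fraction(period_us: float, clients: int, switch_us: float = 1.0) -> float:
    """Fraction of the period spent purely on rescheduling (cache effects ignored)."""
    return clients * switch_us / period_us


if __name__ == "__main__":
    period_us = 600.0  # the rounded 0.6 msec period used above
    for n in (1, 2, 10):
        print(f"{n:2d} clients: {per_client_budget_us(period_us, n):6.1f} usec each, "
              f"{100 * switch_overhead_fraction(period_us, n):.2f}% spent on switching")
```

Note that this only counts the raw switch cost; the cache invalidation, which I suspect dominates, is not modelled.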
Our goal should be rock solid operation at 64 frames (around 1msec-1.5msec processing periods).
If you use 2-3 buffers, the kernel still has another 1.5-3msec of headroom before an actual (audible) xrun occurs.
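A quick way to see where the headroom figure comes from, as a sketch: it assumes that one buffer is being played while the remaining ones are queued, so each extra buffer adds one period of slack. That is a simplification of how the sound card ring buffer behaves, and headroom_ms is a made-up name.

```python
# Sketch only: slack before an xrun, assuming one period is being consumed by
# the hardware and every additional queued buffer buys one more period.

def period_ms(frames: int, sample_rate_hz: int) -> float:
    """Duration of one period in milliseconds."""
    return 1000.0 * frames / sample_rate_hz


def headroom_ms(frames: int, sample_rate_hz: int, buffers: int) -> float:
    """Extra time the kernel may be late before the queued audio runs out."""
    if buffers < 1:
        raise ValueError("need at least one buffer")
    return (buffers - 1) * period_ms(frames, sample_rate_hz)


if __name__ == "__main__":
    for buffers in (2, 3):
        print(f"64 frames @ 44100 Hz, {buffers} buffers: "
              f"{headroom_ms(64, 44100, buffers):.2f} msec of headroom")
```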

cheers,
Benno
http://www.linuxsampler.org


