Re: real-time threaded IO with low latency (audio)

David Olofson (audiality@swipnet.se)
Mon, 26 Jul 1999 04:31:43 +0200


Ove Ewerlid wrote:
>
> David Olofson wrote:
> > Not being too serious there, I didn't mention that I had only single CPU
> > systems in mind. I wouldn't call a single CPU locked to a single process
> > one running Linux...
>
> The locked process still takes signals and faitfully exits when you kill
> it!
> Locked yes, but certainly killable and you do get that segfault instead
> of system lookup ...

Yeah, I know, but as long as it's not allowing other tasks to use any
remaining CPU time, it's not the kind of nice, balanced and general
solution I like. However, as it first of all works, and second, is the
only way to do it for very low latencies because of cache problems, it's
not all that bad after all. And certainly a lot nicer than a DSP on a
PCI board! :-)

> > > The worst case _conditionals_ are usually not a problem.
> >
> > No, not on a system that's _synchronized_ as opposed to a DSP loop
> > padded with NOPs...
>
> A DSP loop on a PII never need to be padded with NOP's due to the rdtsc!

No, but it won't be too much out-of-phase if it should spend the "wrong"
number of turns in the sync loop either. The original discussion was
whether it's possible to do signal processing using exactly 100% of the
available CPU time, which is not what you're doing of you have to insert
some NOPs or busywait on a hardware counter to stay in sync. ;-)

Not too relevant; It all started with an answer to a comment on my all
too theoretical explanation of why interrupt/scheduling jitter limits
the ammount of CPU time you can use for processing in a very low latency
system.

If anyone cares, here's another version;

1. You get an interrupt, telling you the ADC
has a new buffer ready for you.
2. You start processing...
3. ...and since you're loading the CPU to 99.99%,
you JUST manage to finish your processing before...
4. The DAC starts reading your output data
5. Goto 1

(Note: What happens if you have to call 20 plug-in callbacks before the
output buffer is ready for playback? THAT's the real problem here, as
the last change of the first sample must be done before the DAC snatches
it.)

Is it obvious enough what happens if the interrupt should be a little
late...?

> > > Linux can deal with this scenario just fine as long as it has CPU
> > > locking.
> >
> > I'd say that's more "working around Linux" than "Linux dealing with it",
> > but so what; it works! :-)
>
> Linux has full control over the locked process and it is completly
> sandboxed!

Yes, and that's very nice, compared to RTL's lack of protected memory.
(Which I or someone else will have to fix ASAP; or at least before
anthing with more complex RTL tasks goes to end users.)

("Working around" is still regarding this "not allowing remaining cycles
to be used" thing.)

> > Much like a DSP card, but with a lot faster "card<->host" communication,
> > a niced development environment, lower price, more power and no weird
> > hardware that's hard to get at. :-)
>
> Jupp!
>
> > Yep, but if I can do with the jitter caused by cache misses and the
> > latency lower limit implied by context switching, I think using, say 80%
> > of BOTH CPUs is pretty interesting too. And that's good enough for most
>
> This is what I'm saying!
> (In my case the non locked CPU executes Matlab code in your case it
> executes audiality ...)

Yeah, but maybe I should at that I want around 10 ms latency on that
extra 80%. And it's still, by no means, soft real time. But this may be
possible to do in user space sooner than we expected! :-)

> > kinds of music/audio processing, which is what I'm most interested in.
> > However, locking one CPU for ultra low latency processing, and using 50%
> > of the other one for RTL at higher latency could be pretty
> > interesting...
>
> It is!

Well, I have a dual P-II 233 lying around here... :-) *getting more and
more curious about what it could do if it got processor locking in
addition to RTL*

Besides, it could take two hacked Celeron 466'es, which would be very
cool, as the FULL CPU CLOCK (as opposed to the P-II/III L2 cache) 128 kB
cache fits LOTS of intermediate buffers for plug-ins with these kind of
latencies. :-D

> > Agree, and I have to admit that this discussion made me realize that my
> > old ideas of single sample input->output latency may not be completely
> > impossible, at least not on SMP systems. By compiling plug-ins
> > "statements" into a tight loop executed on the locked CPU when the user
> > changes the signal routing, this kind of latency may even be possible to
> > use in a quite transparent way.
>
> Nice idea!
> It is a pitty that modern sound cards and modern busses are optimized
> for
> bandwith rather then latency. Even though the CPU can operate on a
> per sample basis the sound card may deliver data in bursts of 64 samples
> ...

Yep. *es1370..* What about external S/PDIF ADACs? How many samples of
buffering? IIRC, S/PDIF is a pretty low level biphase modulated version
of the serial streams used by most converters. Hacking or getting an
S/PDIF card these days shouldn't be all that hard, unless it requires
special solutions to get rid of any buffering beyond the shift
registers...

Or are there still (16 bit or better) converters with parallel
interfaces around? Haven't seen any with beter than 14 linearity and
about 90 dB SNR, and that's not really what I'd like to spend hours of
designing and soldering for. The latency would be interesting, though.

However I do have one (software) hack to try with the es1370! Not that
it's very useful for pro audio, but it's not that much work trying this.
Hint: Have you seen the programing specs for the es1370? There's a pause
mode that holds the last sample, while the in-chip memory is addressable
in the same way as the registers. (The registers are using a part of
that memory instead of the good ol' "register latches" solution.) Anyone
tried this? (Ok, I know it's pointless without some very hard RT
solution, but... That's what we have here!)

> > My hihest priority is end users being able to get their audio
> > recording/editing/processing work done as reliably on a PC as on
> > dedicated hardware, and less than 1 ms of latency isn't really
> > necessary. But why stop at that? What's pointless overkill today is an
> > expected feature tomorrow...
>
> Exactly :-)
>
> Ove
>
> PS; I'm talking about matlab above, it is a great tool but I do not
> like the monopoly like situation Mathworks have ... but they
> support Linux quite nicely!

Monopoly situations are always scary. :-( Not very good for anyone but
the monopoly holder, as most of us know all too well... But, if it runs
on Linux, at least it might be useful! ;-)

> --
> Ove Ewerlid
> Ove.Ewerlid@syscon.uu.se or Ove.Ewerlid@signal.uu.se
> Phone: +46 70 666 23 63, Fax: +46 18 503 611, +46 18 555 096

//David

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/