Re: Linux Graphics Architecture (format fixed)

Jim Gettys (jg@pa.dec.com)
Mon, 8 Feb 1999 10:17:33 -0800


> Sender: owner-linux-kernel@vger.rutgers.edu
> From: alan@lxorguk.ukuu.org.uk (Alan Cox)
> Date: Mon, 8 Feb 1999 15:55:48 +0000 (GMT)
> To: bennyb@ntplx.net (Ben Bridgwater)
> Cc: linux-kernel@vger.rutgers.edu
> Subject: Re: Linux Graphics Architecture (format fixed)
> -----
> > What about a simple example where you set write a bit plane mask into a
> graphics
> > register, then write some data to the framebuffer? You can't have two
> programs doing
> > that at the same time!
>
> That depends if the registers can be read back and saved.
>
> Alan
>

You are right on, Alan...

Turns out that there is a substantial amount of graphics hardware out
there that have write only registers (or at least this was true when I
was still in the X business). This makes coordination between multiple
writers a challenge, to say the least. The variation among graphics hardware
is tremendous; somehow, a system that doesn't span the hardware decently
isn't going to fly, IMHO.

In general, with some exceptions, people haven't beaten me up about X
performance, or at least not those WHO HAVE PROPERLY OPTIMIZED THEIR DDX
CODE, or naively think they should be able to write code exactly like
some other window system. Generally, for most applications, X has not
been the bottleneck on hardware, and people doing well optimized
implementations have been able to get the full hardware performance. I've
certainly been beaten up by people using badly optimized X servers
(or braindead hardware which can't be made to go fast), but I discount
those people.

The exceptions are when what they need to do isn't graceful given X; here,
my answer has always been "X is, and has always intended to be extended,
and many extensions have been done". For example, enlightenment currently
implements "translucent windows" while moving windows by doing
alpha-blending; to begin with, this boggles my mind, Rasterman is quite
a hacker. He says that Xie (the X imaging extension) doesn't do this (or
he couldn't figure out how to do this; I never tried to understand Xie
so I don't know if Rasterman is right). We also botched wide lines and
arcs pretty badly (I plead that the paper by Hobby hadn't been published
yet on that one), and should have included splines in the core X protocol
(we didn't have a spline expert available at the right moment; I didn't
know Lyle Ramshaw until 6 months too late).

By using shared memory pixmaps (part of just about every X server I see
these days), he's able to do this in client code in real time. It is taking
exactly one memory copy more than it could (for a software implementation,
anyway; 3d hardware often has alpha-blending in hardware, and OpenGL
does that very nicely, as far as I know), and in my mind, is a candidate
for an X extension.

UNFORTUNATELY, there is alot of XFree86 code that is far from optimal;
Xig graphics makes a living on this phenomena. If you want to compare
unoptimal XFree86 code with some optimized other graphics code for a
particular piece of hardware, you will get the expected result. I wish
people would help the XFree86 folks; they have just about a sisyphian
task supporting all the PC graphics hardware... It would not surprise
me if the total of X ddx (driver, for those not familiar with X nomenclature)
code surpassed the total of Linux device drivers.

I've also seen half baked other attempts at doing a replacement for X.
For example, those using CORBA for transport. A basic CORBA IIOP request
is approximately 200 bytes; the smallest X request is 4 bytes, with
reasonably realistic calls 20-40 bytes. I won't even go into the
relative performance in terms of operations/second between an X server
and a CORBA ORB.... Shall we just say the fastest ORB in the world is
orders of magnitude slower than the fastest X server in the world,
in terms of operations/second (this is mostly, but not entirely,
an implementation artifact; most CORBA systems have very slow marshalling,
and the size of the requests accounts for some of the difference, but
far from all). Understand that the FIRST performance enhancement done
that caused X to become X (we started with code called W that Paul Asente
had done), was to move to a streaming protocol, rather than an RPC
protocol. And my experience is that we didn't even go far enough in
X11 in this direction (InternAtom is synchronous, and has been a bottleneck).

History also showed that X's wire encoding wasn't compact enough for
some environments, and LBX had to be done. How well is a CORBA based
solution likely to work?

And yes, you can build a higher level of abstraction into a server,
so long as you buy into the implementer's religion; this mitigates
somewhat the previous paragraph's objections, if you buy into
the people's religion.

So far, I've not seen people with a realistic understanding of the problems
working on a realistic replacement for X, or understanding that even for
a single vendor, there may be MANY X drivers, much less the multiplication
across the number of vendors of graphics hardware. The amount of code
rapidly becomes mind-boggling.

So when I start hearing of people with realistic ideas, I'd be happy to
vent my spleen on what is really wrong with X, so they don't repeat our
mistakes. (To first order, they are in the "Why X is Not Our Favorite
Window System" article of some years back). So far, I haven't seen anyone
with realistic ideas. But then again, you are free to wave my opinion
away as "biased" if you like.

- Jim Gettys

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/