Re: FYI: bezerko mouse is back

Joe (josepha48@yahoo.com)
Tue, 3 Aug 1999 12:12:09 -0700 (PDT)


> > > I have too taken a look at the X sources. There is in fact
> a problem
> > > there. It depends a lot on the hardware, but many boards
> suffer from
> > > this problem. It is in fact a usable work around to switch
> to XT-PIC.
> > > It is possible to have a discussion about who is wrong,
> but it's
> > > certainly the combination that gives trouble.
> > I figured there was a problem there too.. so I sent email to
> X
> > but never heard a word from them...
> Hmm. Where did you send it? To XFree86 mailing lists?

yes to the mailing list and no reply, other than there auto
response..

> > > As far as I know, there's _no_ problem using just the
> console,
> > maybe with the timewarp, but not the mouse..
> Even the time warp does not appear to occur on my machine
> without X.
> Strange. Only in X I get time warps. But that's just _my_ 4
> machines
> probably (though 4 machines may be a lot).

I have only had problems in X, that is what I ment. I have not
had any console problems, but I do not use the console that much
(if ever lately) I use xterms && vi or write a simple gui hack
in gtk or tk

> > > As long as this is not solved, I would advice people with
> > > production
> > > machines to switch to XT-PIC for the mouse interrupt.
> > via the patch that you did? did you post this somewhere
> outside
> > of this list?
> It's an option to use my patch, and no, I haven't post this
> patch somewhere
> else. If people want, it may be posted everywhere. Other
> option is of course
> to use noapic.

since it is only one interrupt on my machine so far I see not
reason to punish them all by using noapic<g>

> The strangest thing I noticed (just checked with double
> trouble machine
> here), is that the problem seems to be triggered by X only.
> Hmm. Thinking:

isn't X the the only program that access hardware 'directly' as
opposed to being handled by the kernel? ie scsi cards, hd's, etc
are all handled by the kernel, but X access the video card
directly? I do know that it is handled differently than most
other hardware, but am not sure exactly how.

> X uses a device to listen to. That's quite normal. The device
> is used only
> by X. This device is opened with "open(mouse->mseDevice,
> O_RDWR | O_NDELAY)".
> It would be interesting what would happen if I open the mouse
> device this way,
> and if I will _force_ time warps. The program that reads this
> device should do
> simple parsing to show what's going on. As my keyboard goes
> crazy too normally,
> it should be an easy trigger. If I can trigger it outside X,
> then there's
> _much_ less overhead involved.

and that would rule out an X problem too.

> Report tomorrow (I hope).
> BTW: It does _not_ need to be the interrupt handling, it can
> be the device
> handling in general that cannot stand time warps.

I did look at some of the kernel code arch/i386/kernel/time.c

the code in there had comments that said that it should be
cleaned up so I am not sure if it ever was. It had mentioned how
the clock was getting the time and seemed to hint that it was a
hack and not a good way of doing things.

> > > My first thought when I heard this problem was a race
> > > condition somewhere.
> > I found one with X and the xfs..
> And that one is?

basically starting my X server I still get an _X11 Trans
Socket or something, I'd have to look at the xserver.log to get
the exact message but it had "Could not connect to x server"
then X starts. It seems that X 'hickups' then starts or actually
that when x tries the first time to start it cannot and then the
second try it does, although I am not sure why.
Actually this was a RH6.0 problem that they posted updated X
rpms for. It originally caused X to crash, but they fixed it. I
am not sure it was a problem with X and xfs, per say, just some
compile option or something that RH had done when they built X
for RH6.0. I cannot remeber all the details but I am sure it is
still there at their bugzilla site.

If you'd like to look at my X log I can send it to you so you
can see the exact message.

> > > This would explain behaviour. But how do you explain that
> my time may
> > > warp, but my mouse does not give trouble using XT-PIC?
> (This is an open
> > > question, not retorical)
> > my guess would be because timing is more important in an SMP
> > machine, especially where interrupts are distibuted and
> heavily
> > used.
> I would think that more in the direction that 2 processors are
> doing
> two related things. One changes for example time and confused
> the other
> running driver (mouse device)

again timing.. one cpu is out of sync with the other.
Maybe it has something to do with the fact that when you are
using XT-PIC the CPU is handling an interrupt like it was a UP
machine.
On an SMP machine when that warp occurs you are throwing off
your timing.
->Think of a motorcycle my harley. There is a timing system.
When that timing system is off the motorcycle runs funny and
sounds funny, but when the timing system is correct it runs
smooth. This is especially true in the case of a harley. In a
harley the pistons do not fire at 180 degrees of each other but
rather one right after each other boom boom. That is what gives
them there unique noise. If the timing is off just slightly then
the Harley does not run properly, and you can tell
->In a SMP system the interrupts are fielded one after another
boom boom. (Assuming that the apic handles 1 interrupt at a
time) If the timing is off you get boom..nothing..boom. misfire?

now in X the mouse is the active process doing a mouse move.
Yes/No?

However when one of these cpus is off the interrupt may get
'delayed' and the other cpus gets executed. misfire, if the
timing stays off you may have continuous misfireing or bezerko
mouse.

I this scenerio any delays in an split interrupt can put
movements of the mouse out of wack. If the timing stays off or
jumps around then interrupts can be out of the order. (yes no?)

I had originally heard some people say that they had there mouse
come back after 10 minutes or so, if they left there system
alone. Maybe the timing got out of wack and was able to correct
itself.

I have not had keyboard trouble like you have. Just the mouse.

Also last night I noticed some strange behavior while running
kde. upon logging out of kde (if you have not used kde then you
go to menu then logout) the mouse freezes on the left side of my
screen, and then as the window manager starts to shutdown it
brings the logout prompt window in the center of the screen, and
restores mouse control. I guess this would mean that X and the
wm control the mouse? No other wm I use does that thou.

> Think of it this way: The exact value of the timer is only
> known
> to the kernel and the OS. Not to the hardware. So the
> interrupt

but at some point the kernel passes this off to the hardware and
says do this. if there is a delay in doing this and the other
cpu gets to do that then?

> controller(s) don't know about time, only about sequence of
> commands
> and interrupts. Therefore the timing will only impact the
> delays done
> on base of the this.

hmm maybe if the timing problem can be corrected and done right

> So if time warps are the source of the trouble, there has to
> be a part
> of the kernel that can't handle it (or a strange hack in
> XFree86 that
> can't handle it).

Or both.
I thought that NFS mounts were having a similar problem which
were interrupt related?

> But first let's see if this is directly causing troubles in
> the first
> place.

I am going to throw some printk's out there and turn on debug,
and see what happens.

Joe
_____________________________________________________________
Do You Yahoo!?
Free instant messaging and more at http://messenger.yahoo.com

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/