Re: matrox+history+xoff=fbcon/linux crash?

From: Petr Vandrovec (VANDROVE@vc.cvut.cz)
Date: Mon May 29 2000 - 04:46:54 EST


Alan Cox wrote:
> Benson Chow wrote:
> > If I cause a lot of scrolling, like ls -alR / and then shift-pageup for
> > virtual console scrolling, and combine it with a few control-S's, I can
> > reliably get my computer's console to completely crash. No more keyboard
> > responses are accepted. Any ideas what's going on, or is it bad hardware?
>
> That sounds like a race in the frame buffer code.

Hi Alan, Hi Benson,
   sorry for the late reply, but I had no Internet access during the weekend...

   I was under the impression that all code paths into the console now take
spin_lock_irq(&console_lock), so that nobody can re-enter fbdev...
So there are three possible sources of the problem:

  1) some code path misses this locking, or
  2) someone holding console_lock tried to do printk(), or
  3) there is a problem with the softback code
  
   If you boot with
   
video=matrox:fastfont:40000

   then the system will be less vulnerable to problem 1: this switches
matroxfb from doing ILOAD (which cannot be interrupted by another
accelerated operation) to doing BITBLT (which is almost atomic), so even
on re-entry you'll get only one garbled character on screen instead of
a total lockup. (With ILOAD you'll get one garbled character AND a total
lockup, because commands are interpreted as character body and the
character body is interpreted as commands...)
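
   The difference between the two failure modes described above can be
sketched with a toy userspace model. Everything in it is an assumption made
for illustration: the word kinds, the "source register" and the
toy_* names are invented, and the reason a re-entered BITBLT garbles a
character (a register overwritten between setup and start) is only one
plausible reading of "almost atomic". It is not the real Matrox interface.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only. Word kinds and semantics are invented for this sketch
 * and do not describe the real Matrox command interface. */
enum toy_kind {
    TOY_SET_SRC,      /* write the blit source register */
    TOY_GO_BLT,       /* start a blit; value = source the writer intended */
    TOY_START_ILOAD,  /* value = number of glyph words that must follow */
    TOY_PIXELS        /* one word of glyph bitmap data */
};

struct toy_word { enum toy_kind kind; unsigned value; };

struct toy_engine {
    unsigned src;      /* current blit source register */
    unsigned pending;  /* glyph words still owed to an open ILOAD */
    unsigned garbled;  /* characters drawn from the wrong data */
    int locked_up;     /* engine decoded glyph bits as a command */
};

/* Feed the words in bus order and report what the engine ends up doing. */
static struct toy_engine toy_run(const struct toy_word *w, size_t n)
{
    struct toy_engine e = { 0, 0, 0, 0 };
    assert(w != NULL);
    for (size_t i = 0; i < n && !e.locked_up; i++) {
        if (e.pending > 0) {          /* inside ILOAD: every word is pixels */
            e.pending--;
            if (w[i].kind != TOY_PIXELS)
                e.garbled++;          /* a command was eaten as glyph bits */
            continue;
        }
        switch (w[i].kind) {
        case TOY_SET_SRC:     e.src = w[i].value; break;
        case TOY_GO_BLT:      if (e.src != w[i].value) e.garbled++; break;
        case TOY_START_ILOAD: e.pending = w[i].value; break;
        case TOY_PIXELS:      e.locked_up = 1; break; /* pixels run as command */
        }
    }
    return e;
}

/* Writer A is interrupted by writer B halfway through an ILOAD. */
static struct toy_engine toy_iload_reentered(void)
{
    static const struct toy_word seq[] = {
        { TOY_START_ILOAD, 2 }, { TOY_PIXELS, 0xA1 },                       /* A starts */
        { TOY_START_ILOAD, 2 }, { TOY_PIXELS, 0xB1 }, { TOY_PIXELS, 0xB2 }, /* B re-enters */
        { TOY_PIXELS, 0xA2 }                                                /* A resumes */
    };
    return toy_run(seq, sizeof seq / sizeof seq[0]);
}

/* Same interleaving, but both writers use a register write plus a blit. */
static struct toy_engine toy_blt_reentered(void)
{
    static const struct toy_word seq[] = {
        { TOY_SET_SRC, 1 },                     /* A sets up */
        { TOY_SET_SRC, 2 }, { TOY_GO_BLT, 2 },  /* B re-enters */
        { TOY_GO_BLT, 1 }                       /* A resumes with B's source */
    };
    return toy_run(seq, sizeof seq / sizeof seq[0]);
}
```

   In this model the re-entered ILOAD ends with one garbled character and
a locked-up engine, while the re-entered blit ends with one garbled
character only.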

   If it is the second problem, then you should check whether your kernel
produces tons of messages under normal load... But as console_lock
is taken with spin_lock_irq() (interrupts are off while it is held), only
printk()s inside console/fbcon/fbdev could cause this problem. You can
verify it by removing all
spin_lock_irqsave(&console_lock, ...); and
spin_unlock_irqrestore(&console_lock, ...); calls from linux/kernel/printk.c...
If it then prints any message or oops instead of deadlocking, you are
on the right track...
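
   The printk() experiment above can be modelled in userspace. This is
only a sketch under stated assumptions: toy_lock stands in for a
non-recursive spinlock, it counts a would-be deadlock instead of spinning
forever, and every toy_* name is hypothetical. It is not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for a non-recursive spinlock (hypothetical). */
struct toy_lock { bool held; };

static struct toy_lock toy_console_lock;
static bool toy_printk_takes_lock = true; /* false models the edited printk.c */
static int toy_deadlocks;  /* nested acquisitions that would spin forever */
static int toy_messages;   /* messages that actually got out */

/* Take the lock, or record that a real spinlock would hang here. */
static bool toy_lock_acquire(struct toy_lock *l)
{
    assert(l != NULL);
    if (l->held) {
        toy_deadlocks++;
        return false;
    }
    l->held = true;
    return true;
}

static void toy_lock_release(struct toy_lock *l)
{
    assert(l != NULL && l->held);
    l->held = false;
}

/* Model of printk(): it normally takes the console lock to emit output. */
static void toy_printk(const char *msg)
{
    (void)msg;
    if (toy_printk_takes_lock) {
        if (!toy_lock_acquire(&toy_console_lock))
            return;                /* a real spinlock would hang here */
        toy_messages++;
        toy_lock_release(&toy_console_lock);
    } else {
        toy_messages++;            /* unlocked: the message gets out */
    }
}

/* Model of a console driver path that may call printk() while locked. */
static void toy_console_write(bool driver_calls_printk)
{
    if (!toy_lock_acquire(&toy_console_lock))
        return;
    if (driver_calls_printk)
        toy_printk("message from inside the driver");
    toy_lock_release(&toy_console_lock);
}
```

   With toy_printk_takes_lock set, a driver path that calls toy_printk()
is counted as a deadlock and no message appears; with it cleared, the
same path produces the message, which is the hint the experiment is
looking for.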

   If it is a problem with soft-scrollback, you can disable it with the
kernel parameter 'video=scrollback:0'. I think that all bugs were
squashed out of the scrollback code during 2.3.4x, but ... Also, if it
is a re-entrancy problem, using 'video=scrollback:0' can make the
problem less frequent, as the scrollback code is not the simplest one and
for sure is not reentrant (but people who hit re-entered scrollback
code report strange character/attribute pairs on screen, not
complete lockups).

   And the last possibility: edit linux/drivers/video/matrox/matroxfb_base.h
and replace
#undef MATROXFB_USE_SPINLOCKS
with
#define MATROXFB_USE_SPINLOCKS (1)
   It should also fix problem 1, but lock.accel should be
taken only when console_lock is already held... Maybe if you define
CRITBEGIN (in the file above) as
   if (spin_trylock(&console_lock)) {
       spin_unlock(&console_lock);
       BUG();
   }
you'll get a nice stack trace... (if my idea about the meaning of
console_lock is correct)
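
   The trylock check above relies on one idea: if the lock can be taken,
nobody was holding it, so the caller broke the locking rule. A minimal
userspace sketch of that idea follows, assuming a C11 atomic_flag as a
stand-in for the spinlock; the toy_* names are hypothetical and this is
not the kernel's spinlock implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for console_lock (hypothetical userspace analogue). */
static atomic_flag toy_console_lock = ATOMIC_FLAG_INIT;

/* Returns true when the lock was free and is now ours. */
static bool toy_spin_trylock(atomic_flag *l)
{
    return !atomic_flag_test_and_set_explicit(l, memory_order_acquire);
}

static void toy_spin_unlock(atomic_flag *l)
{
    atomic_flag_clear_explicit(l, memory_order_release);
}

/* Returns true if somebody currently holds the lock. A false result is
 * the case where the check in the mail would call BUG(). */
static bool toy_console_lock_is_held(void)
{
    if (toy_spin_trylock(&toy_console_lock)) {
        /* We got it, so it was free: give it back and report that. */
        toy_spin_unlock(&toy_console_lock);
        return false;
    }
    return true;
}
```

   Note that a failed trylock only shows that somebody holds the lock,
not that the current caller is the holder, so the check can miss a
violation while another CPU happens to hold console_lock.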
                                Best regards,
                                        Petr Vandrovec
                                        vandrove@vc.cvut.cz
                                        



