Linux on E10K; UPDATE!!

From: Craig Armour (c.armour@uq.net.au)
Date: Sat May 20 2000 - 21:18:13 EST


Hi again.

for those that are interested

we bought a copy of redhat... we plugged our trusty sun bootable cdrom
into the chassis... and it didn't boot.

the first error was not being able to find the sun serial port which is
fair enough... E10k's don't have one.

so, we installed RH6.2 on an ultra5, then plugged the disk into the
E10k. Recompiled the kernel on the ultra5, then... well... you get the
picture.

As we started to hack kernel code... we noticed a general trend.

The stuff for linux to run on the E10k appears to be mostly all there.
the problem seems to be that code has been added to the source after
the starfire stuff was written, without checking whether it actually
breaks the starfire code... an example?

go to ./linux/arch/sparc64/mm/init.c

at about line 1343 (2.2.14)

#ifdef CONFIG_SUN_SERIAL
        /* This does not logically belong here, but is the first place
           we can initialize it at, so that we work in the PAGE_OFFSET+
           address space. */
        mempool = sun_serial_setup(mempool);
#endif

then... there is a bunch of starfire detection stuff.

the problem is this... if you define CONFIG_SUN_SERIAL as yes in the
config, it'll obviously include the above code. A starfire doesn't have
any serial ports (unless you have an io board or something), so it
dies... however, the detection stuff afterwards is there solely to tell
the kernel not to use the serial stuff, since starfires don't have any.
now an obvious solution would be to not define CONFIG_SUN_SERIAL. we
chose instead to move the above block of code after the starfire
detection stuff.
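
to make the ordering problem concrete, here is a rough sketch in plain
userspace C. it is not kernel code: struct machine, detect_starfire(),
serial_setup() and the two init_* functions are hypothetical names made
up for illustration, and the 0x100 the serial stub takes from the pool
is an arbitrary number. it only models the idea that the serial setup
can skip a starfire if, and only if, detection has already run.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one machine; none of this is a kernel symbol. */
struct machine {
    bool is_starfire;     /* what the hardware really is */
    bool starfire_known;  /* set once detection has run */
    bool serial_touched;  /* set if serial setup went for the port */
};

/* Stand-in for the starfire detection code. */
void detect_starfire(struct machine *m)
{
    m->starfire_known = m->is_starfire;
}

/* Stand-in for the serial setup: skips a machine already known to be
   a starfire, otherwise takes 0x100 (arbitrary) from the pool. */
unsigned long serial_setup(struct machine *m, unsigned long mempool)
{
    if (m->starfire_known)
        return mempool;
    m->serial_touched = true;  /* on a real starfire, this is the crash */
    return mempool + 0x100;
}

/* Order as found in the source: serial setup, then detection. */
unsigned long init_serial_first(struct machine *m, unsigned long mempool)
{
    mempool = serial_setup(m, mempool);
    detect_starfire(m);
    return mempool;
}

/* Order after moving the block: detection, then serial setup. */
unsigned long init_detect_first(struct machine *m, unsigned long mempool)
{
    detect_starfire(m);
    return serial_setup(m, mempool);
}

/* Small self-check contrasting the two orders on a starfire. */
void ordering_demo(void)
{
    struct machine a = { .is_starfire = true };
    struct machine b = { .is_starfire = true };

    init_serial_first(&a, 0x1000);
    init_detect_first(&b, 0x1000);

    assert(a.serial_touched);   /* detection ran too late to help */
    assert(!b.serial_touched);  /* detection ran first, port skipped */
}
```

on a non-starfire both orders behave the same in this model, which is
probably why nobody noticed.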

In a fair few of the source files the same problem exists: starfire
detection has been included, but other code has just been plonked in
that practically makes the starfire detection stuff useless. I don't
think this has been done on purpose; more likely, no one has starfires
sitting idle to test *every* code release on to see if it breaks them.

Starfires are VERY different machines to any other E class sun server.
They are the only machines that don't ship with a serial port as
default, among other things. (Starfires don't have monitors or
keyboards, which rules out most console options.) We thought at first
that RH6.2 wouldn't boot due to the kernel not having prom console
compiled in... but on later inspection... it did by default... just
everything is borked.

I don't know about 2.3.99 as we didn't have time today to keep hacking.
As it is, it appears our window for a 64 processor E10k will disappear
very shortly. but we still have 40 cpus to play with at the moment so...
we'll keep trying until we get sick of it and/or the project which we
originally bought these machines for actually gets nearer completion.

What needs to happen:

the sparc64 code needs to be 'cleaned up'. as I said, we haven't looked
at the 2.3.99 stuff, so this may very well already be the case. I'm
willing to guess that most of the problems would just require a cut and
paste... or a slight rewrite at worst. all the E10k code seems to be
there, it's just been nullified by some poorly placed ... bits.

perhaps someone could check out the version from when the starfire stuff
was originally put in and do a diff against the current version??

this may or may not be a real issue to anyone. To us... linux on a
starfire is a pure interest thing. I don't think anyone in their right
mind would actually put one into production. And with the starfire being
so different to any other E class machine, and there only being a
handful around the world that anyone would even let you go near with a
linux cd, it may simply be too difficult to keep track of the relevant
code within the source tree.

anyway... I'm not on this list... so any comments to me personally
please. any flames to /dev/null etc...

we know this has been done before... we have an email from someone in
germany saying as much... but that was in dec '98 so...

anyway... just for those who are interested.

Cheers
Craig

Unix Systems Administrator
Information Technology Services
University of Queensland
comments are my own and not of my employer



