Re: setup() and odd Syscalls in Ancient History

From: Linus Torvalds
Date: Mon Sep 21 2015 - 14:28:46 EST


On Mon, Sep 21, 2015 at 6:07 AM, Aleksa Sarai <cyphar@xxxxxxxxxx> wrote:
>
> I was wondering if you could explain *why* setup() was a syscall in
> early Linux? I understand that it did some ... odd things (one
> function both freeing the initial memory and setting up the
> filesystems, devices and mounting) which you obviously need to do in
> init. But from what I can see (after digging out v0.01 from the tomb),
> it was *never* used by userspace, which begs the question: why was it
> a syscall in the first place?

Heh. Interesting question, and I have to admit I went and looked at
the code to remind me what was going on.

It's not really obvious, because process separation and memory
management in very early Linux were based on segmentation. Yes, it used
paging too, but it originally used one single page table with 64
chunks of 64MB each (if I remember correctly), and then segments would
be used to make each process see a single 64MB slice of the 4GB
address space.
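
As a rough illustration of that layout, and assuming the 64 x 64MB
figures above are remembered correctly, the mapping from a process slot
to its slice of the linear address space can be sketched in C. The names
slice_base() and linear_address() are made up for this example and are
not taken from the kernel source.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed constants, taken from the figures quoted in the text. */
#define SLICE_SIZE UINT32_C(0x4000000) /* 64MB per process */
#define NR_SLICES  64u                 /* 64 * 64MB = 4GB */

/* Hypothetical helper: segment base for process slot nr. */
static uint32_t slice_base(unsigned nr)
{
    assert(nr < NR_SLICES);
    return (uint32_t)nr * SLICE_SIZE;
}

/*
 * Hypothetical helper: the linear address that the single shared
 * page table would see for an offset inside a process's segment.
 */
static uint32_t linear_address(unsigned nr, uint32_t offset)
{
    assert(offset < SLICE_SIZE); /* the segment limit would fault otherwise */
    return slice_base(nr) + offset;
}
```

Under these assumptions slot 0 starts at linear address 0, slot 1 at
0x4000000, and the last byte of slot 63 lands on 0xFFFFFFFF, which is
why 64 such slices exactly fill the 4GB space.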

So the code actually goes into user space, but the very *initial* user
space is actually shared with the kernel (until the first fork()). We
do the initial user mode transition by just switching to user
segments.

So in init/main.c, the magic is that

    move_to_user_mode();
    if (!fork()) {    /* we count on this going ok */
        init();
    }
    for(;;) pause();

where that "move_to_user_mode()" will reload all the segments (some by
hand, but CS/SS by doing an "iret"). So that first fork() will
actually be done in user space, and before that happens the kernel
cannot sleep (because there is no idle task).

That "for (;;) pause()" after the fork() is the idle task, which
allows the "init()" code to sleep.

So "setup()" is a system call because it needs to sleep (to do the
IO), and the kernel couldn't sleep before it got to that user-mode and
first fork thing.

Could it have been done differently? Sure. Obviously we don't do it
that way any more, and we create the idle tasks separately and not
with "fork()" any more. But it kind of made sense at the time.

Linus