[PATCH][RFC] x86 multiple user-mode privilege rings

From: Luca Barbieri (ldb@ldb.ods.org)
Date: Sat Oct 26 2002 - 21:48:13 EST


Short explanation:
This patch implements a feature called "x86 multiring", which is a
shorthand for x86 multiple user-mode privilege rings support.
It allows user-mode programs to create DPL 1 and 2 segments and get a
modifiable per-process copy of the IDT.

User Mode Linux can use these features to implement a syscall mechanism
identical to the one used by the kernel-mode kernel, and thus much
faster than the current one, with free memory protection and with zero
context switches.

Wine could also use it to achieve fast syscall-level emulation of
Windows NT (and, to a lesser extent, Windows 3.1 and 9x).

Obviously there is some risk of the patch creating security holes.

System calls:
All operations are performed using the new sys_multiring syscall. The
API is documented in include/asm-i386/multiring.h, which multiring
applications should include.

Supervisor problems:
The most serious issue caused by the use of ring 1 and 2 is that they
are intended for kernel code, which means that they count as supervisor
wherever a "user/supervisor" bit is present.

This results in:
- Unavailability of multiring on 386 processors, since they don't
support supervisor WP.

- Page protection no longer working
While this may seem catastrophic, it isn't, because segment-level
protection can be used instead.
To enforce it, the patch modifies the GDT so that the default CS and DS
have a limit of __PAGE_OFFSET - 1.
The LDT and TLS interfaces are also changed to alter segment limits to
avoid overlap with the kernel area. If this is impossible, multiring
mode is inhibited using the bad_segments field of mm_context_t; if the
process is already in multiring mode, the operation fails.
With this scheme, vsyscalls will need to be placed before the rest of
the kernel.

- The supervisor bit in the page-fault error code is not reliable
(regs->xcs is used instead; the only problem is not being able to tell
f00f invalid opcodes from page faults)

- Potential minor problems for profilers since to get DPL 1/2 events,
DPL 0 events also have to be enabled.

Based on my reading of Intel manuals, there should not be any other
problem, but I might have missed something.
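
To make the segment-level protection above concrete, here is a minimal
sketch of the limit arithmetic, not the code from the patch. It assumes
expand-up segments and a 3G/1G split (EXAMPLE_PAGE_OFFSET and all the
function names are made up for illustration).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for illustration: the common 3G/1G split on i386. */
#define EXAMPLE_PAGE_OFFSET 0xC0000000u

/* Last linear address covered by an expand-up segment.
 * limit is the raw 20-bit limit field; with page granularity the CPU
 * scales it by 4 KB and fills the low 12 bits with ones. */
static inline uint64_t segment_last_byte(uint32_t base, uint32_t limit,
                                         int limit_in_pages)
{
        uint64_t span;

        assert(limit <= 0xfffffu);
        span = limit_in_pages ? (((uint64_t)limit << 12) | 0xfffu) : limit;
        return (uint64_t)base + span;
}

/* Nonzero if the segment reaches page_offset or beyond (a segment that
 * would run past 4 GB is conservatively treated as overlapping). */
static inline int segment_overlaps_kernel(uint32_t base, uint32_t limit,
                                          int limit_in_pages,
                                          uint32_t page_offset)
{
        return segment_last_byte(base, limit, limit_in_pages) >= page_offset;
}

/* Largest page-granular limit field that keeps the segment entirely
 * below page_offset, or -1 if not even one page fits. */
static inline long max_limit_in_pages(uint32_t base, uint32_t page_offset)
{
        if (base >= page_offset || page_offset - base < 0x1000u)
                return -1;
        return (long)((page_offset - base) >> 12) - 1;
}
```

With these assumptions, a flat segment gets the limit field 0xbffff,
which ends exactly at __PAGE_OFFSET - 1. The 0xaffff limit used by the
test program stays below the kernel area, while the 0xfffff limit of its
EXPLOIT_LDT case does not.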

IDT functionality:
When multiring mode is entered, the default IDT is copied to a newly
allocated page and a pointer to the new one is stored in mm_context_t.
The initial multiring IDT is identical to the default one, with the
exception of the SYSCALL_VECTOR DPL, which is set to 1.
The code includes a config option to put IDTs in high memory, which is,
however, untested and not very useful anyway.
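
For readers not familiar with the gate layout, the following sketch
shows where the DPL lives in a 32-bit trap gate, using the same two-word
encoding as the idt_gate macro in the test program. The struct and
function names are illustrative only and are not the ones used by the
patch.

```c
#include <assert.h>
#include <stdint.h>

/* Two 32-bit words of an IDT gate, low word first. */
struct example_gate {
        uint32_t a;
        uint32_t b;
};

/* Build a present 32-bit trap gate: 0x8000 is the present bit,
 * 0x0f00 is the gate type, and the DPL sits in bits 13-14. */
static inline struct example_gate make_trap_gate(uint16_t sel, uint32_t addr,
                                                 unsigned dpl)
{
        struct example_gate g;

        assert(dpl <= 3);
        g.a = ((uint32_t)sel << 16) | (addr & 0xffffu);
        g.b = (addr & 0xffff0000u) | 0x8f00u | (dpl << 13);
        return g;
}

static inline unsigned gate_dpl(struct example_gate g)
{
        return (g.b >> 13) & 3u;
}

/* Same gate with only the DPL changed, e.g. 3 -> 1 for the syscall vector. */
static inline struct example_gate gate_with_dpl(struct example_gate g,
                                                unsigned dpl)
{
        assert(dpl <= 3);
        g.b = (g.b & ~(3u << 13)) | (dpl << 13);
        return g;
}

static inline uint32_t gate_offset(struct example_gate g)
{
        return (g.a & 0xffffu) | (g.b & 0xffff0000u);
}
```

Changing the DPL leaves the selector and the handler offset untouched,
which is all that is needed to turn a copy of the default syscall gate
into one that only ring 1 (and ring 0) may invoke with int.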

IDTs are loaded in two different ways: if CONFIG_X86_HIGHIDT is set or
the processor is f00f-buggy, each CPU gets a fixmap entry that is
remapped to load another IDT; otherwise, a simple lidt instruction is
used.

sys_multiring allows reading, copying and setting gates in the IDT.
The vectors that are currently settable are 0x20-0x2f (because DOS and
Windows use them), 0x80 and 0xf1-0xfa.
Set operations will fail if the user tries to install a task or
interrupt gate, or a gate pointing to a kernel mode address other than
the syscall entry point.
The i8259 is remapped to 0x30-0x3f to accommodate this.
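
The settable ranges above amount to a simple predicate. This is only a
sketch of the rule as stated in this mail; the function names are
invented and the patch itself uses lookup tables.

```c
#include <assert.h>

/* Nonzero if a multiring process may set this IDT vector, following the
 * ranges listed in the text: 0x20-0x2f, 0x80 and 0xf1-0xfa. */
static inline int vector_is_settable(unsigned vec)
{
        assert(vec <= 0xffu);   /* the IDT has 256 entries */
        return (vec >= 0x20u && vec <= 0x2fu) ||
               vec == 0x80u ||
               (vec >= 0xf1u && vec <= 0xfau);
}

/* Number of vectors user code can own in total. */
static inline unsigned count_settable_vectors(void)
{
        unsigned vec, n = 0;

        for (vec = 0; vec <= 0xffu; vec++)
                n += vector_is_settable(vec) ? 1u : 0u;
        return n;
}
```

That gives 16 + 1 + 10 = 27 vectors, and the former PIC range 0x30-0x3f
is not among them.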

The multiring_mode field is added to the thread structure; it is 1 if
the thread has entered multiring mode (i.e. its selectors have been
changed to RPL 1).

GDT functionality:
When switching to a multiring mm, the DPL in the default user CS and DS
is set to 1, to prevent rings 2 and 3 from loading them and thus
bypassing any security that the DPL 1 code might be enforcing.
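
The reason DPL 1 is enough is the usual x86 privilege check for loading
a data segment register (DS, ES, FS, GS): the load is allowed only if
the descriptor DPL is numerically at least max(CPL, RPL). A small sketch
of that rule follows; the function name is mine, and the stricter rules
for SS and for code segment transfers are deliberately left out.

```c
#include <assert.h>

/* Privilege check the CPU applies when a non-conforming data segment is
 * loaded into DS/ES/FS/GS: max(CPL, RPL) must not exceed the DPL. */
static inline int can_load_data_segment(unsigned cpl, unsigned rpl,
                                        unsigned dpl)
{
        unsigned effective;

        assert(cpl <= 3 && rpl <= 3 && dpl <= 3);
        effective = cpl > rpl ? cpl : rpl;
        return effective <= dpl;
}
```

With the default DS at DPL 1, ring 1 code can load it, while ring 2 and
ring 3 code cannot, whatever RPL they put in the selector.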

LDT/TLS functionality:
When in multiring mode, LDT/TLS functions honor the new dpl field in
struct user_desc, which allows setting a custom DPL in the
descriptors.
They are also changed to support segment-level protection as outlined
above.
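
As an illustration of the dpl handling, the sketch below patches the DPL
bits of the high descriptor word the same way write_ldt() does in the
patch, and builds LDT selectors like the ldt_sel macro in the test
program. The helper names are made up, and the dpl == 0 rejection mirrors
the -EINVAL path of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* 0 if a multiring process may request this DPL, -EINVAL otherwise
 * (ring 0 descriptors are never handed out). */
static inline int check_requested_dpl(unsigned dpl)
{
        assert(dpl <= 3);
        return dpl == 0 ? -EINVAL : 0;
}

/* Replace the DPL (bits 13-14) in the high word of a segment descriptor. */
static inline uint32_t desc_with_dpl(uint32_t entry_b, unsigned dpl)
{
        assert(dpl <= 3);
        return (entry_b & ~(3u << 13)) | (dpl << 13);
}

/* Selector for LDT entry 'index' with the given RPL: the index goes in
 * bits 3-15, bit 2 selects the LDT, bits 0-1 hold the RPL. */
static inline unsigned ldt_selector(unsigned index, unsigned rpl)
{
        assert(rpl <= 3);
        return (index << 3) | (1u << 2) | rpl;
}
```

For example, a ring 1 monitor that wants a data segment for ring 2 code
would ask for dpl 2 and then use ldt_selector(1, 2), i.e. 0xe, for LDT
entry 1.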

TSS functionality:
sys_multiring allows modifying the ring 1 and 2 TSS ESP and SS values
that are loaded on inter-privilege calls.
The values are kept in the thread structure and are loaded on task
switch.
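
For reference, these are the TSS slots involved. The sketch only models
the first 28 bytes of the 32-bit hardware TSS; the struct and helper
names are illustrative and not the kernel's. The RPL check reflects the
CPU requirement that the new SS has an RPL equal to the ring being
entered; I am not claiming the patch performs this exact check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* First fields of the 32-bit hardware TSS. */
struct example_tss_head {
        uint16_t back_link, reserved0;
        uint32_t esp0;
        uint16_t ss0, reserved1;
        uint32_t esp1;
        uint16_t ss1, reserved2;
        uint32_t esp2;
        uint16_t ss2, reserved3;
};

/* Only rings 1 and 2 are user-settable, and the stack selector must
 * carry an RPL equal to the ring that will run on that stack. */
static inline int ring_stack_is_valid(unsigned ring, uint16_t ss)
{
        return (ring == 1 || ring == 2) && (ss & 3u) == ring;
}

/* Store the stack the CPU switches to when entering 'ring' from a less
 * privileged ring. Returns 0 on success, -1 on a bad ring/selector. */
static inline int set_ring_stack(struct example_tss_head *tss, unsigned ring,
                                 uint32_t esp, uint16_t ss)
{
        assert(tss != NULL);
        if (!ring_stack_is_valid(ring, ss))
                return -1;
        if (ring == 1) {
                tss->esp1 = esp;
                tss->ss1 = ss;
        } else {
                tss->esp2 = esp;
                tss->ss2 = ss;
        }
        return 0;
}
```

The esp0/ss0 pair stays under kernel control; only the ring 1 and ring 2
pairs would ever be written on behalf of user code.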

clone functionality:
The CLONE_IDT flag is added; it makes the child share the parent's IDT
instead of getting its own copy. Note that if the task is not in
multiring mode, it is silently ignored.
The CLONE_CLEAR_IDT flag is also added; the child then starts without a
multiring IDT. It takes precedence over CLONE_IDT.
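
Read together with copy_idt() and init_new_context_flags() in the patch,
the flag handling can be summarized by the sketch below. The EX_ names
are made up; the CLONE_IDT value is the one defined in the test program,
while the CLONE_CLEAR_IDT value is purely hypothetical.

```c
#define EX_CLONE_VM        0x00000100u
#define EX_CLONE_IDT       0x00800000u /* value used by the test program */
#define EX_CLONE_CLEAR_IDT 0x01000000u /* hypothetical value */

enum idt_action {
        IDT_NONE,  /* child has no multiring IDT */
        IDT_SHARE, /* child references the parent's IDT (refcount bump) */
        IDT_COPY   /* child gets a private copy of the parent's IDT */
};

/* What happens to the IDT when a task is cloned with 'flags'. */
static inline enum idt_action idt_action_on_clone(int parent_has_idt,
                                                  unsigned flags)
{
        if (!parent_has_idt)
                return IDT_NONE;   /* not in multiring mode: flags ignored */
        if (flags & EX_CLONE_CLEAR_IDT)
                return IDT_NONE;   /* takes precedence over CLONE_IDT */
        if (flags & (EX_CLONE_VM | EX_CLONE_IDT))
                return IDT_SHARE;
        return IDT_COPY;           /* plain fork() */
}
```

This matches what the test program expects: a gate set by a fork() child
is not visible in the parent, while one set by a CLONE_VM or CLONE_IDT
child is.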

Entering multiring mode:
Multiring mode can be entered using sys_multiring(MULTIRING_ELEVATE).
This will allocate a new IDT, fix the GDT and put RPL 1 selectors in
cs/ds/es/ss/fs/gs. Note that all other threads will also get RPL 1
selectors.
RPL 1 selectors are loaded only if the selector points to the default CS
or DS.
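
The selector rewriting follows multiring_fix_selector() in the patch and
can be sketched in user space as below. The selector values 0x73 and
0x7b for the default user CS and DS are an assumption for illustration;
the real values come from include/asm-i386/segment.h.

```c
#include <assert.h>

/* Assumed default user selectors at RPL 3 (illustrative values). */
#define EX_USER_CS_RPL3 0x73u
#define EX_USER_DS_RPL3 0x7bu
#define EX_MULTIRING_USER_RING 1u

/* Same descriptor, different requested privilege level. */
static inline unsigned selector_with_rpl(unsigned sel, unsigned rpl)
{
        assert(rpl <= 3);
        return (sel & ~3u) | rpl;
}

/* Rewrite a selector to RPL 1 only if it refers to the default user CS
 * or DS; LDT and other selectors are returned unchanged. */
static inline unsigned fix_selector(unsigned sel)
{
        unsigned tsel = sel | 3u;

        if (tsel == EX_USER_DS_RPL3)
                return selector_with_rpl(EX_USER_DS_RPL3,
                                         EX_MULTIRING_USER_RING);
        if (tsel == EX_USER_CS_RPL3)
                return selector_with_rpl(EX_USER_CS_RPL3,
                                         EX_MULTIRING_USER_RING);
        return sel;
}
```

Under these assumed values, 0x7b becomes 0x79 and 0x73 becomes 0x71,
while an LDT selector such as 0x0f is left alone.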

Limitations:
- Since there are only 4 privilege rings, only up to 2 UMLs can be nested
- Doesn't work on 386

Potential improvements:
- Copy on write IDTs
- Reduction of IDT allocation size from 4 KB to 2 KB
- x86-64?

Diffstat:
 arch/i386/Config.help | 7
 arch/i386/config.in | 1
 arch/i386/kernel/Makefile | 2
 arch/i386/kernel/cpu/common.c | 9
 arch/i386/kernel/cpu/intel.c | 8
 arch/i386/kernel/entry.S | 1
 arch/i386/kernel/i8259.c | 6
 arch/i386/kernel/ldt.c | 88 +++++++++
 arch/i386/kernel/multiring.c | 316 +++++++++++++++++++++++++++++++++++
 arch/i386/kernel/process.c | 52 ++++-
 arch/i386/kernel/ptrace.c | 8
 arch/i386/kernel/signal.c | 24 +-
 arch/i386/kernel/traps.c | 44 +++-
 arch/i386/mach-generic/irq_vectors.h | 24 +-
 arch/i386/mach-visws/irq_vectors.h | 22 +-
 arch/i386/math-emu/fpu_entry.c | 2
 arch/i386/mm/fault.c | 5
 fs/exec.c | 4
 include/asm-i386/desc.h | 48 +++++
 include/asm-i386/fixmap.h | 14 +
 include/asm-i386/idt.h | 236 ++++++++++++++++++++++++++
 include/asm-i386/ldt.h | 3
 include/asm-i386/mmu.h | 10 +
 include/asm-i386/mmu_context.h | 8
 include/asm-i386/multiring.h | 194 +++++++++++++++++++++
 include/asm-i386/processor.h | 26 +-
 include/asm-i386/segment.h | 12 +
 include/asm-i386/system.h | 6
 include/asm-i386/unistd.h | 1
 include/linux/sched.h | 8
 kernel/fork.c | 6
 31 files changed, 1103 insertions(+), 92 deletions(-)

Test program (apologies for the horrible coding style):

/*
  multiring-test.c: example program for Linux multiring support
  
  Copyright (C) 2002 Luca Barbieri <ldb@ldb.ods.org>

  This program is free software; you can redistribute it and/or modify it under
  the terms of the GNU General Public License as published by the Free
  Software Foundation; either version 2, or (at your option) any later
  version.

  This program is distributed in the hope that it will be useful, but WITHOUT ANY
  WARRANTY; without even the implied warranty of MERCHANTABILITY or
  FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
  for more details.

  You should have received a copy of the GNU General Public License
  along with this program; see the file COPYING. If not, write to the Free
  Software Foundation, 59 Temple Place - Suite 330, Boston, MA
  02111-1307, USA.
*/

#include <errno.h>
#include "/home/ldb/src/linux-2.5.44_multiring/include/asm-i386/unistd.h"
#include "/home/ldb/src/linux-2.5.44_multiring/include/asm-i386/multiring.h"
#include "/home/ldb/src/linux-2.5.44_multiring/include/asm-i386/ldt.h"
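#include <signal.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>   /* strlen, memset */
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h> /* waitpid */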
#define CLONE_IDT 0x00800000

unsigned ring;
unsigned xcs;

typedef char intbuf_t[4];

intbuf_t test_entry_intbuf;
typedef int (*test_entry_int_t)(unsigned param) __attribute__((regparm(3)));
#define test_entry_int ((test_entry_int_t)test_entry_intbuf)

intbuf_t kernel_syscall_intbuf;

char lower_ring_msg[] = "from lower ring! [should be XXXm...]\n";

void make_int(intbuf_t p, unsigned vec)
{
        p[0] = 0xcd;
        p[1] = vec;
        p[2] = 0xc3;
}

void test_entry();
void test_entry2();
void syscall_entry();

int kernel_syscall(unsigned num, unsigned arg1, unsigned arg2, unsigned arg3)
{
        unsigned ret;
        __asm__ __volatile__ (
        "pushl %%ebx\n\t"
        "movl %2, %%ebx\n\t"
        "call kernel_syscall_intbuf\n\t"
        "popl %%ebx" : "=a" (ret) : "0" (num), "r" (arg1), "c" (arg2), "d" (arg3));
        return ret;
}

int do_test_entry(unsigned param)
{
        printf("test_entry called with: %u\n", param);
        if(param == 99)
        {
                unsigned vec;
                int ret;
                int testvec;
                vec = multiring_copy_free(0x80);
                printf("copy_free(0x80) ret: %i errno: %i\n", vec, errno);

                printf("alloc_all ");
                for(;;)
                {
                        ret = multiring_copy_free(0x80);
                        if(ret >= 0)
                                printf("%x ", testvec = ret);
                        else
                        {
                                printf("\nret: %i errno: %i [should be -1/28]\n", ret, errno);
                                break;
                        }
                }
                ret = multiring_free(testvec);
                printf("free testvec: %i ret: %i errno: %i\n", testvec, ret, errno);
                ret = multiring_copy_free(0x80);
                printf("copy_free ret: %i errno: %i\n", ret, errno);
                
                printf("about to multiring_set %x %x\n", (xcs << 16) | ((unsigned long)syscall_entry & 0xffff), ((unsigned long)syscall_entry & 0xffff0000) | 0x8f00 | ((ring + 1) << 13));
                multiring_set(0x80, (xcs << 16) | ((unsigned long)syscall_entry & 0xffff), ((unsigned long)syscall_entry & 0xffff0000) | 0x8f00 | ((ring + 1) << 13));
                make_int(kernel_syscall_intbuf, vec);
        }
        return param * 2;
}

int do_syscall_entry(unsigned num, unsigned arg1, unsigned arg2, unsigned arg3)
{
        if((num == __NR_write) && (arg1 == 1))
        {
                char* ptr = (char*)arg2;
                ptr[0] = 'X';
                ptr[1] = 'X';
                ptr[2] = 'X';
                return kernel_syscall(num, 1, arg2, arg3);
        }
        else if(num == __NR_exit)
        {
                kernel_syscall(__NR_write, 1, (unsigned long)"quitting\n", strlen("quitting\n"));
                return kernel_syscall(__NR_exit, 0, 0, 0);
        }
        else
        {
                kernel_syscall(__NR_write, 1, (unsigned long)"syscall not allowed\n", strlen("syscall not allowed\n"));
                return -ENOSYS;
        }
}

asm("
test_entry:
pushl %eax
call do_test_entry
addl $4, %esp
iret

test_entry2:
cld
pushl %es
pushl %ds
pushl %eax
movw %ss, %ax
movw %ax, %ds
movw %ax, %es
call do_test_entry
addl $4, %esp
popl %ds
popl %es
iret

syscall_entry:
cld
pushl %es
pushl %ds
pushl %ebp
pushl %edi
pushl %esi
pushl %edx
pushl %ecx
pushl %ebx
pushl %eax
movw %ss, %ax
movw %ax, %ds
movw %ax, %es

call do_syscall_entry

addl $4, %esp
popl %ebx
popl %ecx
popl %edx
popl %esi
popl %edi
popl %ebp
popl %ds
popl %es
iret

");

char* entry_stack;
char* lower_stack;
char* clone_stack;

#define ldt_sel(num, ring) (((num) << 3) | (1 << 2) | (ring))

static void interprivilege_jump(unsigned ss, unsigned esp, unsigned cs, unsigned eip) __attribute__((noreturn));

static void interprivilege_jump(unsigned ss, unsigned esp, unsigned cs, unsigned eip)
{
        __asm__ __volatile__(
        "movl %0, %%ds\n\t"
        "movl %0, %%es\n\t"
        "pushl %0\n\t"
        "pushl %1\n\t"
        "pushfl\n\t"
        "pushl %2\n\t"
        "pushl %3\n\t"
        "iret" : : "r" (ss), "r" (esp), "r" (cs), "r" (eip));
        abort();
}

void lower_ring_start()
{
        test_entry_int(10);
#ifdef CRASH_INT
        /* int 0x80 has DPL 1, so this should crash */
        write(1, lower_ring_msg, strlen(lower_ring_msg));
#endif
        test_entry_int(99);
        write(1, lower_ring_msg, strlen(lower_ring_msg));
#ifdef CRASH_SEGV
        *(unsigned long*)0 = 0;
#endif
        _exit(0);
}

/* this should reboot the processor if we have access to kernel mode memory */
void triple_fault(void)
{
        struct
        {
                unsigned short a;
                struct
                {
                        unsigned short lim;
                        unsigned long addr;
                } m48;
        } desc;
        /* find the IDT address */
        __asm__ __volatile__("sidt %0" : "=m" (desc.m48));

        /* clear the IDT (or crash, if the kernel works properly) */
        memset((void*)desc.m48.addr, 0, desc.m48.lim + 1);

        /* triple fault */
        *(unsigned long*)0 = 0;
}

#define idt_gate(sel, addr, ring) (((sel) << 16) | ((unsigned long)addr & 0xffff)), (((unsigned long)addr & 0xffff0000) | 0x8f00 | ((ring) << 13))
static inline pid_t syscall_clone(unsigned long flags)
{
        pid_t pid;
        asm volatile("pushl %%ebx\n\tmovl %1, %%ebx\n\tmovl %%esp, %%ecx\n\tint $0x80\n\tpopl %%ebx" : "=a" (pid) : "r" (flags | SIGCHLD), "0" (__NR_clone) : "edx", "memory");
        return pid;
}

unsigned test_vec;

void clone_vm_child(void)
{
        int ret;
        unsigned ccs;
        unsigned css;
        unsigned cring;
        struct multiring_gate gate;
        ret = multiring_check();
        printf("vm check ret: %i errno: %i\n", ret, errno);
        
        __asm__ __volatile__("movl %%cs, %0" : "=r" (ccs));
        __asm__ __volatile__("movl %%ss, %0" : "=r" (css));
        printf("vm cs: %x ss: %x\n", ccs, css);
        cring = ccs & 3;

        ret = multiring_set(test_vec, idt_gate(ccs, 0x33333333, cring));
        printf("vm set 33333333 ret: %i errno: %i\n", ret, errno);
        ret = multiring_get(test_vec, &gate);
        printf("vm get ret: %i errno: %i a: %x b: %x\n", ret, errno, gate.a, gate.b);
                
        _exit(0);
}

int main(int argc, char** argv)
{
        int ret;
        unsigned xss;
        unsigned test_entry_vec;
        struct multiring_gate gate;
        pid_t pid;
        struct user_desc ldt;
        ret = multiring_elevate();
        printf("elevate ret: %i errno: %i\n", ret, errno);
#ifdef ELEVATE
        return 0;
#endif
        __asm__ __volatile__("movl %%cs, %0" : "=r" (xcs));

        ret = multiring_check();
        printf("check ret: %i errno: %i\n", ret, errno);
        
        printf("cs: %x\n", xcs);
        __asm__ __volatile__("movl %%ss, %0" : "=r" (xss));
        printf("ss: %x\n", xss);
        ring = xcs & 3;
        if(ring == 3)
        {
                printf("we are still at ring 3 :( - exiting\n");
                return 0;
        }

        test_vec = multiring_set_free(idt_gate(xcs, 0x11111111, ring));
        printf("set_free 11111111 ret: %i errno: %i\n", test_vec, errno);
        ret = multiring_get(test_vec, &gate);
        printf("get ret: %i errno: %i a: %x b: %x\n", ret, errno, gate.a, gate.b);
        pid = fork();
        printf("fork pid: %i errno: %i\n", pid, errno);
        if(!pid)
        {
                //for(;;) {}
                ret = multiring_check();
                printf("fork check ret: %i errno: %i\n", ret, errno);

                __asm__ __volatile__("movl %%cs, %0" : "=r" (xcs));
                __asm__ __volatile__("movl %%ss, %0" : "=r" (xss));
                printf("fork cs: %x ss: %x\n", xcs, xss);
                ring = xcs & 3;

                ret = multiring_set(test_vec, idt_gate(xcs, 0x22222222, ring));
                printf("fork set 22222222 ret: %i errno: %i\n", ret, errno);
                ret = multiring_get(test_vec, &gate);
                printf("fork get ret: %i errno: %i a: %x b: %x\n", ret, errno, gate.a, gate.b);
                
                _exit(0);
        }
        else
        {
                int status;
                waitpid(pid, &status, 0);
                printf("status: %x\n", status);
        }
        
        ret = multiring_get(test_vec, &gate);
        printf("get ret: %i errno: %i a: %x b: %x [should be desc for 11111111]\n", ret, errno, gate.a, gate.b);

        pid = clone(clone_vm_child, (char*)malloc(65536) + 65536, CLONE_VM|SIGCHLD, 0);
        printf("CLONE_VM pid: %i errno: %i\n", pid, errno);
        {
                int status;
                waitpid(pid, &status, 0);
                printf("status: %x\n", status);
        }
        
        ret = multiring_get(test_vec, &gate);
        printf("get ret: %i errno: %i a: %x b: %x [should be desc for 33333333]\n", ret, errno, gate.a, gate.b);

        pid = syscall_clone(CLONE_IDT);
        printf("CLONE_IDT pid: %i errno: %i\n", pid, errno);
        if(!pid)
        {
                //for(;;) {}
                ret = multiring_check();
                printf("idt check ret: %i errno: %i\n", ret, errno);
        
                __asm__ __volatile__("movl %%cs, %0" : "=r" (xcs));
                __asm__ __volatile__("movl %%ss, %0" : "=r" (xss));
                printf("idt cs: %x ss: %x\n", xcs, xss);
                ring = xcs & 3;

                ret = multiring_set(test_vec, idt_gate(xcs, 0x44444444, ring));
                printf("idt set 44444444 ret: %i errno: %i\n", ret, errno);
                ret = multiring_get(test_vec, &gate);
                printf("idt get ret: %i errno: %i a: %x b: %x\n", ret, errno, gate.a, gate.b);
                
                _exit(0);
        }
        else
        {
                int status;
                waitpid(pid, &status, 0);
                printf("status: %x\n", status);
        }

        ret = multiring_get(test_vec, &gate);
        printf("get ret: %i errno: %i a: %x b: %x [should be desc for 44444444]\n", ret, errno, gate.a, gate.b);

#ifdef DO_EXEC
        execlp("ls", "ls", (char*)0);
#endif
        
#ifdef EXPLOIT_GDT
        printf("about to exploit with sel = %x\n", xss);
        triple_fault();
#endif

        ret = multiring_set_free(idt_gate(xcs, 0xeeeeeeee, 0));
        printf("multiring_set_free(gate_to_kernel) ret: %i errno: %i\n", ret, errno);

#ifdef EXPLOIT_IDT
        test_entry_vec = multiring_set_free(idt_gate(xcs, 0xeeeeeeee, 1));
        printf("set_free(to_karea) ret: %i errno: %i\n", test_entry_vec, errno);
#else
        test_entry_vec = multiring_set_free(idt_gate(xcs, test_entry, ring));
        printf("set_free ret: %i errno: %i\n", test_entry_vec, errno);
#endif

        make_int(test_entry_intbuf, test_entry_vec);
        ret = test_entry_int(42);
        printf("test_entry_int: %u\n", ret);

        /* we could use the initial stack, but it's more difficult */
        entry_stack = malloc(65536);

        ret = multiring_set_espss(ring, (unsigned long)(entry_stack + 65536), xss);
        printf("set_espss ret: %i errno %i esp: %x ss: %x\n", ret, errno, (unsigned long)(entry_stack + 65536), xss);
        {
                unsigned tesp, tss;
                ret = multiring_get_espss(ring, &tesp, &tss);
                printf("get_espss ret: %i errno %i esp: %x ss: %x\n", ret, errno, tesp, tss);
        }
        ret = multiring_set(test_entry_vec, idt_gate(xcs, test_entry2, ring + 1));
        printf("multiring_set(test_entry_vec) ret: %i errno: %i\n", ret, errno);

        /* addresses should not be hardcoded in real programs */
        
        ldt.entry_number = 1;
        ldt.base_addr = 0;
        ldt.limit = 0xaffff;
        ldt.seg_32bit = 1;
        ldt.contents = MODIFY_LDT_CONTENTS_DATA;
        ldt.read_exec_only = 0;
        ldt.limit_in_pages = 1;
        ldt.seg_not_present = 0;
        ldt.useable = 0;
#ifdef EXPLOIT_LDT
        ldt.dpl = ring;
        ldt.limit = 0xfffff;
        ret = modify_ldt(0x11, &ldt, sizeof(struct user_desc));
        printf("modify_ldt(exploit) ret: %i errno: %i\n", ret, errno);

        __asm__ __volatile__("movl %0, %%ds\n\tmovl %0, %%es" : : "r" (ldt_sel(1, ring)));
        printf("about to exploit with sel = %x\n", ldt_sel(1, ring));

        triple_fault();
#endif
        ldt.dpl = ring + 1;
        ret = modify_ldt(0x11, &ldt, sizeof(struct user_desc));
        printf("modify_ldt(data) ret: %i errno: %i\n", ret, errno);

        ldt.entry_number = 2;
        ldt.contents = MODIFY_LDT_CONTENTS_CODE;
        ret = modify_ldt(0x11, &ldt, sizeof(struct user_desc));
        printf("modify_ldt(code) ret: %i errno: %i\n", ret, errno);

        lower_stack = malloc(65536);

        interprivilege_jump(ldt_sel(1, ring + 1), (unsigned long)(lower_stack + 65536), ldt_sel(2, ring + 1), (unsigned long)lower_ring_start);
        return 0;
}

Patch:

diff --exclude-from=/home/ldb/src/linux-exclude -urNdp linux-2.5.44/arch/i386/Config.help linux-2.5.44_multiring/arch/i386/Config.help
--- linux-2.5.44/arch/i386/Config.help 2002-10-27 02:38:39.000000000 +0100
+++ linux-2.5.44_multiring/arch/i386/Config.help 2002-10-27 00:44:02.000000000 +0200
@@ -165,6 +165,13 @@ CONFIG_HIGHPTE
   low memory. Setting this option will put user-space page table
   entries in high memory.
 
+CONFIG_X86_HIGHIDT
+ The kernel uses 4KB for each multiring process (e.g. User Mode Linux).
+ Say Y to allocate those 4KB in high memory. This is only useful if you
+ plan to run hundreds of thousands of multiring processes.
+
+ If unsure, say N.
+
 CONFIG_HIGHMEM4G
   Select this if you have a 32-bit processor and between 1 and 4
   gigabytes of physical RAM.
diff --exclude-from=/home/ldb/src/linux-exclude -urNdp linux-2.5.44/arch/i386/config.in linux-2.5.44_multiring/arch/i386/config.in
--- linux-2.5.44/arch/i386/config.in 2002-10-27 02:38:39.000000000 +0100
+++ linux-2.5.44_multiring/arch/i386/config.in 2002-10-27 00:44:02.000000000 +0200
@@ -236,6 +236,7 @@ fi
 
 if [ "$CONFIG_HIGHMEM4G" = "y" -o "$CONFIG_HIGHMEM64G" = "y" ]; then
    bool 'Allocate 3rd-level pagetables from highmem' CONFIG_HIGHPTE
+ bool 'Allocate multiring IDTs from highmem (EXPERIMENTAL)' CONFIG_X86_HIGHIDT
 fi
 
 bool 'Math emulation' CONFIG_MATH_EMULATION
diff --exclude-from=/home/ldb/src/linux-exclude -urNdp linux-2.5.44/arch/i386/kernel/cpu/common.c linux-2.5.44_multiring/arch/i386/kernel/cpu/common.c
--- linux-2.5.44/arch/i386/kernel/cpu/common.c 2002-10-27 02:38:39.000000000 +0100
+++ linux-2.5.44_multiring/arch/i386/kernel/cpu/common.c 2002-10-27 00:44:02.000000000 +0200
@@ -243,6 +243,8 @@ void __init generic_identify(struct cpui
         }
 }
 
+extern void load_idt_table_init(unsigned cpu, pgprot_t prot);
+
 /*
  * This does the hard work of actually picking apart the CPU stuff...
  */
@@ -257,6 +259,7 @@ void __init identify_cpu(struct cpuinfo_
         c->x86_model = c->x86_mask = 0; /* So far unknown... */
         c->x86_vendor_id[0] = '\0'; /* Unset */
         c->x86_model_id[0] = '\0'; /* Unset */
+ c->f00f_bug = 0;
         memset(&c->x86_capability, 0, sizeof c->x86_capability);
 
         if (!have_cpuid_p()) {
@@ -348,6 +351,9 @@ void __init identify_cpu(struct cpuinfo_
                 /* AND the already accumulated flags with these */
                 for ( i = 0 ; i < NCAPINTS ; i++ )
                         boot_cpu_data.x86_capability[i] &= c->x86_capability[i];
+
+ if(c->f00f_bug)
+ boot_cpu_data.f00f_bug = 1;
         }
 
         printk(KERN_DEBUG "CPU: Common caps: %08lx %08lx %08lx %08lx\n",
@@ -355,6 +361,9 @@ void __init identify_cpu(struct cpuinfo_
                boot_cpu_data.x86_capability[1],
                boot_cpu_data.x86_capability[2],
                boot_cpu_data.x86_capability[3]);
+
+ /* load the IDT */
+ load_idt_table_init((c == &boot_cpu_data) ? 0 : (c - cpu_data), boot_cpu_data.f00f_bug ? PAGE_KERNEL_RO : PAGE_KERNEL);
 }
 /*
  * Perform early boot up checks for a valid TSC. See arch/i386/kernel/time.c
diff --exclude-from=/home/ldb/src/linux-exclude -urNdp linux-2.5.44/arch/i386/kernel/cpu/intel.c linux-2.5.44_multiring/arch/i386/kernel/cpu/intel.c
--- linux-2.5.44/arch/i386/kernel/cpu/intel.c 2002-10-27 02:38:39.000000000 +0100
+++ linux-2.5.44_multiring/arch/i386/kernel/cpu/intel.c 2002-10-27 00:44:02.000000000 +0200
@@ -169,15 +169,13 @@ static void __init init_intel(struct cpu
          * have the F0 0F bug, which lets nonpriviledged users lock up the system.
          * Note that the workaround only should be initialized once...
          */
- c->f00f_bug = 0;
         if ( c->x86 == 5 ) {
- static int f00f_workaround_enabled = 0;
+ static int f00f_workaround_message = 0;
 
                 c->f00f_bug = 1;
- if ( !f00f_workaround_enabled ) {
- trap_init_f00f_bug();
+ if ( !f00f_workaround_message ) {
                         printk(KERN_NOTICE "Intel Pentium with F0 0F bug - workaround enabled.\n");
- f00f_workaround_enabled = 1;
+ f00f_workaround_message = 1;
                 }
         }
 #endif
diff --exclude-from=/home/ldb/src/linux-exclude -urNdp linux-2.5.44/arch/i386/kernel/entry.S linux-2.5.44_multiring/arch/i386/kernel/entry.S
--- linux-2.5.44/arch/i386/kernel/entry.S 2002-10-27 02:38:39.000000000 +0100
+++ linux-2.5.44_multiring/arch/i386/kernel/entry.S 2002-10-27 00:44:02.000000000 +0200
@@ -737,6 +737,7 @@ ENTRY(sys_call_table)
         .long sys_free_hugepages
         .long sys_exit_group
         .long sys_lookup_dcookie
+ .long sys_multiring
 
         .rept NR_syscalls-(.-sys_call_table)/4
                 .long sys_ni_syscall
diff --exclude-from=/home/ldb/src/linux-exclude -urNdp linux-2.5.44/arch/i386/kernel/i8259.c linux-2.5.44_multiring/arch/i386/kernel/i8259.c
--- linux-2.5.44/arch/i386/kernel/i8259.c 2002-10-27 02:38:39.000000000 +0100
+++ linux-2.5.44_multiring/arch/i386/kernel/i8259.c 2002-10-27 00:44:02.000000000 +0200
@@ -282,7 +282,8 @@ void init_8259A(int auto_eoi)
          * outb_p - this has to work on a wide range of PC hardware.
          */
         outb_p(0x11, 0x20); /* ICW1: select 8259A-1 init */
- outb_p(0x20 + 0, 0x21); /* ICW2: 8259A-1 IR0-7 mapped to 0x20-0x27 */
+ /* ICW2: 8259A-1 IR0-7 mapped to FIRST_EXTERNAL_VECTOR-FIRST_EXTERNAL_VECTOR+7 */
+ outb_p(FIRST_EXTERNAL_VECTOR + 0, 0x21);
         outb_p(0x04, 0x21); /* 8259A-1 (the master) has a slave on IR2 */
         if (auto_eoi)
                 outb_p(0x03, 0x21); /* master does Auto EOI */
@@ -290,7 +291,8 @@ void init_8259A(int auto_eoi)
                 outb_p(0x01, 0x21); /* master expects normal EOI */
 
         outb_p(0x11, 0xA0); /* ICW1: select 8259A-2 init */
- outb_p(0x20 + 8, 0xA1); /* ICW2: 8259A-2 IR0-7 mapped to 0x28-0x2f */
+ /* ICW2: 8259A-2 IR0-7 mapped to FIRST_EXTERNAL_VECTOR+8-FIRST_EXTERNAL_VECTOR+f */
+ outb_p(FIRST_EXTERNAL_VECTOR + 8, 0xA1);
         outb_p(0x02, 0xA1); /* 8259A-2 is a slave on master's IR2 */
         outb_p(0x01, 0xA1); /* (slave's support for AEOI in flat mode
                                     is to be investigated) */
diff --exclude-from=/home/ldb/src/linux-exclude -urNdp linux-2.5.44/arch/i386/kernel/ldt.c linux-2.5.44_multiring/arch/i386/kernel/ldt.c
--- linux-2.5.44/arch/i386/kernel/ldt.c 2002-10-27 02:38:39.000000000 +0100
+++ linux-2.5.44_multiring/arch/i386/kernel/ldt.c 2002-10-27 02:18:48.000000000 +0200
@@ -18,6 +18,7 @@
 #include <asm/system.h>
 #include <asm/ldt.h>
 #include <asm/desc.h>
+#include <asm/idt.h>
 
 #ifdef CONFIG_SMP /* avoids "defined but not used" warnig */
 static void flush_ldt(void *null)
@@ -81,22 +82,62 @@ static inline int copy_ldt(mm_context_t
         return 0;
 }
 
+static inline int copy_idt(mm_context_t* new, mm_context_t* old, unsigned flags)
+{
+ struct desc_struct* oldidt;
+ if(flags & (CLONE_VM | CLONE_IDT))
+ {
+ oldidt = kmap_idt(old);
+ atomic_inc(idt_refcnt(oldidt));
+ kunmap_idt(old, oldidt);
+ new->idt = old->idt;
+ }
+ else
+ {
+ struct desc_struct* newidt;
+ union idt idtu;
+
+ newidt = alloc_idt(&idtu);
+ if(!newidt)
+ return -ENOMEM;
+ oldidt = kmap_read_idt(old);
+ memcpy(newidt, oldidt, IDT_SIZE);
+ kunmap_read_idt(old, oldidt);
+ kunmap_new_idt(&idtu, newidt);
+ wmb();
+ new->idt = idtu;
+ }
+ return 0;
+}
+
 /*
  * we do not have to muck with descriptors here, that is
  * done in switch_mm() as needed.
  */
-int init_new_context(struct task_struct *tsk, struct mm_struct *mm)
+int init_new_context_flags(struct task_struct *tsk, struct mm_struct *mm, unsigned flags)
 {
         struct mm_struct * old_mm;
         int retval = 0;
 
         init_MUTEX(&mm->context.sem);
         mm->context.size = 0;
+ mm->context.idt.opaque = 0;
+ mm->context.bad_segments = 0;
         old_mm = current->mm;
- if (old_mm && old_mm->context.size > 0) {
- down(&old_mm->context.sem);
- retval = copy_ldt(&mm->context, &old_mm->context);
- up(&old_mm->context.sem);
+ if (old_mm)
+ {
+ mm->context.bad_segments = old_mm->context.bad_segments;
+ if(old_mm->context.size > 0 || (old_mm->context.idt.opaque && !(flags & CLONE_CLEAR_IDT)))
+ {
+ down(&old_mm->context.sem);
+ mm->context.bad_segments = old_mm->context.bad_segments;
+ retval = 0;
+ if(old_mm->context.size > 0)
+ retval = copy_ldt(&mm->context, &old_mm->context);
+ if(!retval && old_mm->context.idt.opaque && !(flags & CLONE_CLEAR_IDT))
+ retval = copy_idt(&mm->context, &old_mm->context, flags);
+ up(&old_mm->context.sem);
+ }
         }
         return retval;
 }
@@ -115,6 +156,8 @@ void release_segments(struct mm_struct *
                         kfree(mm->context.ldt);
                 mm->context.size = 0;
         }
+ if(mm->context.idt.opaque)
+ free_idt(&mm->context);
 }
 
 static int read_ldt(void * ptr, unsigned long bytecount)
@@ -189,6 +232,10 @@ static int write_ldt(void * ptr, unsigne
                         goto out;
         }
 
+ error = LDT_handle_perm(&ldt_info, &mm->context);
+ if(error)
+ goto out;
+
         down(&mm->context.sem);
         if (ldt_info.entry_number >= mm->context.size) {
                 error = alloc_ldt(&current->mm->context, ldt_info.entry_number+1, 1);
@@ -211,10 +258,22 @@ static int write_ldt(void * ptr, unsigne
         entry_2 = LDT_entry_b(&ldt_info);
         if (oldmode)
                 entry_2 &= ~(1 << 20);
+ if(mm->context.idt.opaque)
+ {
+ error = -EINVAL;
+ if(ldt_info.dpl == 0)
+ goto out_unlock;
+ entry_2 = (entry_2 & ~(3 << 13)) | (ldt_info.dpl << 13);
+ }
 
         /* Install the new entry ... */
 install:
+ *(lp+1) = 0;
+ wmb();
+
         *lp = entry_1;
+ wmb();
+
         *(lp+1) = entry_2;
         error = 0;
 
@@ -244,3 +303,22 @@ asmlinkage int sys_modify_ldt(int func,
         }
         return ret;
 }
+
+int LDT_handle_over_page_offset(mm_context_t* ctx)
+{
+ if(ctx->idt.opaque)
+ return -EPERM;
+ else if(ctx->bad_segments)
+ return 0;
+ else
+ {
+ int ret = 0;
+ down(&ctx->sem);
+ if(ctx->idt.opaque)
+ ret = -EPERM;
+ else
+ ctx->bad_segments = 1;
+ up(&ctx->sem);
+ return ret;
+ }
+}
diff --exclude-from=/home/ldb/src/linux-exclude -urNdp linux-2.5.44/arch/i386/kernel/Makefile linux-2.5.44_multiring/arch/i386/kernel/Makefile
--- linux-2.5.44/arch/i386/kernel/Makefile 2002-10-27 02:38:39.000000000 +0100
+++ linux-2.5.44_multiring/arch/i386/kernel/Makefile 2002-10-27 00:44:03.000000000 +0200
@@ -9,7 +9,7 @@ export-objs := mca.o i386_ksyms.o ti
 obj-y := process.o semaphore.o signal.o entry.o traps.o irq.o vm86.o \
                 ptrace.o i8259.o ioport.o ldt.o setup.o time.o sys_i386.o \
                 pci-dma.o i386_ksyms.o i387.o bluesmoke.o dmi_scan.o \
- bootflag.o
+ bootflag.o multiring.o
 
 obj-y += cpu/
 obj-y += timers/
diff --exclude-from=/home/ldb/src/linux-exclude -urNdp linux-2.5.44/arch/i386/kernel/multiring.c linux-2.5.44_multiring/arch/i386/kernel/multiring.c
--- linux-2.5.44/arch/i386/kernel/multiring.c 1970-01-01 01:00:00.000000000 +0100
+++ linux-2.5.44_multiring/arch/i386/kernel/multiring.c 2002-10-27 03:18:02.000000000 +0100
@@ -0,0 +1,316 @@
+/*
+ * linux/kernel/multiring.c: support for multiple privilege rings
+ *
+ * Copyright (C) 2002 Luca Barbieri <ldb@ldb.ods.org>
+ */
+
+#include <linux/errno.h>
+#include <linux/sched.h>
+#include <linux/string.h>
+#include <linux/mm.h>
+#include <linux/smp.h>
+#include <linux/smp_lock.h>
+#include <linux/init.h>
+
+#include <asm/uaccess.h>
+#include <asm/idt.h>
+#include <asm/irq.h>
+#include <asm/multiring.h>
+
+#define MULTIRING_NR_VECTORS ((MULTIRING_AUTO_LAST_VECTOR - MULTIRING_AUTO_FIRST_VECTOR + 1) + (MULTIRING_SPECIAL_LAST_VECTOR - MULTIRING_SPECIAL_FIRST_VECTOR + 1) + 1)
+
+static unsigned char vec2idx[256];
+static unsigned char idx2vec[MULTIRING_NR_VECTORS];
+
+#ifdef CONFIG_SMP /* avoids "defined but not used" warning */
+static void flush_idt(void *null)
+{
+ if (current->active_mm)
+ load_IDT(&current->active_mm->context);
+}
+#endif
+
+static inline unsigned multiring_fix_selector(unsigned sel)
+{
+ unsigned tsel = sel | 3;
+ if(tsel == USER_DS_RPL(3))
+ return USER_DS_RPL(MULTIRING_USER_RING);
+ else if(tsel == USER_CS_RPL(3))
+ return USER_CS_RPL(MULTIRING_USER_RING);
+ else
+ return sel;
+}
+
+static inline void multiring_fix_selector_ptr(unsigned* sel)
+{
+ *sel = multiring_fix_selector(*sel);
+}
+
+static inline struct pt_regs* get_pt_regs(task_t* tsk)
+{
+ return (struct pt_regs*)tsk->thread.esp0 - 1;
+}
+
+void multiring_init_task(task_t* tsk)
+{
+ /* One-time initialization for multiring mode */
+ unsigned oldsel, newsel;
+ struct pt_regs* regs;
+
+ regs = get_pt_regs(tsk);
+
+#define multiring_fix_loaded_selector(selname) \
+ savesegment(selname, oldsel); \
+ newsel = multiring_fix_selector(oldsel); \
+ if(oldsel != newsel) \
+ loadsegment(selname, newsel);
+
+ multiring_fix_selector_ptr((unsigned*)&regs->xds);
+ multiring_fix_selector_ptr((unsigned*)&regs->xes);
+ multiring_fix_selector_ptr((unsigned*)&regs->xcs);
+ multiring_fix_selector_ptr((unsigned*)&regs->xss);
+ if(tsk != current)
+ {
+ multiring_fix_selector_ptr((unsigned*)&tsk->thread.fs);
+ multiring_fix_selector_ptr((unsigned*)&tsk->thread.gs);
+ }
+ else
+ {
+ multiring_fix_loaded_selector(fs);
+ multiring_fix_loaded_selector(gs);
+ }
+#undef multiring_fix_loaded_selector
+}
+
+void load_IDT_nolock(mm_context_t* ctx, unsigned cpu)
+{
+ load_IDT_nolock_inline(current, ctx, cpu);
+}
+
+asmlinkage int sys_multiring(unsigned op, unsigned arg1, unsigned arg2, unsigned arg3)
+{
+ mm_context_t* ctx = &current->mm->context;
+ int ret;
+ unsigned i;
+ int cpu;
+ struct desc_struct gate;
+ struct desc_struct* idt;
+
+ /* Without WP, DPL 1 and 2 will bypass copy-on-write */
+ if(!boot_cpu_data.wp_works_ok)
+ return -ENOSYS;
+
+ if(op > MULTIRING_LAST_OP)
+ return -EOPNOTSUPP;
+
+ switch(op)
+ {
+ case MULTIRING_CHECK:
+ return current->thread.multiring_mode ? 0 : (ctx->bad_segments ? -EPERM : 1);
+
+ case MULTIRING_SET_ESPSS:
+ {
+ struct tss_struct *tss;
+ unsigned ring = arg1;
+ unsigned esp = arg2;
+ unsigned ss = arg3;
+ if(ring == 0)
+ return -EPERM;
+ if(ring >= 3 || ring != (ss & 3))
+ return -EINVAL;
+
+ current->thread.espss12[ring - 1].esp = esp;
+ current->thread.espss12[ring - 1].ss = ss;
+
+ cpu = get_cpu();
+ tss = init_tss + cpu;
+ tss->espss12[ring - 1].esp = esp;
+ tss->espss12[ring - 1].ss = ss;
+ put_cpu();
+
+ return 0;
+ }
+
+ case MULTIRING_GET_ESPSS:
+ {
+ unsigned ring = arg1;
+ unsigned* esp = (unsigned*)arg2;
+ unsigned* ss = (unsigned*)arg3;
+ if(ring == 0)
+ return -EPERM;
+ if(ring >= 3)
+ return -EINVAL;
+
+ if(put_user(current->thread.espss12[ring - 1].esp, esp)
+ || put_user(current->thread.espss12[ring - 1].ss, ss))
+ return -EFAULT;
+ return 0;
+ }
+
+ case MULTIRING_GET_RANGE:
+ {
+ unsigned first, last;
+ char* ptr = (char*)arg3;
+ if(arg1 <= arg2)
+ {
+ first = arg1;
+ last = arg2;
+ }
+ else
+ {
+ first = arg2;
+ last = arg1;
+ }
+
+ ret = (last < 256) ? (int)((last - first + 1) * 8) : -EINVAL; /* idt has 256 entries */
+ idt = kmap_read_idt_or_table(ctx);
+ if(ret > 0 && copy_to_user(ptr, &idt[first], ret))
+ ret = -EFAULT;
+ kunmap_read_idt_or_table(ctx, idt);
+ return ret;
+ }
+
+ case MULTIRING_GET:
+ {
+ unsigned vec = arg1;
+ unsigned* ptr = (unsigned*)arg2;
+ ret = (vec < 256) ? 0 : -EINVAL; /* idt has 256 entries */
+ idt = kmap_read_idt_or_table(ctx);
+ if(!ret && copy_to_user(ptr, &idt[vec], 8))
+ ret = -EFAULT;
+ kunmap_read_idt_or_table(ctx, idt);
+ return ret;
+ }
+ }
+
+ if(!current->thread.multiring_mode)
+ {
+ if(op != MULTIRING_ELEVATE)
+ return -ENXIO;
+
+ if(ctx->bad_segments)
+ return -EPERM;
+
+ down(&ctx->sem);
+ ret = -EPERM;
+ if(ctx->bad_segments)
+ goto out_up;
+
+ if(!ctx->idt.opaque)
+ {
+ union idt idtu;
+ idt = alloc_idt(&idtu);
+ ret = -ENOMEM;
+ if(!idt)
+ goto out_up;
+ memcpy(idt, idt_table, IDT_SIZE);
+ idt[SYSCALL_VECTOR].b = (idt[SYSCALL_VECTOR].b &~ 0x6000) | (MULTIRING_USER_RING << 13);
+ kunmap_new_idt(&idtu, idt);
+
+ wmb();
+ ctx->idt = idtu;
+ wmb();
+
+#ifdef CONFIG_SMP
+ cpu = get_cpu();
+ if (current->mm->cpu_vm_mask != (1UL << cpu))
+ smp_call_function(flush_idt, 0, 1, 1);
+ put_cpu();
+#endif
+ }
+ up(&ctx->sem);
+
+ cpu = get_cpu();
+ if(!current->thread.multiring_mode)
+ load_IDT_nolock(ctx, cpu);
+ put_cpu();
+ return 0;
+
+ out_up:
+ up(&ctx->sem);
+ return ret;
+ }
+
+ /* MULTIRING_SET or MULTIRING_COPY */
+ {
+ unsigned vec = arg1;
+ idt = kmap_write_idt(ctx);
+ if(vec == MULTIRING_VEC_FREE)
+ {
+ for(i = 0; i < MULTIRING_NR_VECTORS; ++i)
+ {
+ vec = idx2vec[i];
+ if(!idt[vec].a && !idt[vec].b)
+ goto found_free;
+ }
+ ret = -ENOSPC;
+ goto out_put_idt;
+ found_free: ; /* a label must be followed by a statement */
+ }
+ else
+ {
+ ret = -EPERM;
+ if(vec > 255 || vec2idx[vec] == (unsigned char)~0) /* vec2idx has 256 entries */
+ goto out_put_idt;
+ }
+
+ if(op == MULTIRING_COPY)
+ {
+ unsigned from = arg2;
+ ret = -EPERM;
+ if(from > 255 || vec2idx[from] == (unsigned char)~0) /* vec2idx has 256 entries */
+ goto out_put_idt;
+ gate = idt[from];
+ }
+ else
+ {
+ gate.a = arg2;
+ gate.b = arg3;
+ if(gate.b & 0x8000)
+ {
+ ret = -EINVAL;
+ if((gate.b & 0x10ff) || (~gate.b & 0x700) || !(gate.b & 0x6000)) /* reserved_is_bad || !trap_gate || dpl0 */
+ goto out_put_idt;
+ ret = -EPERM;
+ if(!(gate.a & 0x30000) && (gate.a >> 16) && (gate.a != idt_table[SYSCALL_VECTOR].a || (gate.b | 0xe000) != idt_table[SYSCALL_VECTOR].b)) /* seg_RPL == 0 && not_int80_syscall */
+ goto out_put_idt;
+ }
+ }
+
+ idt[vec].b = 0;
+ wmb();
+ idt[vec].a = gate.a;
+ wmb();
+ idt[vec].b = gate.b;
+ ret = vec;
+ }
+ out_put_idt:
+ kunmap_write_idt(ctx, idt);
+ return ret;
+}
+
+int __init init_multiring(void)
+{
+ unsigned idx;
+ unsigned vec;
+
+ memset(vec2idx, 0xff, sizeof(vec2idx));
+ idx = 0;
+ for(vec = MULTIRING_AUTO_FIRST_VECTOR; vec <= MULTIRING_AUTO_LAST_VECTOR; ++vec, ++idx)
+ idx2vec[idx] = vec;
+ for(vec = MULTIRING_SPECIAL_FIRST_VECTOR; vec <= MULTIRING_SPECIAL_LAST_VECTOR; ++vec, ++idx)
+ idx2vec[idx] = vec;
+ idx2vec[idx] = SYSCALL_VECTOR;
+ vec2idx[SYSCALL_VECTOR] = idx;
+
+ for(idx = 0; idx < (MULTIRING_NR_VECTORS - 1); ++idx)
+ {
+ vec = idx2vec[idx];
+ vec2idx[vec] = idx;
+ idt_table[vec].a = 0;
+ idt_table[vec].b = 0;
+ }
+ return 0;
+}
+
+__initcall(init_multiring);
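
Note on the gate check in MULTIRING_SET above: the test on gate.b is dense,
so here is a minimal user-space sketch of the same bit tests, pulled out of
the kernel context. The function name and the GATE_* macro names are
hypothetical and do not appear in the patch; only the masks (0x8000, 0x10ff,
0x700, 0x6000) are taken from it. The separate -EPERM test on the target
selector's RPL is left out of this sketch.

```c
#include <errno.h>
#include <stdint.h>

/* Masks for the high 32-bit word of an i386 gate descriptor
 * (macro names are illustrative, the values come from the patch). */
#define GATE_PRESENT   0x8000u  /* P bit */
#define GATE_DPL_MASK  0x6000u  /* descriptor privilege level */
#define GATE_RESERVED  0x10ffu  /* S bit and the low byte, must all be zero */
#define GATE_TRAP_BITS 0x0700u  /* type bits that are all set for a trap gate */

/* Returns 0 if the word would pass the -EINVAL checks, -EINVAL otherwise. */
static int check_gate_high_word(uint32_t b)
{
    if (!(b & GATE_PRESENT))
        return 0;               /* not-present gates are stored unchecked */
    if (b & GATE_RESERVED)
        return -EINVAL;         /* reserved bits set */
    if (~b & GATE_TRAP_BITS)
        return -EINVAL;         /* not a trap gate (e.g. interrupt or task gate) */
    if (!(b & GATE_DPL_MASK))
        return -EINVAL;         /* DPL 0 is refused */
    return 0;
}
```

For instance, 0xef00 (present, DPL 3, 32-bit trap gate) passes, while 0xee00
(an interrupt gate) and 0x8f00 (DPL 0) are refused.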
diff --exclude-from=/home/ldb/src/linux-exclude -urNdp linux-2.5.44/arch/i386/kernel/process.c linux-2.5.44_multiring/arch/i386/kernel/process.c
--- linux-2.5.44/arch/i386/kernel/process.c 2002-10-27 02:38:39.000000000 +0100
+++ linux-2.5.44_multiring/arch/i386/kernel/process.c 2002-10-27 02:34:41.000000000 +0200
@@ -40,6 +40,7 @@
 #include <asm/system.h>
 #include <asm/io.h>
 #include <asm/ldt.h>
+#include <asm/idt.h>
 #include <asm/processor.h>
 #include <asm/i387.h>
 #include <asm/desc.h>
@@ -252,6 +253,8 @@ void flush_thread(void)
          */
         clear_fpu(tsk);
         tsk->used_math = 0;
+ tsk->thread.multiring_mode = 0;
+ load_IDT(&tsk->mm->context);
 }
 
 void release_thread(struct task_struct *dead_task)
@@ -268,18 +271,13 @@ void release_thread(struct task_struct *
         }
 }
 
-/*
- * Save a segment.
- */
-#define savesegment(seg,value) \
- asm volatile("movl %%" #seg ",%0":"=m" (*(int *)&(value)))
-
 int copy_thread(int nr, unsigned long clone_flags, unsigned long esp,
         unsigned long unused,
         struct task_struct * p, struct pt_regs * regs)
 {
         struct pt_regs * childregs;
         struct task_struct *tsk;
+ tsk = current;
 
         childregs = ((struct pt_regs *) (THREAD_SIZE + (unsigned long) p->thread_info)) - 1;
         struct_cpy(childregs, regs);
@@ -289,13 +287,14 @@ int copy_thread(int nr, unsigned long cl
 
         p->thread.esp = (unsigned long) childregs;
         p->thread.esp0 = (unsigned long) (childregs+1);
+ p->thread.espss12[0] = tsk->thread.espss12[0];
+ p->thread.espss12[1] = tsk->thread.espss12[1];
 
         p->thread.eip = (unsigned long) ret_from_fork;
 
         savesegment(fs,p->thread.fs);
         savesegment(gs,p->thread.gs);
 
- tsk = current;
         unlazy_fpu(tsk);
         struct_cpy(&p->thread.i387, &tsk->thread.i387);
 
@@ -307,6 +306,8 @@ int copy_thread(int nr, unsigned long cl
                         IO_BITMAP_BYTES);
         }
 
+ p->thread.multiring_mode = (tsk->thread.multiring_mode && p->mm->context.idt.opaque) ? 1 : 0;
+
         /*
          * Set a new TLS for the child thread?
          */
@@ -319,6 +320,8 @@ int copy_thread(int nr, unsigned long cl
                         return -EFAULT;
                 if (LDT_empty(&info))
                         return -EINVAL;
+ if (LDT_handle_perm(&info, &p->mm->context))
+ return -EPERM;
 
                 idx = info.entry_number;
                 if (idx < GDT_ENTRY_TLS_MIN || idx > GDT_ENTRY_TLS_MAX)
@@ -327,6 +330,12 @@ int copy_thread(int nr, unsigned long cl
                 desc = p->thread.tls_array + idx - GDT_ENTRY_TLS_MIN;
                 desc->a = LDT_entry_a(&info);
                 desc->b = LDT_entry_b(&info);
+ if(unlikely(current->mm->context.idt.opaque))
+ {
+ if(unlikely(info.dpl == 0))
+ return -EINVAL;
+ desc->b = (desc->b & ~(3 << 13)) | (info.dpl << 13);
+ }
         }
         return 0;
 }
@@ -420,6 +429,10 @@ void __switch_to(struct task_struct *pre
          */
         tss->esp0 = next->esp0;
 
+ /* multiring */
+ tss->espss12[0] = next->espss12[0];
+ tss->espss12[1] = next->espss12[1];
+
         /*
          * Load the per-thread Thread-Local Storage descriptor.
          */
@@ -599,8 +612,12 @@ asmlinkage int sys_set_thread_area(struc
 
         if (copy_from_user(&info, u_info, sizeof(info)))
                 return -EFAULT;
- idx = info.entry_number;
 
+ if (LDT_handle_perm(&info, &current->mm->context))
+ return -EPERM;
+
+ idx = info.entry_number;
+
         /*
          * index -1 means the kernel should try to find and
          * allocate an empty descriptor:
@@ -618,20 +635,25 @@ asmlinkage int sys_set_thread_area(struc
 
         desc = t->tls_array + idx - GDT_ENTRY_TLS_MIN;
 
- /*
- * We must not get preempted while modifying the TLS.
- */
- cpu = get_cpu();
-
         if (LDT_empty(&info)) {
                 desc->a = 0;
                 desc->b = 0;
         } else {
                 desc->a = LDT_entry_a(&info);
                 desc->b = LDT_entry_b(&info);
+ if(unlikely(current->mm->context.idt.opaque))
+ {
+ if(unlikely(info.dpl == 0))
+ return -EINVAL;
+ desc->b = (desc->b & ~(3 << 13)) | (info.dpl << 13);
+ }
         }
- load_TLS(t, cpu);
 
+ /*
+ * We must not get preempted while modifying the TLS.
+ */
+ cpu = get_cpu();
+ load_TLS(t, cpu);
         put_cpu();
 
         return 0;
@@ -656,6 +678,7 @@ asmlinkage int sys_set_thread_area(struc
 #define GET_LIMIT_PAGES(desc) (((desc)->b >> 23) & 1)
 #define GET_PRESENT(desc) (((desc)->b >> 15) & 1)
 #define GET_USEABLE(desc) (((desc)->b >> 20) & 1)
+#define GET_DPL(desc) (((desc)->b >> 13) & 3)
 
 asmlinkage int sys_get_thread_area(struct user_desc *u_info)
 {
@@ -679,6 +702,7 @@ asmlinkage int sys_get_thread_area(struc
         info.limit_in_pages = GET_LIMIT_PAGES(desc);
         info.seg_not_present = !GET_PRESENT(desc);
         info.useable = GET_USEABLE(desc);
+ info.dpl = GET_DPL(desc);
 
         if (copy_to_user(u_info, &info, sizeof(info)))
                 return -EFAULT;
diff --exclude-from=/home/ldb/src/linux-exclude -urNdp linux-2.5.44/arch/i386/kernel/ptrace.c linux-2.5.44_multiring/arch/i386/kernel/ptrace.c
--- linux-2.5.44/arch/i386/kernel/ptrace.c 2002-10-27 02:38:39.000000000 +0100
+++ linux-2.5.44_multiring/arch/i386/kernel/ptrace.c 2002-10-27 00:44:03.000000000 +0200
@@ -76,24 +76,24 @@ static int putreg(struct task_struct *ch
 {
         switch (regno >> 2) {
                 case FS:
- if (value && (value & 3) != 3)
+ if (value && !(value & 3))
                                 return -EIO;
                         child->thread.fs = value;
                         return 0;
                 case GS:
- if (value && (value & 3) != 3)
+ if (value && !(value & 3))
                                 return -EIO;
                         child->thread.gs = value;
                         return 0;
                 case DS:
                 case ES:
- if (value && (value & 3) != 3)
+ if (value && !(value & 3))
                                 return -EIO;
                         value &= 0xffff;
                         break;
                 case SS:
                 case CS:
- if ((value & 3) != 3)
+ if (!(value & 3))
                                 return -EIO;
                         value &= 0xffff;
                         break;
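
Note on the putreg() change above: the old test demanded RPL 3, the new one
only refuses RPL 0. A minimal sketch of the new rule follows, assuming
nothing beyond what the hunk shows. The helper name and the may_be_null
parameter are hypothetical; may_be_null models the "value &&" part that is
present for FS/GS/DS/ES but absent for SS/CS.

```c
#include <stdint.h>

/* The requested privilege level is the low two bits of a selector. */
#define SEL_RPL(sel) ((sel) & 3u)

/* Returns 1 if ptrace may store the selector, 0 if it would return -EIO. */
static int ptrace_selector_ok(uint32_t value, int may_be_null)
{
    if (value == 0)
        return may_be_null;     /* null selector: fine for data, not for CS/SS */
    return SEL_RPL(value) != 0; /* any ring except 0 is accepted */
}
```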
diff --exclude-from=/home/ldb/src/linux-exclude -urNdp linux-2.5.44/arch/i386/kernel/signal.c linux-2.5.44_multiring/arch/i386/kernel/signal.c
--- linux-2.5.44/arch/i386/kernel/signal.c 2002-10-27 02:38:39.000000000 +0100
+++ linux-2.5.44_multiring/arch/i386/kernel/signal.c 2002-10-27 00:44:03.000000000 +0200
@@ -162,7 +162,7 @@ restore_sigcontext(struct pt_regs *regs,
 #define COPY_SEG_STRICT(seg) \
         { unsigned short tmp; \
           err |= __get_user(tmp, &sc->seg); \
- regs->x##seg = tmp|3; }
+ regs->x##seg = (tmp & 3) ? tmp : (tmp | USER_RING); }
 
 #define GET_SEG(seg) \
         { unsigned short tmp; \
@@ -338,7 +338,7 @@ get_sigframe(struct k_sigaction *ka, str
         }
 
         /* This is the legacy signal stack switching. */
- else if ((regs->xss & 0xffff) != __USER_DS &&
+ else if ((regs->xss & 0xfffc) != USER_DS_RPL(0) &&
                  !(ka->sa.sa_flags & SA_RESTORER) &&
                  ka->sa.sa_restorer) {
                 esp = (unsigned long) ka->sa.sa_restorer;
@@ -350,6 +350,7 @@ get_sigframe(struct k_sigaction *ka, str
 static void setup_frame(int sig, struct k_sigaction *ka,
                         sigset_t *set, struct pt_regs * regs)
 {
+ unsigned ring;
         struct sigframe *frame;
         int err = 0;
 
@@ -398,10 +399,10 @@ static void setup_frame(int sig, struct
         regs->eip = (unsigned long) ka->sa.sa_handler;
 
         set_fs(USER_DS);
- regs->xds = __USER_DS;
- regs->xes = __USER_DS;
- regs->xss = __USER_DS;
- regs->xcs = __USER_CS;
+ ring = get_user_ring();
+ regs->xds = regs->xes = regs->xss = USER_DS_RPL(ring);
+ regs->xcs = USER_CS_RPL(ring);
+ put_user_ring();
         regs->eflags &= ~TF_MASK;
 
 #if DEBUG_SIG
@@ -422,6 +423,7 @@ static void setup_rt_frame(int sig, stru
 {
         struct rt_sigframe *frame;
         int err = 0;
+ unsigned ring;
 
         frame = get_sigframe(ka, regs, sizeof(*frame));
 
@@ -473,10 +475,10 @@ static void setup_rt_frame(int sig, stru
         regs->eip = (unsigned long) ka->sa.sa_handler;
 
         set_fs(USER_DS);
- regs->xds = __USER_DS;
- regs->xes = __USER_DS;
- regs->xss = __USER_DS;
- regs->xcs = __USER_CS;
+ ring = get_user_ring();
+ regs->xds = regs->xes = regs->xss = USER_DS_RPL(ring);
+ regs->xcs = USER_CS_RPL(ring);
+ put_user_ring();
         regs->eflags &= ~TF_MASK;
 
 #if DEBUG_SIG
@@ -556,7 +558,7 @@ int do_signal(struct pt_regs *regs, sigs
          * kernel mode. Just return without doing anything
          * if so.
          */
- if ((regs->xcs & 3) != 3)
+ if (!(regs->xcs & 3))
                 return 1;
 
         if (current->flags & PF_FREEZE) {
diff --exclude-from=/home/ldb/src/linux-exclude -urNdp linux-2.5.44/arch/i386/kernel/traps.c linux-2.5.44_multiring/arch/i386/kernel/traps.c
--- linux-2.5.44/arch/i386/kernel/traps.c 2002-10-27 02:38:39.000000000 +0100
+++ linux-2.5.44_multiring/arch/i386/kernel/traps.c 2002-10-27 00:44:03.000000000 +0200
@@ -49,6 +49,8 @@
 #include <linux/irq.h>
 #include <linux/module.h>
 
+#include <asm/idt.h>
+
 asmlinkage int system_call(void);
 asmlinkage void lcall7(void);
 asmlinkage void lcall27(void);
@@ -62,6 +64,10 @@ struct desc_struct default_ldt[] = { { 0
  * for this.
  */
 struct desc_struct idt_table[256] __attribute__((__section__(".data.idt"))) = { {0, 0}, };
+#if defined(CONFIG_HIGHMEM) || defined(CONFIG_X86_F00F_BUG)
+pte_t* idt_pte;
+pte_t idt_table_pte;
+#endif
 
 asmlinkage void divide_error(void);
 asmlinkage void debug(void);
@@ -249,6 +255,30 @@ bad:
         printk("\n");
 }
 
+void __init load_idt_table_init(unsigned cpu, pgprot_t prot)
+{
+ struct Xgt_desc_struct map_idt_descr;
+
+ if(use_highidt
+#ifdef CONFIG_X86_F00F_BUG
+ || (pgprot_val(prot) == __PAGE_KERNEL_RO)
+#endif
+ )
+ {
+ unsigned idt_vstart;
+ idt_vstart = __fix_to_virt(FIX_IDT_BEGIN);
+ idt_pte = pte_offset_kernel(pmd_offset(pgd_offset_k(idt_vstart), (idt_vstart)), (idt_vstart));
+
+ idt_table_pte = pfn_pte(__pa(idt_table) >> PAGE_SHIFT, prot);
+ set_pte(idt_pte - cpu, idt_table_pte);
+ map_idt_descr.size = IDT_SIZE - 1;
+ map_idt_descr.address = __fix_to_virt(FIX_IDT_BEGIN + cpu);
+ __asm__ __volatile__("lidt %0": "=m" (map_idt_descr));
+ }
+ else
+ __asm__ __volatile__("lidt %0": "=m" (idt_descr));
+}
+
 static void handle_BUG(struct pt_regs *regs)
 {
         unsigned short ud2;
@@ -814,20 +844,6 @@ asmlinkage void math_emulate(long arg)
 
 #endif /* CONFIG_MATH_EMULATION */
 
-#ifdef CONFIG_X86_F00F_BUG
-void __init trap_init_f00f_bug(void)
-{
- __set_fixmap(FIX_F00F_IDT, __pa(&idt_table), PAGE_KERNEL_RO);
-
- /*
- * Update the IDT descriptor and reload the IDT so that
- * it uses the read-only mapped virtual address.
- */
- idt_descr.address = fix_to_virt(FIX_F00F_IDT);
- __asm__ __volatile__("lidt %0": "=m" (idt_descr));
-}
-#endif
-
 #define _set_gate(gate_addr,type,dpl,addr) \
 do { \
   int __d0, __d1; \
diff --exclude-from=/home/ldb/src/linux-exclude -urNdp linux-2.5.44/arch/i386/mach-generic/irq_vectors.h linux-2.5.44_multiring/arch/i386/mach-generic/irq_vectors.h
--- linux-2.5.44/arch/i386/mach-generic/irq_vectors.h 2002-10-27 02:38:39.000000000 +0100
+++ linux-2.5.44_multiring/arch/i386/mach-generic/irq_vectors.h 2002-10-27 00:44:03.000000000 +0200
@@ -22,16 +22,19 @@
 #ifndef _ASM_IRQ_VECTORS_H
 #define _ASM_IRQ_VECTORS_H
 
+#define MULTIRING_SPECIAL_FIRST_VECTOR 0x20
+#define MULTIRING_SPECIAL_LAST_VECTOR 0x2f
+
 /*
  * IDT vectors usable for external interrupt sources start
- * at 0x20:
+ * at 0x30:
  */
-#define FIRST_EXTERNAL_VECTOR 0x20
+#define FIRST_EXTERNAL_VECTOR 0x30
 
 #define SYSCALL_VECTOR 0x80
 
 /*
- * Vectors 0x20-0x2f are used for ISA interrupts.
+ * Vectors 0x30-0x3f are used for ISA interrupts.
  */
 
 /*
@@ -49,6 +52,9 @@
 #define RESCHEDULE_VECTOR 0xfc
 #define CALL_FUNCTION_VECTOR 0xfb
 
+#define MULTIRING_AUTO_LAST_VECTOR 0xfa
+#define MULTIRING_AUTO_FIRST_VECTOR 0xf1
+
 #define THERMAL_APIC_VECTOR 0xf0
 /*
  * Local APIC timer IRQ vector is on a different priority level,
@@ -58,26 +64,26 @@
 #define LOCAL_TIMER_VECTOR 0xef
 
 /*
- * First APIC vector available to drivers: (vectors 0x30-0xee)
- * we start at 0x31 to spread out vectors evenly between priority
+ * First APIC vector available to drivers: (vectors 0x40-0xee)
+ * we start at 0x41 to spread out vectors evenly between priority
  * levels. (0x80 is the syscall vector)
  */
-#define FIRST_DEVICE_VECTOR 0x31
+#define FIRST_DEVICE_VECTOR 0x41
 #define FIRST_SYSTEM_VECTOR 0xef
 
 #define TIMER_IRQ 0
 
 /*
- * 16 8259A IRQ's, 208 potential APIC interrupt sources.
+ * 16 8259A IRQ's, 192 potential APIC interrupt sources.
  * Right now the APIC is mostly only used for SMP.
  * 256 vectors is an architectural limit. (we can have
  * more than 256 devices theoretically, but they will
  * have to use shared interrupts)
  * Since vectors 0x00-0x1f are used/reserved for the CPU,
- * the usable vector space is 0x20-0xff (224 vectors)
+ * the usable vector space is 0x30-0xff (208 vectors)
  */
 #ifdef CONFIG_X86_IO_APIC
-#define NR_IRQS 224
+#define NR_IRQS 208
 #else
 #define NR_IRQS 16
 #endif
diff --exclude-from=/home/ldb/src/linux-exclude -urNdp linux-2.5.44/arch/i386/mach-visws/irq_vectors.h linux-2.5.44_multiring/arch/i386/mach-visws/irq_vectors.h
--- linux-2.5.44/arch/i386/mach-visws/irq_vectors.h 2002-10-27 02:38:39.000000000 +0100
+++ linux-2.5.44_multiring/arch/i386/mach-visws/irq_vectors.h 2002-10-27 00:44:03.000000000 +0200
@@ -1,11 +1,14 @@
 #ifndef _ASM_IRQ_VECTORS_H
 #define _ASM_IRQ_VECTORS_H
 
+#define MULTIRING_SPECIAL_FIRST_VECTOR 0x20
+#define MULTIRING_SPECIAL_LAST_VECTOR 0x2f
+
 /*
  * IDT vectors usable for external interrupt sources start
- * at 0x20:
+ * at 0x30:
  */
-#define FIRST_EXTERNAL_VECTOR 0x20
+#define FIRST_EXTERNAL_VECTOR 0x30
 
 #define SYSCALL_VECTOR 0x80
 
@@ -28,6 +31,9 @@
 #define RESCHEDULE_VECTOR 0xfc
 #define CALL_FUNCTION_VECTOR 0xfb
 
+#define MULTIRING_AUTO_LAST_VECTOR 0xfa
+#define MULTIRING_AUTO_FIRST_VECTOR 0xf1
+
 #define THERMAL_APIC_VECTOR 0xf0
 /*
  * Local APIC timer IRQ vector is on a different priority level,
@@ -37,26 +43,26 @@
 #define LOCAL_TIMER_VECTOR 0xef
 
 /*
- * First APIC vector available to drivers: (vectors 0x30-0xee)
- * we start at 0x31 to spread out vectors evenly between priority
+ * First APIC vector available to drivers: (vectors 0x40-0xee)
+ * we start at 0x41 to spread out vectors evenly between priority
  * levels. (0x80 is the syscall vector)
  */
-#define FIRST_DEVICE_VECTOR 0x31
+#define FIRST_DEVICE_VECTOR 0x41
 #define FIRST_SYSTEM_VECTOR 0xef
 
 #define TIMER_IRQ 0
 
 /*
- * 16 8259A IRQ's, 208 potential APIC interrupt sources.
+ * 16 8259A IRQ's, 192 potential APIC interrupt sources.
  * Right now the APIC is mostly only used for SMP.
  * 256 vectors is an architectural limit. (we can have
  * more than 256 devices theoretically, but they will
  * have to use shared interrupts)
  * Since vectors 0x00-0x1f are used/reserved for the CPU,
- * the usable vector space is 0x20-0xff (224 vectors)
+ * the usable vector space is 0x30-0xff (208 vectors)
  */
 #ifdef CONFIG_X86_IO_APIC
-#define NR_IRQS 224
+#define NR_IRQS 208
 #else
 #define NR_IRQS 16
 #endif
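
Note on the vector ranges defined in the two irq_vectors.h hunks above: the
following user-space sketch rebuilds the idx2vec/vec2idx tables the way
init_multiring() does, to make the resulting layout explicit. The enum
names and build_vector_tables() are hypothetical stand-ins; the numeric
ranges (0xf1-0xfa, 0x20-0x2f, 0x80) are copied from the patch. The clearing
of idt_table entries is not modelled.

```c
#include <string.h>

enum {
    AUTO_FIRST    = 0xf1, AUTO_LAST    = 0xfa,  /* MULTIRING_AUTO_* */
    SPECIAL_FIRST = 0x20, SPECIAL_LAST = 0x2f,  /* MULTIRING_SPECIAL_* */
    SYSCALL_VEC   = 0x80,                       /* SYSCALL_VECTOR */
    NR_VECTORS    = (AUTO_LAST - AUTO_FIRST + 1)
                  + (SPECIAL_LAST - SPECIAL_FIRST + 1) + 1
};

static unsigned char vec2idx[256];
static unsigned char idx2vec[NR_VECTORS];

/* Fills both tables and returns the number of user-settable vectors. */
static unsigned build_vector_tables(void)
{
    unsigned idx = 0, vec;

    memset(vec2idx, 0xff, sizeof(vec2idx));     /* 0xff marks "not settable" */
    for (vec = AUTO_FIRST; vec <= AUTO_LAST; ++vec, ++idx)
        idx2vec[idx] = (unsigned char)vec;
    for (vec = SPECIAL_FIRST; vec <= SPECIAL_LAST; ++vec, ++idx)
        idx2vec[idx] = (unsigned char)vec;
    idx2vec[idx++] = SYSCALL_VEC;               /* the syscall vector comes last */

    for (vec = 0; vec < idx; ++vec)
        vec2idx[idx2vec[vec]] = (unsigned char)vec;
    return idx;
}
```

With these ranges the table holds 10 + 16 + 1 = 27 vectors, and a
MULTIRING_VEC_FREE search tries 0xf1-0xfa before 0x20-0x2f.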
diff --exclude-from=/home/ldb/src/linux-exclude -urNdp linux-2.5.44/arch/i386/math-emu/fpu_entry.c linux-2.5.44_multiring/arch/i386/math-emu/fpu_entry.c
--- linux-2.5.44/arch/i386/math-emu/fpu_entry.c 2002-10-27 02:38:39.000000000 +0100
+++ linux-2.5.44_multiring/arch/i386/math-emu/fpu_entry.c 2002-10-27 00:44:03.000000000 +0200
@@ -171,7 +171,7 @@ asmlinkage void math_emulate(long arg)
       FPU_EIP += code_base = FPU_CS << 4;
       code_limit = code_base + 0xffff; /* Assumes code_base <= 0xffff0000 */
     }
- else if ( FPU_CS == __USER_CS && FPU_DS == __USER_DS )
+ else if ( (FPU_CS | 3) == USER_CS_RPL(3) && (FPU_DS | 3) == USER_DS_RPL(3) )
     {
       addr_modes.default_mode = 0;
     }
diff --exclude-from=/home/ldb/src/linux-exclude -urNdp linux-2.5.44/arch/i386/mm/fault.c linux-2.5.44_multiring/arch/i386/mm/fault.c
--- linux-2.5.44/arch/i386/mm/fault.c 2002-10-27 02:38:39.000000000 +0100
+++ linux-2.5.44_multiring/arch/i386/mm/fault.c 2002-10-27 00:44:03.000000000 +0200
@@ -157,6 +157,10 @@ asmlinkage void do_page_fault(struct pt_
 
         tsk = current;
 
+ /* User code running in ring 1 or 2 faults with the U/S error-code bit clear, so set it by hand */
+ if(current->thread.multiring_mode && (regs->xcs & 3))
+ error_code |= 4;
+
         /*
          * We fault-in kernel-space virtual memory on-demand. The
          * 'reference' page table is init_mm.pgd.
@@ -270,6 +274,7 @@ bad_area:
         up_read(&mm->mmap_sem);
 
         /* User mode accesses just cause a SIGSEGV */
+ /* Note: F0 0F C7 C8 in DPL 1 or 2 will cause the if body to be executed (seems unavoidable) */
         if (error_code & 4) {
                 tsk->thread.cr2 = address;
                 tsk->thread.error_code = error_code;
diff --exclude-from=/home/ldb/src/linux-exclude -urNdp linux-2.5.44/fs/exec.c linux-2.5.44_multiring/fs/exec.c
--- linux-2.5.44/fs/exec.c 2002-10-27 02:38:39.000000000 +0100
+++ linux-2.5.44_multiring/fs/exec.c 2002-10-27 02:14:07.000000000 +0200
@@ -1021,7 +1021,11 @@ int do_execve(char * filename, char ** a
         if (!bprm.mm)
                 goto out_file;
 
+#ifdef init_new_context_flags
+ retval = init_new_context_flags(current, bprm.mm, CLONE_EXEC);
+#else
         retval = init_new_context(current, bprm.mm);
+#endif
         if (retval < 0)
                 goto out_mm;
 
diff --exclude-from=/home/ldb/src/linux-exclude -urNdp linux-2.5.44/include/asm-i386/desc.h linux-2.5.44_multiring/include/asm-i386/desc.h
--- linux-2.5.44/include/asm-i386/desc.h 2002-10-27 02:38:39.000000000 +0100
+++ linux-2.5.44_multiring/include/asm-i386/desc.h 2002-10-27 00:44:03.000000000 +0200
@@ -3,12 +3,16 @@
 
 #include <asm/ldt.h>
 #include <asm/segment.h>
+#include <asm/page.h>
+
+#define USER_CSDS_A (((__PAGE_OFFSET - 1) >> PAGE_SHIFT) & 0xffff)
+#define USER_CSDS_B(data, ring) (0x00c09200 | (((__PAGE_OFFSET - 1) >> PAGE_SHIFT) & 0xf0000) | ((!(data)) << 11) | ((ring) << 13))
 
 #ifndef __ASSEMBLY__
 
 #include <asm/mmu.h>
 
-extern struct desc_struct cpu_gdt_table[NR_CPUS][GDT_ENTRIES];
+extern struct desc_struct cpu_gdt_table[NR_CPUS][GDT_ENTRIES], idt_table[IDT_ENTRIES];
 
 struct Xgt_desc_struct {
         unsigned short size;
@@ -61,7 +65,7 @@ static inline void set_ldt_desc(unsigned
         ((info)->seg_32bit << 22) | \
         ((info)->limit_in_pages << 23) | \
         ((info)->useable << 20) | \
- 0x7000)
+ 0x1000 | (USER_RING << 13))
 
 #define LDT_empty(info) (\
         (info)->base_addr == 0 && \
@@ -73,6 +77,35 @@ static inline void set_ldt_desc(unsigned
         (info)->seg_not_present == 1 && \
         (info)->useable == 0 )
 
+extern int LDT_handle_over_page_offset(mm_context_t* ctx);
+
+static inline int LDT_handle_perm(struct user_desc* info, mm_context_t* ctx)
+{
+ unsigned limit;
+ unsigned maxlim;
+ if(info->base_addr >= __PAGE_OFFSET)
+ return LDT_handle_over_page_offset(ctx);
+
+ limit = info->limit & 0xfffff;
+ if(info->limit_in_pages)
+ limit = (limit << PAGE_SHIFT) + (PAGE_SIZE - 1);
+
+ maxlim = (__PAGE_OFFSET - 1) - info->base_addr;
+ if(limit > maxlim)
+ {
+ if(maxlim <= 0xfffff)
+ {
+ info->limit = maxlim;
+ info->limit_in_pages = 0;
+ }
+ else if(!(info->base_addr & ~PAGE_MASK))
+ info->limit = maxlim >> PAGE_SHIFT;
+ else
+ return LDT_handle_over_page_offset(ctx);
+ }
+ return 0;
+}
+
 #if TLS_SIZE != 24
 # error update this code.
 #endif
@@ -117,6 +150,17 @@ static inline void load_LDT(mm_context_t
         put_cpu();
 }
 
+static inline unsigned get_user_ring(void)
+{
+ preempt_disable();
+ return unlikely(current->thread.multiring_mode) ? MULTIRING_USER_RING : USER_RING;
+}
+
+static inline void put_user_ring(void)
+{
+ preempt_enable();
+}
+
 #endif /* !__ASSEMBLY__ */
 
 #endif
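
Note on LDT_handle_perm() above: a stand-alone sketch of the limit clamping,
for readers who want to check the arithmetic. It assumes the common 3GB/1GB
split (__PAGE_OFFSET = 0xC0000000) and 4KB pages; struct seg, clamp_segment()
and the SKETCH_* names are hypothetical. A return value of 1 stands for the
case where the patch calls LDT_handle_over_page_offset().

```c
#include <stdint.h>

#define SKETCH_PAGE_OFFSET 0xC0000000u          /* assumed __PAGE_OFFSET */
#define SKETCH_PAGE_SHIFT  12
#define SKETCH_PAGE_SIZE   (1u << SKETCH_PAGE_SHIFT)

struct seg {
    uint32_t base;
    uint32_t limit;             /* 20-bit limit field */
    unsigned limit_in_pages;    /* granularity flag */
};

/* Returns 0 if the segment fits below the kernel (possibly after shrinking
 * its limit), 1 if it cannot be made to fit. */
static int clamp_segment(struct seg *s)
{
    uint32_t limit, maxlim;

    if (s->base >= SKETCH_PAGE_OFFSET)
        return 1;

    /* Effective limit in bytes. */
    limit = s->limit & 0xfffffu;
    if (s->limit_in_pages)
        limit = (limit << SKETCH_PAGE_SHIFT) + (SKETCH_PAGE_SIZE - 1);

    /* Largest limit that still ends just below the kernel area. */
    maxlim = (SKETCH_PAGE_OFFSET - 1) - s->base;
    if (limit > maxlim) {
        if (maxlim <= 0xfffffu) {
            s->limit = maxlim;                  /* fits a byte-granular limit */
            s->limit_in_pages = 0;
        } else if (!(s->base & (SKETCH_PAGE_SIZE - 1))) {
            s->limit = maxlim >> SKETCH_PAGE_SHIFT; /* page-aligned base */
        } else {
            return 1;                           /* no exact encoding exists */
        }
    }
    return 0;
}
```

A flat 4GB segment (base 0, limit 0xfffff pages) comes out with limit
0xbffff, i.e. it ends at 0xbfffffff.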
diff --exclude-from=/home/ldb/src/linux-exclude -urNdp linux-2.5.44/include/asm-i386/fixmap.h linux-2.5.44_multiring/include/asm-i386/fixmap.h
--- linux-2.5.44/include/asm-i386/fixmap.h 2002-10-27 02:38:39.000000000 +0100
+++ linux-2.5.44_multiring/include/asm-i386/fixmap.h 2002-10-27 00:44:03.000000000 +0200
@@ -48,6 +48,10 @@
  * future, say framebuffers for the console driver(s) could be
  * fix-mapped?
  */
+
+extern int __fix_idt_begin_should_have_been_optimized_away(void);
+extern int __fix_idt_end_should_have_been_optimized_away(void);
+
 enum fixed_addresses {
 #ifdef CONFIG_X86_LOCAL_APIC
         FIX_APIC_BASE, /* local (CPU) APIC) -- required for SMP or not */
@@ -62,9 +66,15 @@ enum fixed_addresses {
         FIX_LI_PCIA, /* Lithium PCI Bridge A */
         FIX_LI_PCIB, /* Lithium PCI Bridge B */
 #endif
-#ifdef CONFIG_X86_F00F_BUG
- FIX_F00F_IDT, /* Virtual mapping for IDT */
+
+#if defined(CONFIG_X86_F00F_BUG) || defined(CONFIG_X86_HIGHIDT)
+ FIX_IDT_BEGIN, /* Virtual mapping for IDT */
+ FIX_IDT_END = FIX_IDT_BEGIN + NR_CPUS - 1,
+#else
+#define FIX_IDT_BEGIN __fix_idt_begin_should_have_been_optimized_away()
+#define FIX_IDT_END __fix_idt_end_should_have_been_optimized_away()
 #endif
+
 #ifdef CONFIG_X86_CYCLONE
         FIX_CYCLONE_TIMER, /*cyclone timer register*/
 #endif
diff --exclude-from=/home/ldb/src/linux-exclude -urNdp linux-2.5.44/include/asm-i386/idt.h linux-2.5.44_multiring/include/asm-i386/idt.h
--- linux-2.5.44/include/asm-i386/idt.h 1970-01-01 01:00:00.000000000 +0100
+++ linux-2.5.44_multiring/include/asm-i386/idt.h 2002-10-27 03:18:13.000000000 +0100
@@ -0,0 +1,236 @@
+/*
+ * linux/include/asm-i386/idt.h: multiring IDT inline functions
+ *
+ * Copyright (C) 2002 Luca Barbieri <ldb@ldb.ods.org>
+ */
+
+#ifndef __i386_IDT_H
+#define __i386_IDT_H
+
+#include <linux/config.h>
+#include <linux/slab.h>
+#include <linux/gfp.h>
+#include <linux/smp.h>
+#include <linux/highmem.h>
+#include <asm/atomic.h>
+#include <linux/spinlock.h>
+
+#include <asm/desc.h>
+#include <asm/segment.h>
+#include <asm/mmu.h>
+#include <asm/pgtable.h>
+#include <asm/pgalloc.h>
+#include <asm/tlb.h>
+
+#ifdef CONFIG_X86_HIGHIDT
+#define use_highidt 1
+#else
+#define use_highidt 0
+#endif
+
+#ifdef CONFIG_X86_F00F_BUG
+#define cpu_has_f00f_bug boot_cpu_data.f00f_bug
+#else
+#define cpu_has_f00f_bug 0
+#endif
+
+extern pte_t* idt_pte;
+extern unsigned idt_prot;
+extern pte_t idt_table_pte;
+
+/* access to idt_refcnt */
+static inline struct desc_struct* kmap_idt(mm_context_t* ctx)
+{
+ if(use_highidt)
+ {
+ if(!cpu_has_f00f_bug)
+ {
+ unsigned cpu = get_cpu();
+ return (struct desc_struct*)__fix_to_virt(FIX_IDT_BEGIN + cpu);
+ }
+ else
+ {
+ /* we need to kmap because the window is read-only */
+ return kmap(ctx->idt.page);
+ }
+ }
+ return ctx->idt.addr;
+}
+
+static inline rwlock_t* idt_lock(struct desc_struct* idt)
+{
+ return (rwlock_t*)((char*)idt + IDT_SIZE);
+}
+
+/* read access */
+static inline struct desc_struct* kmap_read_idt(mm_context_t* ctx)
+{
+ struct desc_struct* idt = kmap_idt(ctx);
+ read_lock(idt_lock(idt));
+ return idt;
+}
+
+/* write access */
+static inline struct desc_struct* kmap_write_idt(mm_context_t* ctx)
+{
+ struct desc_struct* idt = kmap_idt(ctx);
+ write_lock(idt_lock(idt));
+ return idt;
+}
+
+static inline struct desc_struct* kmap_read_idt_or_table(mm_context_t* ctx)
+{
+ if(current->thread.multiring_mode)
+ return kmap_read_idt(ctx);
+ else
+ return idt_table;
+}
+
+static inline void kunmap_idt(mm_context_t* ctx, struct desc_struct* idt)
+{
+ if(use_highidt)
+ {
+ if(!cpu_has_f00f_bug)
+ put_cpu();
+ else
+ kunmap(ctx->idt.page);
+ }
+}
+
+static inline void kunmap_read_idt(mm_context_t* ctx, struct desc_struct* idt)
+{
+ read_unlock(idt_lock(idt));
+ kunmap_idt(ctx, idt);
+}
+
+static inline void kunmap_write_idt(mm_context_t* ctx, struct desc_struct* idt)
+{
+ write_unlock(idt_lock(idt));
+ kunmap_idt(ctx, idt);
+}
+
+static inline void kunmap_read_idt_or_table(mm_context_t* ctx, struct desc_struct* idt)
+{
+ if(idt != idt_table)
+ kunmap_read_idt(ctx, idt);
+}
+
+static inline void load_idt_table(unsigned cpu)
+{
+ if(use_highidt || cpu_has_f00f_bug)
+ {
+ set_pte(idt_pte - cpu, idt_table_pte);
+ __flush_tlb_one(__fix_to_virt(FIX_IDT_BEGIN + cpu));
+ }
+ else
+ __asm__ __volatile__("lidt %0": "=m" (idt_descr));
+}
+
+static inline void change_gdt_ring(unsigned cpu, unsigned ring)
+{
+ cpu_gdt_table[cpu][GDT_ENTRY_DEFAULT_USER_CS].b = USER_CSDS_B(0, ring);
+ cpu_gdt_table[cpu][GDT_ENTRY_DEFAULT_USER_DS].b = USER_CSDS_B(1, ring);
+}
+
+extern void multiring_init_task(task_t* tsk);
+
+static inline void load_IDT_nolock_inline(task_t* tsk, mm_context_t* ctx, unsigned cpu)
+{
+ if(!ctx->idt.opaque)
+ {
+ load_idt_table(cpu);
+ change_gdt_ring(cpu, 3);
+ return;
+ }
+
+ change_gdt_ring(cpu, MULTIRING_USER_RING);
+ if(use_highidt || cpu_has_f00f_bug)
+ {
+ set_pte(idt_pte - cpu, cpu_has_f00f_bug ? pfn_pte(__pa(ctx->idt.addr) >> PAGE_SHIFT, PAGE_KERNEL_RO) : mk_pte(ctx->idt.page, PAGE_KERNEL));
+ __flush_tlb_one(__fix_to_virt(FIX_IDT_BEGIN + cpu));
+ }
+ else
+ {
+ struct Xgt_desc_struct map_idt_descr;
+ map_idt_descr.size = IDT_SIZE - 1;
+ map_idt_descr.address = (unsigned long)ctx->idt.addr;
+ __asm__ __volatile__("lidt %0": "=m" (map_idt_descr));
+ }
+ if(unlikely(!tsk->thread.multiring_mode))
+ {
+ multiring_init_task(tsk);
+ tsk->thread.multiring_mode = 1;
+ }
+}
+
+static inline void load_IDT_inline(task_t* tsk, mm_context_t* ctx)
+{
+ unsigned cpu = get_cpu();
+ load_IDT_nolock_inline(tsk, ctx, cpu);
+ put_cpu();
+}
+
+extern void load_IDT_nolock(mm_context_t* ctx, unsigned cpu);
+
+static inline void load_IDT(mm_context_t* ctx)
+{
+ unsigned cpu = get_cpu();
+ load_IDT_nolock(ctx, cpu);
+ put_cpu();
+}
+
+static inline atomic_t* idt_refcnt(struct desc_struct* idt)
+{
+ return (atomic_t*)((char*)idt + IDT_SIZE + sizeof(rwlock_t));
+}
+
+static inline struct desc_struct* __alloc_idt(union idt* idtu)
+{
+ if(use_highidt)
+ {
+ /* Use high memory */
+ idtu->page = alloc_page(GFP_KERNEL | __GFP_HIGHMEM);
+ return idtu->page ? (struct desc_struct*)kmap(idtu->page) : NULL;
+ }
+ else
+ {
+ /* TODO: use an aligned 2KB allocator instead */
+ return idtu->addr = (struct desc_struct*)__get_free_page(GFP_KERNEL);
+ }
+}
+
+static inline struct desc_struct* alloc_idt(union idt* idtu)
+{
+ struct desc_struct* idt = __alloc_idt(idtu);
+ if(idt)
+ {
+ *idt_lock(idt) = RW_LOCK_UNLOCKED;
+ atomic_set(idt_refcnt(idt), 1);
+ }
+ return idt;
+}
+
+static inline void kunmap_new_idt(union idt* idtu, struct desc_struct* idt)
+{
+ if(use_highidt)
+ kunmap(idtu->page);
+}
+
+static inline void __free_idt(mm_context_t* ctx)
+{
+ if(use_highidt)
+ __free_page(ctx->idt.page);
+ else
+ free_page((unsigned long)ctx->idt.addr);
+}
+
+static inline void free_idt(mm_context_t* ctx)
+{
+ struct desc_struct* idt = kmap_idt(ctx);
+ int free = atomic_dec_and_test(idt_refcnt(idt));
+ kunmap_idt(ctx, idt);
+ if(free)
+ __free_idt(ctx);
+}
+
+#endif
diff --exclude-from=/home/ldb/src/linux-exclude -urNdp linux-2.5.44/include/asm-i386/ldt.h linux-2.5.44_multiring/include/asm-i386/ldt.h
--- linux-2.5.44/include/asm-i386/ldt.h 2002-10-27 02:38:39.000000000 +0100
+++ linux-2.5.44_multiring/include/asm-i386/ldt.h 2002-10-27 00:44:03.000000000 +0200
@@ -22,6 +22,9 @@ struct user_desc {
         unsigned int limit_in_pages:1;
         unsigned int seg_not_present:1;
         unsigned int useable:1;
+
+ /* has effect only in multiring mode, but is returned in any mode */
+ unsigned int dpl:2;
 };
 
 #define MODIFY_LDT_CONTENTS_DATA 0
diff --exclude-from=/home/ldb/src/linux-exclude -urNdp linux-2.5.44/include/asm-i386/mmu_context.h linux-2.5.44_multiring/include/asm-i386/mmu_context.h
--- linux-2.5.44/include/asm-i386/mmu_context.h 2002-10-27 02:38:39.000000000 +0100
+++ linux-2.5.44_multiring/include/asm-i386/mmu_context.h 2002-10-27 03:13:34.000000000 +0100
@@ -6,12 +6,15 @@
 #include <asm/atomic.h>
 #include <asm/pgalloc.h>
 #include <asm/tlbflush.h>
+#include <asm/idt.h>
 
 /*
  * possibly do the LDT unload here?
  */
 #define destroy_context(mm) do { } while(0)
-int init_new_context(struct task_struct *tsk, struct mm_struct *mm);
+int init_new_context_flags(struct task_struct *tsk, struct mm_struct *mm, unsigned flags);
+#define init_new_context_flags init_new_context_flags
+#define init_new_context(tsk, mm) init_new_context_flags(tsk, mm, 0)
 
 #ifdef CONFIG_SMP
 
@@ -45,6 +48,9 @@ static inline void switch_mm(struct mm_s
                  */
                 if (unlikely(prev->context.ldt != next->context.ldt))
                         load_LDT_nolock(&next->context, cpu);
+
+ if (unlikely(prev->context.idt.opaque != next->context.idt.opaque))
+ load_IDT_nolock_inline(tsk, &next->context, cpu);
         }
 #ifdef CONFIG_SMP
         else {
diff --exclude-from=/home/ldb/src/linux-exclude -urNdp linux-2.5.44/include/asm-i386/mmu.h linux-2.5.44_multiring/include/asm-i386/mmu.h
--- linux-2.5.44/include/asm-i386/mmu.h 2002-10-27 02:38:39.000000000 +0100
+++ linux-2.5.44_multiring/include/asm-i386/mmu.h 2002-10-27 00:44:03.000000000 +0200
@@ -7,10 +7,20 @@
  *
  * cpu_vm_mask is used to optimize ldt flushing.
  */
+
+union idt
+{
+ struct desc_struct* addr;
+ struct page* page;
+ unsigned long opaque;
+};
+
 typedef struct {
         int size;
         struct semaphore sem;
         void *ldt;
+ union idt idt;
+ unsigned char bad_segments;
 } mm_context_t;
 
 #endif
diff --exclude-from=/home/ldb/src/linux-exclude -urNdp linux-2.5.44/include/asm-i386/multiring.h linux-2.5.44_multiring/include/asm-i386/multiring.h
--- linux-2.5.44/include/asm-i386/multiring.h 1970-01-01 01:00:00.000000000 +0100
+++ linux-2.5.44_multiring/include/asm-i386/multiring.h 2002-10-27 00:44:03.000000000 +0200
@@ -0,0 +1,194 @@
+/*
+ * linux/include/asm-i386/multiring.h: header for multiple privilege rings support (available to user-mode)
+ *
+ * Copyright (C) 2002 Luca Barbieri <ldb@ldb.ods.org>
+ */
+
+#ifndef __i386_MULTIRING_H
+#define __i386_MULTIRING_H
+
+#define MULTIRING_CHECK 0
+
+#define MULTIRING_GET_ESPSS 1
+#define MULTIRING_SET_ESPSS 2
+
+#define MULTIRING_GET 3
+#define MULTIRING_GET_RANGE 4
+
+#define MULTIRING_ELEVATE 5
+#define MULTIRING_SET 6
+#define MULTIRING_COPY 7
+#define MULTIRING_LAST_OP 7
+
+#define MULTIRING_VEC_FREE ((unsigned char)~0)
+
+#ifdef __KERNEL__
+#define multiring_gate desc_struct
+#else
+struct multiring_gate
+{
+ unsigned long a;
+ unsigned long b;
+};
+#endif
+
+#ifndef __KERNEL__
+#ifndef MULTIRING_NO_SYSCALLS
+#ifndef _syscall1
+#include <asm/unistd.h>
+#endif
+
+#define __NR_multiring0 __NR_multiring
+#define __NR_multiring1 __NR_multiring
+#define __NR_multiring2 __NR_multiring
+#define __NR_multiring3 __NR_multiring
+
+_syscall1(int, multiring0, unsigned, op);
+_syscall2(int, multiring1, unsigned, op, unsigned long, arg1);
+_syscall3(int, multiring2, unsigned, op, unsigned long, arg1, unsigned long, arg2);
+_syscall4(int, multiring3, unsigned, op, unsigned long, arg1, unsigned long, arg2, unsigned long, arg3);
+#endif
+
+#ifndef MULTIRING_NO_HELPERS
+/*
+ Check whether the program is in multiring mode.
+
+ Return value:
+ 0: multiring mode
+ 1: normal mode
+ -1/EPERM: normal mode; can't enter multiring mode due to a bad segment
+*/
+static inline int multiring_check(void)
+{
+ return multiring0(MULTIRING_CHECK);
+}
+
+/* Enters multiring mode.
+ This fails if you have previously set up a "bad segment" with modify_ldt, set_thread_area or CLONE_SETTLS.
+ A bad segment is one that either starts inside the kernel memory region, or has a page-granular limit that makes it overlap the kernel region while its base address is not page-aligned (so the limit cannot be trimmed to end just below the kernel region).
+
+ This will cause the process to get a private IDT that starts as a copy of the global one, but with the SYSCALL_VECTOR DPL set to 1.
+ Multiring processes also have the DPL on the GDT CS and DS descriptors set to 1.
+ Upon return, in all threads, cs/ds/es/ss/fs/gs selectors pointing to the default CS/DS will be changed to have RPL=1.
+
+ Return value:
+ 0: success
+ -1/EPERM: can't enter multiring mode due to a bad segment
+ -1/ENOMEM: couldn't allocate the IDT
+*/
+static inline int multiring_elevate(void)
+{
+ return multiring0(MULTIRING_ELEVATE);
+}
+
+/*
+ Return TSS ESP/SS value for a privilege ring
+
+ Parameters:
+ ring: privilege ring whose tss esp/ss you want to get
+ esp: pointer to returned tss esp value
+ ss: pointer to returned tss ss value
+*/
+static inline int multiring_get_espss(unsigned ring, unsigned* esp, unsigned* ss)
+{
+ return multiring3(MULTIRING_GET_ESPSS, ring, (unsigned long)esp, (unsigned long)ss);
+}
+
+/*
+ Set TSS ESP/SS value for a privilege ring
+ Note: this currently works outside multiring mode too, but you shouldn't rely on that.
+
+ Parameters:
+ ring: privilege ring whose tss esp/ss you want to set
+ esp: new tss esp value
+ ss: new tss ss value - must have RPL == ring
+*/
+static inline int multiring_set_espss(unsigned ring, unsigned esp, unsigned ss)
+{
+ return multiring3(MULTIRING_SET_ESPSS, ring, esp, ss);
+}
+
+/*
+ Get a single interrupt gate.
+
+ Parameters:
+ vec: vector number
+ gate: pointer to returned multiring_gate
+*/
+static inline int multiring_get(unsigned vec, struct multiring_gate* gate)
+{
+ return multiring2(MULTIRING_GET, vec, (unsigned long)gate);
+}
+
+/*
+ Get a range of interrupt gates.
+
+ Parameters:
+ first: first vector number
+ last: last vector number
+ gates: pointer to returned multiring_gate structs (size must be >= (last - first + 1) * sizeof(struct multiring_gate))
+*/
+static inline int multiring_get_range(unsigned first, unsigned last, struct multiring_gate* gates)
+{
+ return multiring3(MULTIRING_GET_RANGE, first, last, (unsigned long)gates);
+}
+
+/*
+ Set an interrupt gate.
+
+ Parameters:
+ vec: vector number or MULTIRING_VEC_FREE to get a free one (or ENOSPC if none found)
+ a: first 32-bit word of interrupt gate
+ b: second 32-bit word of interrupt gate
+
+ Return value:
+ >= 0: success, return value is vector number
+ -1/EPERM: you tried to set an unmodifiable vector or to create a gate to a non-syscall RPL 0 address, or another permission check failed
+ -1/ENXIO: you weren't in multiring mode
+ -1/EINVAL: you tried to create a task/interrupt gate, a DPL 0 gate, a gate whose reserved bits hold a wrong value, or something else considered invalid/not supported
+ -1/ENOSPC: you specified MULTIRING_VEC_FREE but there was no available free vector
+*/
+static inline int multiring_set(unsigned vec, unsigned long a, unsigned long b)
+{
+ return multiring3(MULTIRING_SET, vec, a, b);
+}
+
+/* like multiring_set with vec=MULTIRING_VEC_FREE */
+static inline int multiring_set_free(unsigned long a, unsigned long b)
+{
+ return multiring_set(MULTIRING_VEC_FREE, a, b);
+}
+
+/* like multiring_set with a=0 b=0 */
+static inline int multiring_free(unsigned vec)
+{
+ return multiring_set(vec, 0, 0);
+}
+
+/*
+ Set an interrupt gate based on another one
+
+ Parameters:
+ vec: vector number or MULTIRING_VEC_FREE to get a free one (or ENOSPC if none found)
+ from: vector number to copy from
+
+ Return value:
+ >= 0: success, return value is vector number
+ -1/ENXIO: you weren't in multiring mode
+ -1/EPERM: you tried to set an unmodifiable vector or to create a gate to a non-syscall RPL 0 address, or another permission check failed
+ -1/ENOSPC: you specified MULTIRING_VEC_FREE but there was no available free vector
+*/
+static inline int multiring_copy(unsigned vec, unsigned from)
+{
+ return multiring2(MULTIRING_COPY, vec, from);
+}
+
+/* like multiring_copy with vec=MULTIRING_VEC_FREE */
+static inline int multiring_copy_free(unsigned from)
+{
+ return multiring_copy(MULTIRING_VEC_FREE, from);
+}
+
+#endif
+#endif
+#endif
diff --exclude-from=/home/ldb/src/linux-exclude -urNdp linux-2.5.44/include/asm-i386/processor.h linux-2.5.44_multiring/include/asm-i386/processor.h
--- linux-2.5.44/include/asm-i386/processor.h 2002-10-27 02:38:39.000000000 +0100
+++ linux-2.5.44_multiring/include/asm-i386/processor.h 2002-10-27 00:44:04.000000000 +0200
@@ -332,14 +332,17 @@ typedef struct {
         unsigned long seg;
 } mm_segment_t;
 
+struct espss
+{
+ unsigned long esp;
+ unsigned long ss;
+};
+
 struct tss_struct {
         unsigned short back_link,__blh;
         unsigned long esp0;
         unsigned short ss0,__ss0h;
- unsigned long esp1;
- unsigned short ss1,__ss1h;
- unsigned long esp2;
- unsigned short ss2,__ss2h;
+ struct espss espss12[2];
         unsigned long __cr3;
         unsigned long eip;
         unsigned long eflags;
@@ -367,10 +370,12 @@ struct thread_struct {
 /* cached TLS descriptors. */
         struct desc_struct tls_array[GDT_ENTRY_TLS_ENTRIES];
         unsigned long esp0;
+ struct espss espss12[2];
         unsigned long eip;
         unsigned long esp;
         unsigned long fs;
         unsigned long gs;
+ unsigned char multiring_mode;
 /* Hardware debugging registers */
         unsigned long debugreg[8]; /* %%db0-7 debug registers */
 /* fault info */
@@ -388,7 +393,8 @@ struct thread_struct {
 #define INIT_THREAD { \
         { { 0, 0 } , }, \
         0, \
- 0, 0, 0, 0, \
+ { {0, 0}, {0, 0} }, \
+ 0, 0, 0, 0, 0, \
         { [0 ... 7] = 0 }, /* debugging registers */ \
         0, 0, 0, \
         { { 0, }, }, /* 387 state */ \
@@ -400,7 +406,7 @@ struct thread_struct {
         0,0, /* back_link, __blh */ \
         sizeof(init_stack) + (long) &init_stack, /* esp0 */ \
         __KERNEL_DS, 0, /* ss0 */ \
- 0,0,0,0,0,0, /* stack1, stack2 */ \
+ { {0, 0}, {0, 0} }, /* stack1, stack2 */ \
         0, /* cr3 */ \
         0,0, /* eip,eflags */ \
         0,0,0,0, /* eax,ecx,edx,ebx */ \
@@ -415,10 +421,10 @@ struct thread_struct {
 #define start_thread(regs, new_eip, new_esp) do { \
         __asm__("movl %0,%%fs ; movl %0,%%gs": :"r" (0)); \
         set_fs(USER_DS); \
- regs->xds = __USER_DS; \
- regs->xes = __USER_DS; \
- regs->xss = __USER_DS; \
- regs->xcs = __USER_CS; \
+ regs->xds = USER_DS_RPL(USER_RING); \
+ regs->xes = USER_DS_RPL(USER_RING); \
+ regs->xss = USER_DS_RPL(USER_RING); \
+ regs->xcs = USER_CS_RPL(USER_RING); \
         regs->eip = new_eip; \
         regs->esp = new_esp; \
 } while (0)
diff --exclude-from=/home/ldb/src/linux-exclude -urNdp linux-2.5.44/include/asm-i386/segment.h linux-2.5.44_multiring/include/asm-i386/segment.h
--- linux-2.5.44/include/asm-i386/segment.h 2002-10-27 02:38:39.000000000 +0100
+++ linux-2.5.44_multiring/include/asm-i386/segment.h 2002-10-27 00:44:04.000000000 +0200
@@ -1,6 +1,12 @@
 #ifndef _ASM_SEGMENT_H
 #define _ASM_SEGMENT_H
 
+/* The default user privilege level */
+#define USER_RING 3
+
+/* The most privileged user level in multiring mode */
+#define MULTIRING_USER_RING 1
+
 /*
  * The layout of the per-CPU GDT under Linux:
  *
@@ -43,10 +49,10 @@
 #define TLS_SIZE (GDT_ENTRY_TLS_ENTRIES * 8)
 
 #define GDT_ENTRY_DEFAULT_USER_CS 4
-#define __USER_CS (GDT_ENTRY_DEFAULT_USER_CS * 8 + 3)
+#define USER_CS_RPL(ring) (GDT_ENTRY_DEFAULT_USER_CS * 8 + (ring))
 
 #define GDT_ENTRY_DEFAULT_USER_DS 5
-#define __USER_DS (GDT_ENTRY_DEFAULT_USER_DS * 8 + 3)
+#define USER_DS_RPL(ring) (GDT_ENTRY_DEFAULT_USER_DS * 8 + (ring))
 
 #define GDT_ENTRY_KERNEL_BASE 12
 
@@ -76,4 +82,6 @@
  */
 #define IDT_ENTRIES 256
 
+#define IDT_SIZE (IDT_ENTRIES * 8)
+
 #endif
diff --exclude-from=/home/ldb/src/linux-exclude -urNdp linux-2.5.44/include/asm-i386/system.h linux-2.5.44_multiring/include/asm-i386/system.h
--- linux-2.5.44/include/asm-i386/system.h 2002-10-27 02:38:39.000000000 +0100
+++ linux-2.5.44_multiring/include/asm-i386/system.h 2002-10-27 00:44:04.000000000 +0200
@@ -95,6 +95,12 @@ static inline unsigned long _get_base(ch
                 : :"m" (*(unsigned int *)&(value)))
 
 /*
+ * Save a segment.
+ */
+#define savesegment(seg,value) \
+ asm volatile("movl %%" #seg ",%0":"=m" (*(int *)&(value)))
+
+/*
  * Clear and set 'TS' bit respectively
  */
 #define clts() __asm__ __volatile__ ("clts")
diff --exclude-from=/home/ldb/src/linux-exclude -urNdp linux-2.5.44/include/asm-i386/unistd.h linux-2.5.44_multiring/include/asm-i386/unistd.h
--- linux-2.5.44/include/asm-i386/unistd.h 2002-10-27 02:38:39.000000000 +0100
+++ linux-2.5.44_multiring/include/asm-i386/unistd.h 2002-10-27 00:44:04.000000000 +0200
@@ -258,6 +258,7 @@
 #define __NR_free_hugepages 251
 #define __NR_exit_group 252
 #define __NR_lookup_dcookie 253
+#define __NR_multiring 254
   
 
 /* user-visible error numbers are in the range -1 - -124: see <asm-i386/errno.h> */
diff --exclude-from=/home/ldb/src/linux-exclude -urNdp linux-2.5.44/include/linux/sched.h linux-2.5.44_multiring/include/linux/sched.h
--- linux-2.5.44/include/linux/sched.h 2002-10-27 02:38:39.000000000 +0100
+++ linux-2.5.44_multiring/include/linux/sched.h 2002-10-27 02:36:47.000000000 +0200
@@ -51,6 +51,14 @@ struct exec_domain;
 #define CLONE_SETTID 0x00100000 /* write the TID back to userspace */
 #define CLONE_CLEARTID 0x00200000 /* clear the userspace TID */
 #define CLONE_DETACHED 0x00400000 /* parent wants no child-exit signal */
+#ifdef __i386__
+#define CLONE_IDT 0x00800000 /* set if IDT is shared between processes (note: CLONE_VM implies CLONE_IDT) */
+#define CLONE_CLEAR_IDT 0x01000000 /* set to clear the IDT */
+#define CLONE_EXEC CLONE_CLEAR_IDT
+#else
+#define CLONE_EXEC 0
+#endif
+
 
 /*
  * List of flags we want to share for kernel threads,
diff --exclude-from=/home/ldb/src/linux-exclude -urNdp linux-2.5.44/kernel/fork.c linux-2.5.44_multiring/kernel/fork.c
--- linux-2.5.44/kernel/fork.c 2002-10-27 02:38:39.000000000 +0100
+++ linux-2.5.44_multiring/kernel/fork.c 2002-10-27 00:44:04.000000000 +0200
@@ -435,7 +435,11 @@ static int copy_mm(unsigned long clone_f
         if (!mm_init(mm))
                 goto fail_nomem;
 
- if (init_new_context(tsk,mm))
+#ifdef init_new_context_flags
+ if (init_new_context_flags(tsk, mm, clone_flags))
+#else
+ if (init_new_context(tsk, mm))
+#endif
                 goto free_pt;
 
         down_write(&oldmm->mmap_sem);
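
Usage sketch (not part of the patch): multiring_set() takes the two raw
32-bit words of a gate descriptor, so a caller has to pack them itself.
The following is a minimal user-space sketch of that packing, assuming the
standard i386 trap-gate layout, plus the selector arithmetic used by
USER_CS_RPL()/USER_DS_RPL() in segment.h. The names selector(), pack_gate()
and gate_dpl() are hypothetical helpers invented for illustration; the GDT
indices 4 and 5 are the ones defined in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Same shape as struct multiring_gate in multiring.h; uint32_t is used
   so the sketch also behaves correctly on 64-bit build hosts. */
struct gate {
    uint32_t a;   /* first 32-bit word of the gate */
    uint32_t b;   /* second 32-bit word of the gate */
};

#define GATE_PRESENT      0x8000u  /* P bit of the descriptor */
#define GATE_TYPE_TRAP32  0xFu     /* 32-bit trap gate type */

/* Hypothetical helper: selector for GDT entry `index` with RPL `ring`,
   i.e. the same arithmetic as USER_CS_RPL()/USER_DS_RPL(). */
static uint16_t selector(unsigned index, unsigned ring)
{
    return (uint16_t)(index * 8 + (ring & 3u));
}

/* Hypothetical helper: pack a present 32-bit trap gate that transfers
   control to sel:offset and may be invoked from rings <= dpl. */
static struct gate pack_gate(uint16_t sel, uint32_t offset, unsigned dpl)
{
    struct gate g;

    assert(dpl <= 3);
    /* Word a: target selector in the high half, offset bits 0..15 low. */
    g.a = ((uint32_t)sel << 16) | (offset & 0xffffu);
    /* Word b: offset bits 16..31, present bit, DPL, gate type. */
    g.b = (offset & 0xffff0000u) | GATE_PRESENT |
          ((uint32_t)dpl << 13) | (GATE_TYPE_TRAP32 << 8);
    return g;
}

/* Hypothetical helper: read the DPL back out of a packed gate. */
static unsigned gate_dpl(struct gate g)
{
    return (g.b >> 13) & 3u;
}
```

In multiring mode the two words would then be handed to the kernel with
something like multiring_set(vec, g.a, g.b). Whether a given gate is
accepted is decided by the checks in the patch (for example, DPL 0 gates
are documented above as returning EINVAL).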


