Re: [patch] 2.4.19-pre10-ac2: O(1) scheduler merge, -A3.

From: William Lee Irwin III (wli@holomorphy.com)
Date: Tue Jun 18 2002 - 02:16:26 EST


On Mon, Jun 17, 2002 at 11:00:26AM +0200, Ingo Molnar wrote:
> irqbalance uses the set_ioapic_affinity() method to set affinity. The
> clustered APIC code is broken if it doesnt handle this properly. (i dont
> have such hardware so i cant tell, but it indeed doesnt appear to handle
> this case properly.) By wrapping around at node boundary the irqbalance
> code will work just fine.

Perhaps a brief look at the code will help. Please forgive my
non-preservation of whitespace as I cut and pasted it.

static inline void balance_irq(int irq)
{
#if CONFIG_SMP
    irq_balance_t *entry = irq_balance + irq;
    unsigned long now = jiffies;

    if (unlikely(entry->timestamp != now)) {
        unsigned long allowed_mask;
        int random_number;

        rdtscl(random_number);
        random_number &= 1;

        allowed_mask = cpu_online_map & irq_affinity[irq];
        entry->timestamp = now;
        entry->cpu = move(entry->cpu, allowed_mask, now, random_number);
        set_ioapic_affinity(irq, 1 << entry->cpu);
    }
#endif
}

        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        1 << entry->cpu

This could be problematic ...

static void set_ioapic_affinity (unsigned int irq, unsigned long mask)
{
    unsigned long flags;

    /*
     * Only the first 8 bits are valid.
     */
    mask = mask << 24;
    spin_lock_irqsave(&ioapic_lock, flags);
    __DO_ACTION(1, = mask, )
    spin_unlock_irqrestore(&ioapic_lock, flags);
}

According to this, nothing over 8 cpu's can work as the cpu id is used
as a shift into an 8-bit bitfield. Also,

#define __DO_ACTION(R, ACTION, FINAL) \
                                                                        \
{ \
        int pin; \
        struct irq_pin_list *entry = irq_2_pin + irq; \
                                                                        \
        for (;;) { \
                unsigned int reg; \
                pin = entry->pin; \
                if (pin == -1) \
                        break; \
                reg = io_apic_read(entry->apic, 0x10 + R + pin*2); \
                reg ACTION; \
                io_apic_modify(entry->apic, reg); \
                if (!entry->next) \
                        break; \
                entry = irq_2_pin + entry->next; \
        } \
        FINAL; \
}

ACTION is supposed to be an assignment to reg; in clustered hierarchical
destination format this is not a bitmask as assumed by 1 << entry->cpu.

Matt, Mike, please comment.

Cheers,
Bill
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sun Jun 23 2002 - 22:00:15 EST