Re: Help: vfs problem?

From: Russell King (rmk@arm.linux.org.uk)
Date: Sun May 21 2000 - 13:03:59 EST


Alan Cox writes:
> One possible thing to try on the debugging side is at the 'official'
> end of the DMA turn off the master bit for the IDE controller, do an
> arbitrary pci read (to flush anything pending) then do a check

Ok, I've made the following changes in ide-dma.c (still running without
caches):

ide_build_dmatable():
...
        HWIF(drive)->sg_nents = i = ide_build_sglist(HWIF(drive), HWGROUP(drive)->rq);

        check_free_lists(1);
        /* turn on master */
        pci_write_config_word(PCIDEV(HWIF(drive)), PCI_COMMAND, PCI_COMMAND_MASTER | PCI_COMMAND_IO);
        sg = HWIF(drive)->sg_table;
...

ide_destroy_dmatable():
...
        /* turn off master bit */
        pci_write_config_word(dev, PCI_COMMAND, PCI_COMMAND_IO);
        inb(HWIF(drive)->dma_base+2);
        check_free_lists(3);
}

The following is the code in my __remove_from_free_list() in fs/buffer.c:

static void __remove_from_free_list(struct buffer_head * bh, int index)
{
        if (!bh->b_prev_free || !bh->b_next_free) {
                printk("Corrupted bh @ %p idx %d: next %p, prev %p, count %d\n", bh, index,
                       bh->b_next_free, bh->b_prev_free, atomic_read(&bh->b_count));
        }
        if(bh->b_next_free == bh)
...

Therefore, except when we expect a BM-DMA transfer to occur, we should
not be getting any PCI accesses to the SDRAM. Also, before the
check_free_lists(3) is run, we can be certain that all the data in the
PCI fifos is flushed to SDRAM.
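
As a rough userspace illustration of the master-bit toggling (not the actual
ide-dma.c change), the sketch below models the PCI command register as a plain
16-bit field. The struct and the mock_* helpers are made-up names for this
example; only the PCI_COMMAND_* bit values are the standard ones. Unlike the
debug change above, which writes the whole command word, this variant does a
read-modify-write so that any other bits already set are preserved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit values of the PCI command register (standard PCI definitions). */
#define PCI_COMMAND_IO     0x1  /* respond to I/O space accesses */
#define PCI_COMMAND_MEMORY 0x2  /* respond to memory space accesses */
#define PCI_COMMAND_MASTER 0x4  /* device may act as bus master */

/* Hypothetical stand-in for one device's command register. */
struct mock_pci_dev {
    uint16_t command;
};

/* Set or clear the bus-master bit, leaving every other bit untouched. */
static void mock_set_master(struct mock_pci_dev *dev, int enable)
{
    uint16_t cmd;

    assert(dev != NULL);
    cmd = dev->command;                        /* read */
    if (enable)
        cmd |= PCI_COMMAND_MASTER;             /* modify: allow mastering */
    else
        cmd &= (uint16_t)~PCI_COMMAND_MASTER;  /* modify: forbid mastering */
    dev->command = cmd;                        /* write back */
}

/* Nonzero if the device is currently allowed to master the bus. */
static int mock_is_master(const struct mock_pci_dev *dev)
{
    assert(dev != NULL);
    return (dev->command & PCI_COMMAND_MASTER) != 0;
}
```

In real kernel code the read and write would go through
pci_read_config_word() and pci_write_config_word() rather than a struct field.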

The first instance of corruption is in ide_build_dmatable(). My debug
output contains:

check_free_lists1: sz 1: bad bh@c1c22240: next c1c47ec0, prev c1c222a0 dev 0000ffff
     next->prev = 00000000
     prev->next = c1c22240
check_free_lists3: sz 1: bad bh@c1c22240: next c1c47ec0, prev c1c222a0 dev 0000ffff
     next->prev = 00000000
     prev->next = c1c22240

This repeats many times, until the bh at 0xc1c47ec0 becomes the next
bh to be taken off the free list:

check_free_lists1: sz 1: bad bh@c1c47ec0: next 00000000, prev c0e1de40 dev 00000000
     prev->next = c1c47ec0
check_free_lists3: sz 1: bad bh@c1c47ec0: next 00000000, prev c0e1de40 dev 00000000
     prev->next = c1c47ec0
Corrupted bh @ c1c47ec0 idx 1: next 00000000, prev c0e1de40, count 0
<1>Unable to handle kernel NULL pointer dereference at virtual address 00000024

That last line is the oops in __remove_from_free_list().

So, we appear to have a bh on the free list with a NULL next free pointer, but a
completely correct prev free pointer, and with a dev field of zero, rather than
B_FREE.
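
The kind of link check that produces "bad bh" reports like the ones above can
be sketched in userspace as follows. This is only an illustration under
assumptions: the real check_free_lists() is not shown in this mail, and
struct mini_bh and mini_check_free_list() are hypothetical names. It assumes
a circular doubly linked free list and flags the first node whose next or
prev pointer is NULL, or whose neighbours do not point back at it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cut-down buffer head: only the free-list links. */
struct mini_bh {
    struct mini_bh *next_free;
    struct mini_bh *prev_free;
};

/*
 * Walk a circular doubly linked free list starting at head and return the
 * first node whose links are inconsistent, or NULL if every link agrees.
 * max_nodes bounds the walk so a corrupted list cannot loop forever.
 */
static struct mini_bh *mini_check_free_list(struct mini_bh *head,
                                            size_t max_nodes)
{
    struct mini_bh *bh = head;
    size_t i;

    assert(max_nodes > 0);
    if (head == NULL)
        return NULL;            /* empty list is trivially consistent */

    for (i = 0; i < max_nodes; i++) {
        if (bh->next_free == NULL || bh->prev_free == NULL)
            return bh;          /* one of this node's links was zeroed */
        if (bh->next_free->prev_free != bh)
            return bh;          /* next->prev does not point back */
        if (bh->prev_free->next_free != bh)
            return bh;          /* prev->next does not point back */
        bh = bh->next_free;
        if (bh == head)
            return NULL;        /* came full circle, all links agree */
    }
    return bh;                  /* never got back to head within the bound */
}
```

With this ordering of tests, a node whose successor has a zeroed prev pointer
is reported first, which is the same shape as the "next->prev = 00000000"
lines in the debug output.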

When a buffer is on the free list, I assume that the b_blocknr and b_dev_id
elements are never used? I'd like to use these to make "notes" into the bh
struct about where the bh was placed onto the queue from. (I'll probably
have tried it before anyone gets around to replying...)
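
A minimal userspace sketch of that "notes" idea, assuming the two fields
really are idle while a buffer sits on the free list (which is exactly the
open question above). struct note_bh, note_free_origin() and the
NOTE_FREE_ORIGIN() macro are hypothetical names for illustration, not
anything in fs/buffer.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cut-down buffer head with two fields assumed idle while free. */
struct note_bh {
    unsigned long blocknr;   /* reused: source line of the caller */
    const void *dev_id;      /* reused: name of the calling function */
};

/* Record which call site put this buffer on the free list. */
static void note_free_origin(struct note_bh *bh, const char *func,
                             unsigned long line)
{
    assert(bh != NULL);
    bh->dev_id = func;
    bh->blocknr = line;
}

/* Convenience wrapper capturing the current function and line (C99). */
#define NOTE_FREE_ORIGIN(bh) \
    note_free_origin((bh), __func__, (unsigned long)__LINE__)
```

A corruption report could then print the two fields alongside the next and
prev pointers to show where the bad bh was last queued from.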
--
Russell King (rmk@arm.linux.org.uk)
http://www.arm.linux.org.uk/~rmk/aboutme.html
THE developer of ARM Linux



