Re: [PATCH 07/13] [RFC] ipath core misc files

From: Andrew Morton
Date: Sat Dec 17 2005 - 15:38:49 EST

Next message: Andrew Morton: "Re: [PATCH 08/13] [RFC] ipath core last bit"
Previous message: Andrew Morton: "Re: [PATCH 04/13] [RFC] ipath LLD core, part 1"
In reply to: Robert Walsh: "Re: [PATCH 08/13] [RFC] ipath core last bit"
Next in thread: Robert Walsh: "Re: [PATCH 07/13] [RFC] ipath core misc files"

Roland Dreier <rolandd@xxxxxxxxx> wrote:
>
> ...
> +/*
> + * This isn't perfect, but it's close enough for timing work. We want this
> + * to work on systems where the cycle counter isn't the same as the clock
> + * frequency. The one msec spin is OK, since we execute this only once
> + * when first loaded. We don't use CURRENT_TIME because on some systems
> + * it only has jiffy resolution; we just assume udelay is well calibrated
> + * and that we aren't likely to be rescheduled. Do it multiple times,
> + * with a yield in between, to try to make sure we get the "true minimum"
> + * value.
> + * _ipath_pico_per_cycle isn't going to lead to completely accurate
> + * conversions from timestamps to nanoseconds, but it's close enough
> + * for our purposes, which is mainly to allow people to show events with
> + * nsecs or usecs if desired, rather than cycles.
> + */
> +void ipath_init_picotime(void)
> +{
> + int i;
> + u_int64_t ts, te, delta = -1ULL;
> +
> + for (i = 0; i < 5; i++) {
> + ts = get_cycles();
> + udelay(250);
> + te = get_cycles();
> + if ((te - ts) < delta)
> + delta = te - ts;
> + yield();
> + }
> + _ipath_pico_per_cycle = 250000000 / delta;
> +}

hm, I hope this is debug code which is going away. If not, we should take
a look at what it's trying to do here.
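
For reference, the arithmetic in the quoted function works out as follows: each sample spins for 250 usec, which is 250,000,000 psec, so dividing that by the smallest observed cycle delta gives picoseconds per cycle. Below is a minimal userspace sketch of just that calculation, assuming the cycle deltas have already been collected. The names pico_per_cycle_from_samples and SPIN_PSEC are made up for illustration and are not part of the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* 250 usec spin expressed in picoseconds (250 * 1,000,000). */
#define SPIN_PSEC 250000000ULL

/*
 * Hypothetical helper, not from the patch: given the cycle deltas
 * measured around each 250 usec spin, keep the smallest one and
 * convert it to picoseconds per cycle. Returns 0 if no usable
 * sample exists, so the caller never divides by zero.
 */
static uint64_t pico_per_cycle_from_samples(const uint64_t *deltas, size_t n)
{
	uint64_t min = UINT64_MAX;
	size_t i;

	for (i = 0; i < n; i++)
		if (deltas[i] < min)
			min = deltas[i];

	if (n == 0 || min == 0)
		return 0;

	/* Integer division truncates: 750000 cycles gives 333, not 333.33. */
	return SPIN_PSEC / min;
}
```

With a 2 GHz cycle counter the minimum delta is about 500000 cycles, which gives 500 psec per cycle. The yield() between samples and the assumption that udelay() is well calibrated are deliberately left out of this sketch.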

> +/*
> + * Our version of the kernel mlock function. This function is no longer
> + * exposed, so we need to do it ourselves. It takes a given start page
> + * (page aligned user virtual address) and pins it and the following specified
> + * number of pages.
> + * For now, num_pages is always 1, but that will probably change at some
> + * point (because caller is doing expected sends on a single virtually
> + * contiguous buffer, so we can do all pages at once).
> + */
> +int ipath_mlock(unsigned long start_page, size_t num_pages, struct page **p)
> +{
> + int n;
> +
> + _IPATH_VDBG("pin %lx pages from vaddr %lx\n", num_pages, start_page);
> + down_read(&current->mm->mmap_sem);
> + n = get_user_pages(current, current->mm, start_page, num_pages, 1, 1,
> + p, NULL);
> + up_read(&current->mm->mmap_sem);
> + if (n != num_pages) {
> + _IPATH_INFO
> + ("get_user_pages (0x%lx pages starting at 0x%lx failed with %d\n",
> + num_pages, start_page, n);
> + if (n < 0) /* it's an errno */
> + return n;
> + return -ENOMEM; /* no way to know actual error */
> + }
> +
> + return 0;
> +}

OK. It's perhaps not a very well named function.

> +/*
> + * this is similar to ipath_mlock, but it's always one page, and we mark
> + * the page as locked for i/o, and shared. This is used for the user process
> + * page that contains the destination address for the rcvhdrq tail update,
> + * so we need to have the vma. If we don't do this, the page can be taken
> + * away from us on fork, even if the child never touches it, and then
> + * the user process never sees the tail register updates.
> + */
> +int ipath_mlock_nocopy(unsigned long start_page, struct page **p)
> +{
> + int n;
> + struct vm_area_struct *vm = NULL;
> +
> + down_read(&current->mm->mmap_sem);
> + n = get_user_pages(current, current->mm, start_page, 1, 1, 1, p, &vm);
> + up_read(&current->mm->mmap_sem);
> + if (n != 1) {
> + _IPATH_INFO("get_user_pages for 0x%lx failed with %d\n",
> + start_page, n);
> + if (n < 0) /* it's an errno */
> + return n;
> + return -ENOMEM; /* no way to know actual error */
> + }
> + vm->vm_flags |= VM_SHM | VM_LOCKED;
> +
> + return 0;
> +}

I don't think we want to be setting the user's VMA's vm_flags in this
manner. This is purely to retain the physical page across fork?

> +/*
> + * Our version of the kernel munlock function. This function is no longer
> + * exposed, so we need to do it ourselves. It unpins the start page
> + * (a page aligned full user virtual address, not a page number)
> + * and pins it and the following specified number of pages.
> + */
> +int ipath_munlock(size_t num_pages, struct page **p)
> +{
> + int i;
> +
> + for (i = 0; i < num_pages; i++) {
> + _IPATH_MMDBG("%u/%lu put_page %p\n", i, num_pages, p[i]);
> + SetPageDirty(p[i]);
> + put_page(p[i]);
> + }
> + return 0;
> +}

Nope, SetPageDirty() doesn't tell the VM that the page is dirty - it'll
never get written out. Use set_page_dirty_lock().
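
A minimal sketch of the release loop with that change applied. So that it builds outside the kernel, struct page, set_page_dirty_lock() and put_page() are replaced here by trivial stand-ins (these are not the kernel's definitions; in the driver they come from <linux/mm.h>), and the name release_user_pages_sketch is hypothetical.

```c
#include <stddef.h>

/* Stand-ins so the sketch builds in userspace; NOT the kernel definitions. */
struct page {
	int dirty_via_vm;	/* set when the VM was told about the dirtying */
	int refcount;
};

static int set_page_dirty_lock(struct page *page)
{
	/* The real function locks the page and calls set_page_dirty(). */
	page->dirty_via_vm = 1;
	return 1;
}

static void put_page(struct page *page)
{
	page->refcount--;
}

/*
 * Hypothetical replacement for the loop in ipath_munlock(): mark each
 * page dirty through the VM, then drop the reference taken by
 * get_user_pages(). Uses size_t for the index to match num_pages.
 */
static void release_user_pages_sketch(size_t num_pages, struct page **p)
{
	size_t i;

	for (i = 0; i < num_pages; i++) {
		set_page_dirty_lock(p[i]);
		put_page(p[i]);
	}
}
```

The only change relative to the quoted loop is the dirtying call, plus a size_t index so the comparison against num_pages is not between signed and unsigned types.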

