Fwd: [PATCH 2/2] [RFC] fadvise: Add _VOLATILE,_ISVOLATILE, and _NONVOLATILE flags
From: Dmitry Adamushko
Date: Sun Feb 12 2012 - 07:54:34 EST
[ resent to lkml in 'plain-text' format ]
On 10 February 2012 01:16, John Stultz <john.stultz@xxxxxxxxxx> wrote:
[ ... ]
> --- /dev/null
> +++ b/mm/volatile.c
> @@ -0,0 +1,314 @@
> +/* mm/volatile.c
> + *
> [ ... ]
>
> +
> +#define range_on_lru(range) (!(range)->purged)
> +
> +
> +static inline void volatile_range_shrink(struct volatile_range *range,
> + pgoff_t start_index, pgoff_t end_index)
> +{
> + size_t pre = range_size(range);
> +
> + range->range_node.start = start_index;
> + range->range_node.end = end_index;
> +
I guess we get a whole range of races with volatile_shrink() here:
.start and .end are updated without volatile_lru_mutex held, so the
shrinker may see an inconsistent (in-the-middle-of-update) range,
e.g. the new .start together with the old .end.
>
> + if (range_on_lru(range)) {
Suppose volatile_shrink() runs here: it sets range->purged to 1, then
calls __lru_del() => lru_count gets updated.
>
> + mutex_lock(&volatile_lru_mutex);
> + lru_count -= pre - range_size(range);
> + mutex_unlock(&volatile_lru_mutex);
and then lru_count gets updated once more - for the same 'range' object.
>
> + }
> +}
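To illustrate the two points above, here is a minimal userspace sketch of one possible fix, not the patch's actual code. It assumes that range_size() is end - start + 1 and that a single mutex covers both the range fields and the counter. The names vrange, vrange_shrink, vrange_purge and lru_mutex are hypothetical stand-ins, and pthread_mutex_t stands in for the kernel mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for struct volatile_range. */
struct vrange {
	uint64_t start, end;	/* inclusive page indexes (assumption) */
	int purged;
};

static pthread_mutex_t lru_mutex = PTHREAD_MUTEX_INITIALIZER;
static uint64_t lru_count;

static uint64_t vrange_size(const struct vrange *r)
{
	return r->end - r->start + 1;
}

/*
 * Shrink a range. The field update, the purged check and the counter
 * update all happen under the same lock, so the purge path can neither
 * see a half-updated range nor slip in between check and update.
 */
static void vrange_shrink(struct vrange *r, uint64_t start, uint64_t end)
{
	assert(start <= end);

	pthread_mutex_lock(&lru_mutex);
	uint64_t pre = vrange_size(r);

	r->start = start;
	r->end = end;
	if (!r->purged)
		lru_count -= pre - vrange_size(r);
	pthread_mutex_unlock(&lru_mutex);
}

/* Purge path: accounts for the range exactly once. */
static void vrange_purge(struct vrange *r)
{
	pthread_mutex_lock(&lru_mutex);
	if (!r->purged) {
		r->purged = 1;
		lru_count -= vrange_size(r);
	}
	pthread_mutex_unlock(&lru_mutex);
}
```

With this ordering, a range that has already been purged is never subtracted from the counter a second time by a later shrink.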
>
> [ ... ]
>
>
> +static int volatile_shrink(struct shrinker *ignored, struct shrink_control *sc)
> +{
> + struct volatile_range *range, *next;
> + unsigned long nr_to_scan = sc->nr_to_scan;
> + const gfp_t gfp_mask = sc->gfp_mask;
> +
> + /* We might recurse into filesystem code, so bail out if necessary */
> + if (nr_to_scan && !(gfp_mask & __GFP_FS))
> + return -1;
> + if (!nr_to_scan)
> + return lru_count;
So it's u64 -> int here, which is possibly 32 bits and signed. Can't
it lead to inconsistent results (a truncated or negative count) on
32-bit platforms?
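One way to avoid the narrowing problem is to clamp the counter before returning it. The following is only a userspace sketch under that assumption; count_for_shrinker is a hypothetical helper, not something from the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/*
 * Narrow a 64-bit object count to the int the shrinker callback
 * returns. Values above INT_MAX are clamped instead of being
 * truncated, so the result can never come out negative.
 */
static int count_for_shrinker(uint64_t count)
{
	int ret = count > (uint64_t)INT_MAX ? INT_MAX : (int)count;

	assert(ret >= 0);
	return ret;
}
```

Clamping loses precision for very large counts, but the caller only gets a smaller-than-real estimate instead of a value that could be mistaken for the -1 error return.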
>
> +
> + mutex_lock(&volatile_lru_mutex);
> + list_for_each_entry_safe(range, next, &volatile_lru_list, lru) {
> + struct inode *inode = range->mapping->host;
> + loff_t start, end;
> +
> +
> + start = range->range_node.start * PAGE_SIZE;
> + end = (range->range_node.end + 1) * PAGE_SIZE - 1;
PAGE_CACHE_SHIFT was used in fadvise() to calculate .start and .end
indexes, and here we use PAGE_SIZE to get back to 'normal' addresses.
Isn't it inconsistent at the very least?
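A sketch of what a consistent conversion could look like, with both directions going through the same shift constant. SKETCH_PAGE_SHIFT and the helper names are hypothetical, and the value 12 is only an assumption for the example; the inclusive-end convention is taken from the "(end + 1) * PAGE_SIZE - 1" expression above.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* hypothetical; stands in for the kernel constant */

/* Byte offset -> page index. */
static uint64_t byte_to_index(uint64_t offset)
{
	return offset >> SKETCH_PAGE_SHIFT;
}

/*
 * Inclusive page index range -> inclusive byte range, using the same
 * shift as byte_to_index() so the two directions cannot disagree.
 */
static void index_range_to_bytes(uint64_t first, uint64_t last,
				 uint64_t *start, uint64_t *end)
{
	assert(first <= last);

	*start = first << SKETCH_PAGE_SHIFT;
	*end = ((last + 1) << SKETCH_PAGE_SHIFT) - 1;
}
```

Converting the resulting byte range back with byte_to_index() yields the original first and last indexes.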
>
> +
> + /*
> + * XXX - calling vmtruncate_range from a shrinker causes
> + * lockdep warnings. Revisit this!
> + */
> + vmtruncate_range(inode, start, end);
> + range->purged = 1;
> + __lru_del(range);
> +
> + nr_to_scan -= range_size(range);
hmm, unsigned long -= u64
>
> + if (nr_to_scan <= 0)
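nr_to_scan is "unsigned long" :-)) so it can never be < 0. If
range_size() is larger than what is left, the subtraction above wraps
around to a huge value and this check only fires on exactly 0.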
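A saturating subtraction would sidestep both the width mismatch and the wrap-around. This is a userspace sketch only; consume_budget is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Subtract 'used' from the remaining scan budget without wrapping.
 * The comparison is done in 64 bits, so it is also correct where
 * unsigned long is only 32 bits wide.
 */
static unsigned long consume_budget(unsigned long nr_to_scan, uint64_t used)
{
	unsigned long left;

	if (used >= nr_to_scan)
		return 0;

	/* used < nr_to_scan here, so the narrowing cast cannot truncate */
	left = nr_to_scan - (unsigned long)used;
	assert(left <= nr_to_scan);
	return left;
}
```

The loop can then test the returned budget against 0 directly.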
[ ... ]
> +arch_initcall(volatile_init);
> --
> 1.7.3.2.146.gca209
>
--
-- Dmitry