Re: [PATCH v2] swap: choose swap device according to numa node

From: Andrew Morton
Date: Thu Aug 17 2017 - 18:44:23 EST


On Wed, 16 Aug 2017 10:44:40 +0800 Aaron Lu <aaron.lu@xxxxxxxxx> wrote:

>
> If the system has more than one swap device and the swap devices have node
> information, we can make use of this information to decide which swap
> device to use in get_swap_pages() to get better performance.
>
> The current code uses a priority based list, swap_avail_list, to decide
> which swap device to use and if multiple swap devices share the same
> priority, they are used round robin. This patch changes the previous
> single global swap_avail_list into a per-numa-node list, i.e. for each
> numa node, it sees its own priority based list of available swap devices.
> A swap device's priority can be promoted on its matching node's
> swap_avail_list.
>
> Currently a swap device's priority is set as follows: the user can set a
> value >= 0, or the system will pick one starting from -1 and going
> downwards. The priority value in the swap_avail_list is the negated value
> of the swap device's priority, because plist is sorted from low to high.
> The new policy doesn't change the semantics for priority >= 0 cases; the
> system-picked values now start from -2 and go downwards, and -1 is
> reserved as the promoted value.
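
As a reading aid, the mapping described above might be sketched roughly as
below. The helper name avail_list_prio() is hypothetical, and the sketch
assumes that only system-assigned (negative) priorities get promoted, which
is how I read "doesn't change the semantics for priority >=0 cases".

```c
/*
 * Hypothetical helper, not taken from the patch: the plist priority a
 * swap device would get on the avail list of node list_nid.
 * plists sort from low to high, so the device priority is negated.
 */
static int avail_list_prio(int dev_prio, int dev_nid, int list_nid)
{
	/* Assumption: only auto-assigned (negative) priorities are promoted. */
	if (dev_prio < 0 && dev_nid == list_nid)
		return 1;	/* negation of the reserved value -1 */

	return -dev_prio;
}
```

With this mapping a user-set priority of 5 lands at plist priority -5 on
every node, while an auto-assigned -2 lands at 1 on its own node and at 2
elsewhere, so the local device sorts first among the auto-assigned ones.
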
>
> ...
>
> +static int __init swapfile_init(void)
> +{
> +	int nid;
> +
> +	swap_avail_heads = kmalloc(nr_node_ids * sizeof(struct plist_head), GFP_KERNEL);

I suppose we should use kmalloc_array(), as someone wrote it for us.

--- a/mm/swapfile.c~swap-choose-swap-device-according-to-numa-node-v2-fix
+++ a/mm/swapfile.c
@@ -3700,7 +3700,8 @@ static int __init swapfile_init(void)
 {
 	int nid;
 
-	swap_avail_heads = kmalloc(nr_node_ids * sizeof(struct plist_head), GFP_KERNEL);
+	swap_avail_heads = kmalloc_array(nr_node_ids, sizeof(struct plist_head),
+					 GFP_KERNEL);
 	if (!swap_avail_heads) {
 		pr_emerg("Not enough memory for swap heads, swap is disabled\n");
 		return -ENOMEM;
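
For anyone wondering what kmalloc_array() buys over the open-coded
multiply: it refuses the allocation if n * size would overflow. A minimal
userspace sketch of that check follows. alloc_array_checked() is a
hypothetical stand-in built on malloc(), not the kernel implementation.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical userspace stand-in for kmalloc_array(): return NULL
 * instead of allocating when n * size would overflow size_t.
 */
static void *alloc_array_checked(size_t n, size_t size)
{
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;	/* the product would wrap around */

	return malloc(n * size);
}
```

nr_node_ids is small enough that overflow isn't a real risk here, so the
change is mostly about using the helper consistently.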

> +	if (!swap_avail_heads) {
> +		pr_emerg("Not enough memory for swap heads, swap is disabled\n");

checkpatch tells us that "Not enough memory" is a bit redundant, as the
memory allocator will already have warned. So it's sufficient to say
only "swap is disabled" here. But it's hardly worth changing.

> +		return -ENOMEM;
> +	}
> +
> +	for_each_node(nid)
> +		plist_head_init(&swap_avail_heads[nid]);
> +
> +	return 0;
> +}
> +subsys_initcall(swapfile_init);