RE: [PATCH v9 net-next 1/2] hv_sock: introduce Hyper-V Sockets

From: Dexuan Cui
Date: Sun May 08 2016 - 02:11:28 EST


> From: David Miller [mailto:davem@xxxxxxxxxxxxx]
> Sent: Sunday, May 8, 2016 1:41
> To: Dexuan Cui <decui@xxxxxxxxxxxxx>
> Cc: gregkh@xxxxxxxxxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx; linux-
> kernel@xxxxxxxxxxxxxxx; devel@xxxxxxxxxxxxxxxxxxxxxx; olaf@xxxxxxxxx;
> apw@xxxxxxxxxxxxx; jasowang@xxxxxxxxxx; cavery@xxxxxxxxxx; KY
> Srinivasan <kys@xxxxxxxxxxxxx>; Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>;
> joe@xxxxxxxxxxx; vkuznets@xxxxxxxxxx
> Subject: Re: [PATCH v9 net-next 1/2] hv_sock: introduce Hyper-V Sockets
>
> From: Dexuan Cui <decui@xxxxxxxxxxxxx>
> Date: Sat, 7 May 2016 10:49:25 +0000
>
> > I should be able to make 'send', 'recv' here to pointers and use vmalloc()
> > to allocate the memory for them. I will do this.
>
> That's still unswappable kernel memory.
Hi David,
My understanding is: kernel pages are not swappable in Linux, so it looks I
can't avoid unswappable kernel memory here?

> People can open N sockets, where N is something on the order of the FD
> limit the process has, per process. This allows someone to quickly
> eat up a lot of memory and hold onto it nearly indefinitely.

Thanks for pointing this out!
I understand, so I think I should add a module parameter, e.g.,
"hv_sock.max_socket_number" with a default value, say, 1024?

1 established hv_sock connection takes less than 20 pages, including 10
pages for VMBus ringbuffer, 6 pages for send/recv buffers(I'll use
vmalloc() for this), etc.
Here the recv buf needs a size of 5 pages because potentially the host
can send the guest a VMBus packet with an up-to-5-page payload, i..e,
the VMBus inbound ringbuffer size.

1024 hv_sock connections take less than 20*4KB * 1K = 80MB memory.

A user who needs more connections can change the module parameter
without reboot.

hv_sock connection is designed to work only between the host and the
guest. I think 1024 connections seem pretty enough.

BTW, a user can't create hv_sock connections without enough privilege.
Please see

+static int hvsock_create(struct net *net, struct socket *sock,
+ int protocol, int kern)
+{
+ if (!capable(CAP_SYS_ADMIN) && !capable(CAP_NET_ADMIN))
+ return -EPERM;

David, does this make sense to you?

Thanks,
-- Dexuan