Re: [PATCH net 1/3] vsock: Fix transport_{h2g,g2h} TOCTOU

From: Stefano Garzarella
Date: Fri Jun 20 2025 - 09:21:15 EST


On Fri, Jun 20, 2025 at 02:58:49PM +0200, Michal Luczaj wrote:
On 6/20/25 10:32, Stefano Garzarella wrote:
On Wed, Jun 18, 2025 at 02:34:00PM +0200, Michal Luczaj wrote:
Checking transport_{h2g,g2h} != NULL may race with vsock_core_unregister().
Make sure pointers remain valid.

KASAN: null-ptr-deref in range [0x0000000000000118-0x000000000000011f]
RIP: 0010:vsock_dev_do_ioctl.isra.0+0x58/0xf0
Call Trace:
__x64_sys_ioctl+0x12d/0x190
do_syscall_64+0x92/0x1c0
entry_SYSCALL_64_after_hwframe+0x4b/0x53

Fixes: c0cfa2d8a788 ("vsock: add multi-transports support")
Signed-off-by: Michal Luczaj <mhal@xxxxxxx>
---
net/vmw_vsock/af_vsock.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 2e7a3034e965db30b6ee295370d866e6d8b1c341..047d1bc773fab9c315a6ccd383a451fa11fb703e 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -2541,6 +2541,8 @@ static long vsock_dev_do_ioctl(struct file *filp,

switch (cmd) {
case IOCTL_VM_SOCKETS_GET_LOCAL_CID:
+ mutex_lock(&vsock_register_mutex);
+
/* To be compatible with the VMCI behavior, we prioritize the
* guest CID instead of well-know host CID (VMADDR_CID_HOST).
*/
@@ -2549,6 +2551,8 @@ static long vsock_dev_do_ioctl(struct file *filp,
else if (transport_h2g)
cid = transport_h2g->get_local_cid();

+ mutex_unlock(&vsock_register_mutex);


What about if we introduce a new `vsock_get_local_cid`:

u32 vsock_get_local_cid() {
u32 cid = VMADDR_CID_ANY;

mutex_lock(&vsock_register_mutex);
/* To be compatible with the VMCI behavior, we prioritize the
* guest CID instead of well-know host CID (VMADDR_CID_HOST).
*/
if (transport_g2h)
cid = transport_g2h->get_local_cid();
else if (transport_h2g)
cid = transport_h2g->get_local_cid();
mutex_lock(&vsock_register_mutex);

return cid;
}


And we use it here, and in the place fixed by next patch?

I think we can fix all in a single patch, the problem here is to call
transport_*->get_local_cid() without the lock IIUC.

Do you mean:

bool vsock_find_cid(unsigned int cid)
{
- if (transport_g2h && cid == transport_g2h->get_local_cid())
+ if (transport_g2h && cid == vsock_get_local_cid())
return true;

?

Nope, I meant:

bool vsock_find_cid(unsigned int cid)
{
- if (transport_g2h && cid == transport_g2h->get_local_cid())
- return true;
-
- if (transport_h2g && cid == VMADDR_CID_HOST)
+ if (cid == vsock_get_local_cid())
return true;

if (transport_local && cid == VMADDR_CID_LOCAL)

But now I'm thinking if we should also include `transport_local` in the new `vsock_get_local_cid()`.

I think that will fix an issue when calling IOCTL_VM_SOCKETS_GET_LOCAL_CID and only vsock-loopback kernel module is loaded, so maybe we can do 2 patches:

1. fix IOCTL_VM_SOCKETS_GET_LOCAL_CID to check also `transport_local`
Fixes: 0e12190578d0 ("vsock: add local transport support in the vsock core")

2. move that code in vsock_get_local_cid() with proper locking and use it also in vsock_find_cid()

WDYT?

Thanks,
Stefano