On 6/30/25 11:05, Stefano Garzarella wrote:
On Sun, Jun 29, 2025 at 11:26:12PM +0200, Michal Luczaj wrote:...
On 6/27/25 10:02, Stefano Garzarella wrote:
On Wed, Jun 25, 2025 at 11:23:30PM +0200, Michal Luczaj wrote:
On 6/25/25 10:43, Stefano Garzarella wrote:
On Fri, Jun 20, 2025 at 09:52:43PM +0200, Michal Luczaj wrote:
vsock_find_cid() and vsock_dev_do_ioctl() may race with module unload.
transport_{g2h,h2g} may become NULL after the NULL check.
Introduce vsock_transport_local_cid() to protect from a potential
null-ptr-deref.
KASAN: null-ptr-deref in range [0x0000000000000118-0x000000000000011f]
RIP: 0010:vsock_find_cid+0x47/0x90
Call Trace:
__vsock_bind+0x4b2/0x720
vsock_bind+0x90/0xe0
__sys_bind+0x14d/0x1e0
__x64_sys_bind+0x6e/0xc0
do_syscall_64+0x92/0x1c0
entry_SYSCALL_64_after_hwframe+0x4b/0x53
KASAN: null-ptr-deref in range [0x0000000000000118-0x000000000000011f]
RIP: 0010:vsock_dev_do_ioctl.isra.0+0x58/0xf0
Call Trace:
__x64_sys_ioctl+0x12d/0x190
do_syscall_64+0x92/0x1c0
entry_SYSCALL_64_after_hwframe+0x4b/0x53
Fixes: c0cfa2d8a788 ("vsock: add multi-transports support")
Suggested-by: Stefano Garzarella <sgarzare@xxxxxxxxxx>
Signed-off-by: Michal Luczaj <mhal@xxxxxxx>
Oh, and come to think of it, we don't really need that (easily contended?)
mutex here. Same can be done with RCU. Which should speed up vsock_bind()
-> __vsock_bind() -> vsock_find_cid(), right? This is what I mean, roughly:
+static u32 vsock_registered_transport_cid(const struct vsock_transport
__rcu **trans_ptr)
+{
+ const struct vsock_transport *transport;
+ u32 cid = VMADDR_CID_ANY;
+
+ rcu_read_lock();
+ transport = rcu_dereference(*trans_ptr);
+ if (transport)
+ cid = transport->get_local_cid();
+ rcu_read_unlock();
+
+ return cid;
+}
...
@@ -2713,6 +2726,7 @@ void vsock_core_unregister(const struct
vsock_transport *t)
transport_local = NULL;
mutex_unlock(&vsock_register_mutex);
+ synchronize_rcu();
}
I've realized I'm throwing multiple unrelated ideas/questions, so let me
summarise:
1. Hackish macro can be used to guard against calling
vsock_registered_transport_cid() on a non-static variable.
2. We can comment the function to add some context and avoid confusion.
I'd go with 2.
All right, will do.
3. Instead of taking mutex in vsock_registered_transport_cid() we can use RCU.
Since the vsock_bind() is not in the hot path, maybe a mutex is fine.
WDYT?
I wrote a benchmark that attempts (and fails due to a non-existing CID) to
bind() 100s of vsocks in multiple threads. `perf lock con` shows that this
mutex is contended, and things are slowed down by 100+% compared with RCU
approach. Which makes sense: every explicit vsock bind() across the whole
system would need to acquire the mutex. And now we're also taking the same
mutex in vsock_assign_transport(), i.e. during connect(). But maybe such
stress testing is just unrealistic, I really don't know.