Re: [PATCH 0/8] tracing: introducing eventfs

From: Ajay Kaher
Date: Sun Jan 29 2023 - 13:07:44 EST




> On 23-Jan-2023, at 10:21 PM, Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>
> On Sun, 22 Jan 2023 22:37:08 +0530
> Ajay Kaher <akaher@xxxxxxxxxx> wrote:
>
>> Events Tracing infrastructure contains lot of files, directories
>> (internally in terms of inodes, dentries). And ends up by consuming
>> memory in MBs. We can have multiple events of Events Tracing, which
>> further requires more memory.
>>
>> Instead of creating inodes/dentries, eventfs could keep meta-data and
>> skip the creation of inodes/dentries. As and when require, eventfs will
>> create the inodes/dentries only for required files/directories.
>> Also eventfs would delete the inodes/dentries once no more requires
>> but preserve the meta data.
>>
>> Tracing events took ~9MB, with this approach it took ~4.5MB
>> for ~10K files/dir.
>>
>> [PATCH 1/8]: Introducing struct tracefs_inode
>> [PATCH 2/8]: Adding eventfs-dir-add functions
>> [PATCH 3/8]: Adding eventfs-file-add function
>> [PATCH 4/8]: Adding eventfs-file-directory-remove function
>> [PATCH 5/8]: Adding functions to create-eventfs-files
>> [PATCH 6/8]: Adding eventfs lookup, read, open functions
>> [PATCH 7/8]: Creating tracefs_inode_cache
>> [PATCH 8/8]: Moving tracing events to eventfs
>
> Hi Ajay,
>
> Thanks a lot for sending these out.
>
> Note, something went wrong with your threading, as all the patches should
> be a reply to this one, but instead, they all (including this email) are a
> reply to patch 1 ??

Not sure why this is happening, but I will try to fix it in v2.

> Also, for v2, can you address all the kernel test robot issues as well as
> what Dan Carpenter wrote. There's also a couple of whitespace issues.
>

Sure, in v2.

> Finally, when I run the ftrace selftests that are in the kernel repository:
>
> # cd linux.git
> # cd tools/testing/selftests/ftrace
> # ./ftracetest

Thanks. I was looking for a utility to test eventfs.

> It crashes with a NULL kernel dereference:
>
> [ 1021.844973] general protection fault, probably for non-canonical address 0x626f7270747365a6: 0000 [#1] PREEMPT SMP PTI
> [ 1021.848900] CPU: 2 PID: 1160 Comm: ftracetest Not tainted 6.2.0-rc3-test-00014-g1a351602422d #152
> [ 1021.852384] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-debian-1.16.0-5 04/01/2014
> [ 1021.855716] RIP: 0010:dcache_dir_open_wrapper+0x6b/0x1b0
> [ 1021.857700] Code: 75 28 e9 f7 00 00 00 48 8b 7b 10 48 85 ff 74 09 48 83 c7 58 e8 36 ad 0c 00 48 8b 43 18 48 8d 58 e8 49 39 c4 0f 84 d4 00 00 00 <80> 7b 4a 00 75 d7 c6 43 4a 01 48 8b 45 30 48 8d b8 a0 00 00 00 e8
> [ 1021.864170] RSP: 0018:ffffa68b40f0fcb0 EFLAGS: 00010296
> [ 1021.866133] RAX: 626f727074736574 RBX: 626f72707473655c RCX: ffff9c6bc08cb000
> [ 1021.868797] RDX: ffffffff89058dc0 RSI: ffff9c6bc09f6f00 RDI: ffff9c6bceef2810
> [ 1021.871389] RBP: ffff9c6bcee223c0 R08: ffffffff8a3b2da0 R09: ffff9c6bceef2810
> [ 1021.873953] R10: 0000000000000007 R11: 0000000000000002 R12: ffff9c6bc3664980
> [ 1021.876669] R13: ffff9c6bc09f6f00 R14: ffff9c6bceef2810 R15: ffff9c6bc09f6f00
> [ 1021.880350] FS: 00007f58e39ba740(0000) GS:ffff9c6d37c80000(0000) knlGS:0000000000000000
> [ 1021.883289] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1021.885401] CR2: 000055a0c1465000 CR3: 000000010f5a8003 CR4: 0000000000170ee0
> [ 1021.888117] Call Trace:
> [ 1021.889227] <TASK>
> [ 1021.890216] ? __pfx_dcache_dir_open_wrapper+0x10/0x10
> [ 1021.892088] do_dentry_open+0x1e5/0x410
> [ 1021.893501] path_openat+0xd7f/0x1220
> [ 1021.894863] ? asm_exc_page_fault+0x22/0x30
> [ 1021.896325] ? trace_hardirqs_on+0x2a/0xe0
> [ 1021.897715] do_filp_open+0xaf/0x160
> [ 1021.898972] do_sys_openat2+0xaf/0x170
> [ 1021.900211] __x64_sys_openat+0x6a/0xa0
> [ 1021.901451] do_syscall_64+0x3a/0x90
> [ 1021.902636] entry_SYSCALL_64_after_hwframe+0x72/0xdc
> [ 1021.904145] RIP: 0033:0x7f58e3ab9e41
> [ 1021.905255] Code: 44 24 18 31 c0 41 83 e2 40 75 3e 89 f0 25 00 00 41 00 3d 00 00 41 00 74 30 89 f2 b8 01 01 00 00 48 89 fe bf 9c ff ff ff 0f 05 <48> 3d 00 f0 ff ff 77 3f 48 8b 54 24 18 64 48 2b 14 25 28 00 00 00
> [ 1021.910033] RSP: 002b:00007ffdfae49b40 EFLAGS: 00000287 ORIG_RAX: 0000000000000101
> [ 1021.913838] RAX: ffffffffffffffda RBX: 000055a0c145bfb1 RCX: 00007f58e3ab9e41
> [ 1021.915613] RDX: 0000000000090800 RSI: 000055a0c1463380 RDI: 00000000ffffff9c
> [ 1021.917358] RBP: 000055a0c146338f R08: 0000000000000001 R09: 000000000000000f
> [ 1021.919110] R10: 0000000000000000 R11: 0000000000000287 R12: 000055a0c1463380
> [ 1021.920864] R13: 000055a0c145bfb1 R14: 0000000000000000 R15: 000055a0c145bfb2
> [ 1021.922613] </TASK>
> [ 1021.923331] Modules linked in: vmw_vsock_virtio_transport vmw_vsock_virtio_transport_common vsock ip_tables
> [ 1021.925741] Dumping ftrace buffer:
> [ 1021.926709] (ftrace buffer empty)
> [ 1021.927754] ---[ end trace 0000000000000000 ]---
> [ 1021.928993] RIP: 0010:dcache_dir_open_wrapper+0x6b/0x1b0
>
> Could you see what happened there?

In some cases, eventfs keeps a file or directory (in the form of an
inode/dentry) within the VFS even after the last dput() has been called. It
is deleted when drop_cache() runs, at which point
eventfs_set_ef_status_free() is called.

The above GPF happens when an entry has been deleted from the eventfs linked
list (removal of dynamic events) but is still present within the VFS (because
of an earlier access), and the VFS then tries to access it and calls
dcache_dir_open_wrapper().
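
As a side check on the oops above: RAX (0x626f727074736574) does not look
like a pointer. Read lowest byte first, as x86-64 stores it, all eight bytes
are printable ASCII, which is what one would expect if the list node had been
freed and its memory reused for a string. A minimal userspace sketch of that
decoding (reg_to_ascii() is a hypothetical helper, not part of this series):

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Copy the 8 bytes of a 64-bit register value into 'out', lowest byte
 * first (x86-64 is little-endian), and NUL-terminate the result.
 * Non-printable bytes are shown as '.'.
 */
static void reg_to_ascii(uint64_t reg, char out[9])
{
	for (size_t i = 0; i < 8; i++) {
		unsigned char c = (unsigned char)(reg >> (8 * i));

		out[i] = (c >= 0x20 && c < 0x7f) ? (char)c : '.';
	}
	out[8] = '\0';
}
```

Calling reg_to_ascii(0x626f727074736574ULL, buf) leaves "testprob" in buf.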

Solution: when deleting an entry from the eventfs linked list, eventfs needs
to detect whether it is still present in the VFS and delete it from there as
well.
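
The idea can be sketched in a small userspace model. This is not the v2
code: struct model_entry, struct model_dentry, model_remove() and
model_open() are all hypothetical stand-ins, and the real fix has to go
through the dcache and its locking. The sketch only shows the ordering:
invalidate whatever the cache still holds before freeing the list entry.

```c
#include <stdbool.h>
#include <stdlib.h>

struct model_entry;

/* Hypothetical stand-in for a dentry that the VFS still has cached. */
struct model_dentry {
	bool valid;              /* false once dropped from the cache */
	struct model_entry *ef;  /* back-pointer to the eventfs meta-data */
};

/* Hypothetical stand-in for one node of the eventfs linked list. */
struct model_entry {
	struct model_entry *next;
	struct model_dentry *dentry;  /* NULL if nothing is cached */
};

/*
 * Unlink 'ef' from the list. Before freeing it, detect a cached dentry
 * and invalidate it so no later open can follow a dangling pointer.
 * Returns true if 'ef' was found and removed.
 */
static bool model_remove(struct model_entry **head, struct model_entry *ef)
{
	for (struct model_entry **pp = head; *pp; pp = &(*pp)->next) {
		if (*pp != ef)
			continue;
		*pp = ef->next;
		if (ef->dentry) {
			ef->dentry->valid = false;
			ef->dentry->ef = NULL;
			ef->dentry = NULL;
		}
		free(ef);
		return true;
	}
	return false;
}

/* Model of the open path: refuse a dentry that has been invalidated. */
static int model_open(const struct model_dentry *d)
{
	return (d && d->valid && d->ef) ? 0 : -1;
}
```

In this model, an open of a cached dentry succeeds before model_remove() and
fails cleanly afterwards, instead of walking freed memory.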

I will fix this in v2. Thanks for reporting this bug.

-Ajay