Re: [RFC PATCH v8 0/5] IPC: checkpoint/restore in userspace enhancements

From: Stanislav Kinsbursky
Date: Wed Jan 09 2013 - 03:24:26 EST


22.12.2012 19:43, Sasha Levin wrote:
On 12/21/2012 04:57 PM, Sasha Levin wrote:
On 12/21/2012 03:46 PM, Stanislav Kinsbursky wrote:
21.12.2012 00:47, Andrew Morton wrote:
On Thu, 20 Dec 2012 08:06:32 +0400
Stanislav Kinsbursky<skinsbursky@xxxxxxxxxxxxx> wrote:

19.12.2012 00:36, Andrew Morton wrote:
On Wed, 24 Oct 2012 19:34:51 +0400
Stanislav Kinsbursky<skinsbursky@xxxxxxxxxxxxx> wrote:

This respin of the patch set was significantly reworked. Most of the new API
was replaced by sysctls (one each for messages, semaphores and shared memory)
that allow presetting the desired id for the next new IPC object.

This patch set aims to provide additional functionality for all IPC
objects, which is required for migration of these objects by user-space
checkpoint/restore utils (CRIU).

The main problem here was that it was impossible to set an object's id. This
patch set solves the problem by adding new sysctls that preset the desired id
for a new IPC object.
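
For illustration, here is a minimal sketch of how a restore tool might use
such a sysctl for message queues. It assumes the interface as later
documented in mainline: the file /proc/sys/kernel/msg_next_id, present only
with CONFIG_CHECKPOINT_RESTORE and writable only by root. The cover letter
above does not name the file, and create_queue_with_id() is a hypothetical
helper, not part of the patch set. The kernel does not guarantee that the
new object gets the requested id, so the caller should compare the result.

```c
#include <stdio.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/*
 * Hypothetical helper: ask the kernel to use wanted_id for the next
 * message queue, then create one.
 * Returns the id of the new queue, or -1 if the sysctl could not be
 * written (no root, no CONFIG_CHECKPOINT_RESTORE) or msgget() failed.
 */
int create_queue_with_id(int wanted_id)
{
    /* Assumed path, taken from the mainline sysctl documentation. */
    FILE *f = fopen("/proc/sys/kernel/msg_next_id", "w");
    if (!f)
        return -1;

    int written = fprintf(f, "%d\n", wanted_id) > 0;

    /* The buffered write reaches the kernel on fclose(), so check it too. */
    if (fclose(f) != 0 || !written)
        return -1;

    /* The next allocation in this IPC namespace picks up the preset id. */
    return msgget(IPC_PRIVATE, IPC_CREAT | 0600);
}
```

A caller would check that the returned id equals wanted_id and remove the
queue with msgctl(id, IPC_RMID, NULL) if it does not.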

Another problem was how to peek at messages in queues without deleting them.
This was achieved by introducing a new MSG_COPY flag for sys_msgrcv(). If the
MSG_COPY flag is set, then msgtyp is interpreted as a message number.
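
A minimal userspace sketch of the peek described above. It assumes the
semantics documented for the flag as merged in mainline, which may differ
from this v8 posting: MSG_COPY must be combined with IPC_NOWAIT, positions
are counted from zero, and the call fails with ENOSYS on kernels built
without CONFIG_CHECKPOINT_RESTORE. The function name msg_copy_demo() and the
struct are made up for the example.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <sys/types.h>

#ifndef MSG_COPY
#define MSG_COPY 040000 /* value from the Linux uapi headers */
#endif

struct demo_msg {
    long mtype;
    char mtext[32];
};

/*
 * Returns 0 if the peek worked and the message stayed in the queue,
 * 1 if a syscall failed (for example, MSG_COPY is not supported here),
 * 2 if the peeked data or the queue length is not what we expect.
 */
int msg_copy_demo(void)
{
    struct demo_msg out = { .mtype = 42, .mtext = "hello" };
    struct demo_msg in;
    struct msqid_ds ds;
    int rc = 1;

    int id = msgget(IPC_PRIVATE, IPC_CREAT | 0600);
    if (id < 0)
        return 1;

    if (msgsnd(id, &out, sizeof(out.mtext), IPC_NOWAIT) < 0)
        goto cleanup;

    memset(&in, 0, sizeof(in));
    /* With MSG_COPY, the msgtyp argument (0) is a position, not a type. */
    if (msgrcv(id, &in, sizeof(in.mtext), 0, MSG_COPY | IPC_NOWAIT) < 0) {
        perror("msgrcv(MSG_COPY)");
        goto cleanup;
    }

    /* The message was copied, not dequeued, so one message must remain. */
    if (msgctl(id, IPC_STAT, &ds) < 0)
        goto cleanup;

    printf("peeked type=%ld text=%s, still queued: %lu\n",
           in.mtype, in.mtext, (unsigned long)ds.msg_qnum);
    rc = (in.mtype == 42 && strcmp(in.mtext, "hello") == 0 &&
          ds.msg_qnum == 1) ? 0 : 2;

cleanup:
    msgctl(id, IPC_RMID, NULL);
    return rc;
}
```

A checkpoint tool would presumably loop over positions 0, 1, 2, ... until
msgrcv() fails with ENOMSG, dumping each message without disturbing the
queue.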
According to my extensive records, Sasha hit a bug in
ipc-message-queue-copy-feature-introduced.patch and Fengguang found a
bug in
ipc-message-queue-copy-feature-introduced-cleanup-do_msgrcv-aroung-msg_copy-feature.patch

It's not obvious (to me) that these things have been identified and
fixed. What's the status, please?
Hello, Andrew.
Fengguang's issue was solved by "ipc: simplify message copying" I sent you.
But I can't find Sasha's issue. As I remember, there was some problem in an
early version of the patch set. But I believe it's fixed now.
http://lkml.indiana.edu/hypermail/linux/kernel/1210.3/01710.html

Subject: "ipc, msgqueue: NULL ptr deref in msgrcv"

Ah, yes. Thanks.
He found it in the initial version of the code, which was significantly changed (cleaned up and simplified) by later patch series.
And I can't figure out how this can happen, because the patch he bisected to does not modify the queue itself, while he found the
problem in testmsg.

I actually can't reproduce it on the latest -next.

I had been reverting the IPC changes for the past couple of weeks so that I could test the
rest of the IPC code with the fuzzer, and when I added them back in again I couldn't
reproduce the issue I reported earlier.

We can probably figure out where it got fixed by bisecting between -next trees if anyone
is interested in that.

Ignore that. It just took more fuzzing to stumble on it again:


Hello, Sasha!
Thanks!
But I still can't understand how this can happen... And I can't reproduce it.
Could you describe your load? I.e. how do you stumble on this panic?
It looks like you don't use the new "copy" feature.

[ 103.164594] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
[ 103.168159] IP: [<ffffffff81937155>] do_msgrcv+0x205/0x540
[ 103.170031] PGD c7cd067 PUD d274067 PMD 0
[ 103.170031] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[ 103.170031] Dumping ftrace buffer:
[ 103.170031] (ftrace buffer empty)
[ 103.170031] CPU 4
[ 103.170031] Pid: 7056, comm: trinity Tainted: G W 3.7.0-next-20121221-sasha-00014-g339890c #229
[ 103.170031] RIP: 0010:[<ffffffff81937155>] [<ffffffff81937155>] do_msgrcv+0x205/0x540
[ 103.170031] RSP: 0018:ffff88000c7cfe88 EFLAGS: 00010246
[ 103.170031] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 103.170031] RDX: ffff880013681f00 RSI: 0000000000000124 RDI: ffff8800075a7810
[ 103.170031] RBP: ffff88000c7cff68 R08: 0000000000000000 R09: 0000000000000000
[ 103.170031] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000002
[ 103.170031] R13: ffff8800075a78c0 R14: 7fffffff00000000 R15: ffff8800075a7810
[ 103.170031] FS: 00007ffa529ae700(0000) GS:ffff880013c00000(0000) knlGS:0000000000000000
[ 103.170031] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 103.170031] CR2: 0000000000000010 CR3: 000000000c7cc000 CR4: 00000000000406e0
[ 103.170031] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 103.170031] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 103.170031] Process trinity (pid: 7056, threadinfo ffff88000c7ce000, task ffff88000c020000)
[ 103.170031] Stack:
[ 103.170031] ffff88000c7cfea8 ffff88000c020000 ffff88000c020000 ffff88000c020000
[ 103.170031] 0000000000000000 ffffffff81935e50 0000000000000008 0000000000000000
[ 103.170031] ffffffff858e91e0 0000000000000000 0000000000001001 ffff88000c020000
[ 103.170031] Call Trace:
[ 103.170031] [<ffffffff81935e50>] ? load_msg+0x170/0x170
[ 103.170031] [<ffffffff8107e8c4>] ? syscall_trace_enter+0x24/0x2e0
[ 103.170031] [<ffffffff81184678>] ? trace_hardirqs_on_caller+0x118/0x140
[ 103.170031] [<ffffffff819374a0>] sys_msgrcv+0x10/0x20
[ 103.170031] [<ffffffff83cdf798>] tracesys+0xe1/0xe6
[ 103.170031] Code: 80 f5 ff ff ff 90 41 83 fc 03 74 32 41 83 fc 04 74 0c 41 83 fc 02 75 2c eb 11 0f 1f 40 00 4c 3b 73 10 7d 20 66 90 e9 94 00 00 00 <4c> 39 73 10 0f 85 8a 00 00 00 90 eb 0c 66 0f 1f 44 00 00 4c 39
[ 103.170031] RIP [<ffffffff81937155>] do_msgrcv+0x205/0x540
[ 103.170031] RSP <ffff88000c7cfe88>
[ 103.170031] CR2: 0000000000000010
[ 103.228270] ---[ end trace ddc37199fdad82b0 ]---


Thanks,
Sasha



--
Best regards,
Stanislav Kinsbursky