Re: [PATCH] lib/vsprintf: add %pT[C012] format specifier

From: Kees Cook
Date: Fri Jan 03 2014 - 12:08:31 EST


On Wed, Jan 1, 2014 at 8:49 PM, Tetsuo Handa
<penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote:
> From: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
> Date: Thu, 2 Jan 2014 13:37:14 +0900
> Subject: [PATCH v3] lib/vsprintf: support built-in tokens
>
> The vast majority of ->comm accesses are accessing current->comm, for
> debug reasons. I'm counting 350-odd sites. At all these sites it's
> pointless passing `current' to the printk function at all!
>
> This patch introduces a set of vsprintf tokens for embedding globally visible
> arguments into the format string, so that we can reduce text size a bit (and
> in the future make reading of task_struct->comm consistent) by stop generating
> the code to pass an argument which the callee already has access to, with an
> assumption that currently we are not using the 0x1A byte within the format
> string.
>
> Some examples are shown below.
>
> pr_info("comm=%s\n", current->comm);
> => pr_info("comm=" EMBED_CURRENT_COMM "\n");
>
> pr_info("pid=%d\n", current->pid);
> => pr_info("pid=" EMBED_CURRENT_PID "\n");
>
> pr_info("%s(%d)\n", current->comm, current->pid);
> => pr_info(EMBED_CURRENT_COMM "(" EMBED_CURRENT_PID ")\n");
>
> pr_info("[%-6.6s]\n", current->comm);
> => pr_info(EMBED_FORMAT "-6.6" EMBED_CURRENT_COMM "\n");
>
> Since this patch does not use get_task_comm(), you should not convert from
> get_task_comm(current) to built-in tokens if you want constsistent reading.
>
> Suggested-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Signed-off-by: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
> ---
> include/linux/kernel.h | 17 ++++++++
> lib/vsprintf.c | 101 +++++++++++++++++++++++++++++++++++++++++++++++-
> 2 files changed, 116 insertions(+), 2 deletions(-)
> [...]
> diff --git a/lib/vsprintf.c b/lib/vsprintf.c
> index 10909c5..b05217d 100644
> @@ -1423,7 +1512,7 @@ int format_decode(const char *fmt, struct printf_spec *spec)
> spec->type = FORMAT_TYPE_NONE;
>
> for (; *fmt ; ++fmt) {
> - if (*fmt == '%')
> + if (*fmt == '%' || *fmt == 0x1A)
> break;
> }
>

I'm on board with the idea of embedding comm/pid/whatever, but I
either missed or do not understand why a second format-start character
is being added. I think this will complicate audits and maybe trigger
weird info leaks (imagine printing a string that was %-escaped, but
not 0x1A-escaped?)

Why not use % followed by 0x1A to be the start code, instead of just 0x1A?

-Kees

--
Kees Cook
Chrome OS Security
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/