Re: smatch 1.53 released

From: Dan Carpenter
Date: Thu Jun 04 2009 - 15:58:31 EST


On 6/2/09, Andi Kleen <andi@xxxxxxxxxxxxxx> wrote:
> Dan Carpenter <error27@xxxxxxxxx> writes:
>
>> Smatch is a source code checker for C. Right now the focus is on checking
>>
>> for kernel bugs.
>
> Could you give a quick overview on what kinds of bugs it looks for
> and where the limitations are?
>

It's pretty good at finding locking bugs. It also checks for double
kfree() bugs, null pointer dereferences and also if you check for null
instead of checking for PTR_ERR(). There is a check for using kfree()
instead of kfree_skb() as well.

The cool thing about smatch is that it's pretty easy to write custom
checks. It's uses sparse as a C parser so you have to look through
expression.h to figure out how to do the pattern matching. There is a
small example script which shows how that works.
http://repo.or.cz/w/smatch.git?a=blob;f=check_template.c

So you use sparse to grep the code for locking functions and then you
use set_state() to set the state to "locked".

If your code looks like this:
lock_kernel();
if (foo) {
unlock_kernel();
} else {
frob();
}
<-- Here the state can be either "locked" or "unlocked".
Calling get_state() here will tell you that it is state &merged. You
can use get_possible() to get a list of possible states it could be.

Say later code looked like this:
if (!foo) {
<-- Here the state is "locked" because of the !foo. Smatch figures
this out automatically, that comes from the "implications" module. Do
other code checkers do this? I'm pretty proud of the feature either
way. :)
}

So basically you grep for locking functions and you set the state
based on that, then you grep for return statements and check that the
state is correct or print an error. Smatch tracks the code paths in
the background and merges states or sets implied states.

Limitations: The big limitation is that smatch only does one pass
through the code so loops aren't handled correctly. Eventually it
will do two passes.

A lot of null dereference false positives come from places where it's
hard to tell if a loop is true at least once.
x = NULL;
while (param--) {
x = &something;
}
x->member;
Someone reading the code probably knows what param is and that it's
non-zero at the start. There is a "--assume-loops" option to make
smatch assume loops go through once.

Many of the locking false positives come from places where the unlock
happens in a seperate function. It should be relatively straight
forward to make a list functions to say that if frob_the_module()
returns -12, or -14 that implies it unlocked a certain lock. I
haven't done this yet.

Otherwise, it's still very young code. Ideally smatch would know the
possible values of every variable in a function but right now many
variables just default to &undefined. The implication code is not as
good as it could be. Also I don't have a good way to build call trees
yet. There is a lot of work to do at every level.

Still, it doesn't hurt to run smatch on your code before submitting a
patch. There is an easy script for this: kchecker /path/to/code.c.
Some of my accepted kernel patches have had bugs which could have been
caught by the current version of smatch...

regards,
dan carpenter

> Thanks,
>
> -Andi
>
> --
> ak@xxxxxxxxxxxxxxx -- Speaking for myself only.
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/