Re: [PATCH for 3.2] memcg: do not trap chargers with full callstack on OOM

From: azurIt
Date: Mon Jun 17 2013 - 06:21:44 EST


>Here we go. I hope I didn't screw anything (Johannes might double check)
>because there were quite some changes in the area since 3.2. Nothing
>earth shattering though. Please note that I have only compile tested
>this. Also make sure you remove the previous patches you have from me.


Hi Michal,

unfortunately, it didn't work. Everything was working fine, but the original problem is still occurring. I'm unable to send you stacks or more info because the problem has been taking down the whole server for some time now (I don't know what exactly caused that to start happening, maybe newer versions of 3.2.x). But I'm sure of one thing: when the problem occurs, nothing is able to access the hard drives (every process which tries is frozen until the problem is resolved or the server is rebooted). The problem goes away after killing the processes from the cgroup which caused it, and everything immediately starts to work normally.

I found this out by keeping a terminal open from another server to the one where the problem occurs quite often, and running several apps there (htop, iotop, etc.). When the problem occurred, every app which was not touching the HDD kept working. htop proved to be very useful here because it only reads the proc filesystem and is also able to send KILL signals, so I was able to resolve the problem with it without rebooting the server.

I created a special daemon about a month ago which is able to detect and fix the problem, so I'm not having server outages now. The point was to NOT access anything which is stored on the HDDs: the daemon only reads info from the cgroup filesystem and sends KILL signals to processes. Maybe I could also read the stack files before killing, I will try it.
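
For illustration only, here is a minimal sketch of that approach. It is not the actual daemon. It assumes a cgroup v1 memory controller, assumes that `under_oom 1` in `memory.oom_control` is a usable sign of the stuck state (the real detection criterion may differ), and uses hypothetical function names. It reads only cgroup and proc files, keeps the collected stacks in memory instead of writing them to disk, and reading /proc/<pid>/stack normally needs root.

```python
import os
import signal


def under_oom(cgroup_dir):
    """Return True if memory.oom_control in cgroup_dir reports 'under_oom 1'.

    Assumption: this flag is a good enough indicator of the hang.
    """
    with open(os.path.join(cgroup_dir, "memory.oom_control")) as f:
        for line in f:
            fields = line.split()
            if len(fields) == 2 and fields[0] == "under_oom":
                return fields[1] == "1"
    return False


def cgroup_pids(cgroup_dir):
    """Return the sorted, de-duplicated process IDs listed in cgroup.procs."""
    with open(os.path.join(cgroup_dir, "cgroup.procs")) as f:
        return sorted({int(tok) for tok in f.read().split()})


def read_stack(pid, proc_root="/proc"):
    """Return the kernel stack text for pid, or '' if it cannot be read."""
    try:
        with open(os.path.join(proc_root, str(pid), "stack")) as f:
            return f.read()
    except OSError:
        return ""


def kill_cgroup(cgroup_dir, kill=os.kill, proc_root="/proc"):
    """Collect each process's stack, then send it SIGKILL.

    Returns {pid: stack_text}. Nothing is written to disk here, because
    disk access is exactly what blocks while the problem is active.
    """
    stacks = {}
    for pid in cgroup_pids(cgroup_dir):
        stacks[pid] = read_stack(pid, proc_root)
        try:
            kill(pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the process exited between listing and killing
    return stacks
```

A caller would poll `under_oom()` for each cgroup directory every few seconds and call `kill_cgroup()` when it returns True. The interpreter and script have to be fully loaded before the problem starts, since anything that still needs to be read from disk at that point would hang as well.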

Btw, which vanilla kernel includes this patch?

Thank you, and everyone involved, very much for your time and help.

azur