Re: [PATCH 04/33] Hold threads

From: Pavel Emelyanov
Date: Thu Mar 20 2014 - 15:01:53 EST


On 03/20/2014 01:39 PM, Janani Venkataraman wrote:
> Getting number of threads and their respective IDs through /proc/pid/stat and
> /proc/pid/task.
>
> The threads are then seized and interrupted. After the dump is taken they are
> detached.
>
> Signed-off-by: Janani Venkataraman <jananive@xxxxxxxxxxxxxxxxxx>
> ---

> +/* Gets the Thread IDS and siezes them */
> +int seize_threads(int pid)
> +{
> + char filename[40];
> + DIR *dir;
> + int ct = 0, ret = 0, tmp_tid;
> + struct dirent *entry;
> + char state;
> +
> + ret = get_thread_count(pid);
> + if (ret == -1)
> + return -1;
> +
> + cp.thread_count = ret;
> + cp.t_id = calloc(cp.thread_count, sizeof(int));
> + if (!cp.t_id) {
> + status = errno;
> + gencore_log("Could not allocate memory for thread_ids.\n");
> + return -1;
> + }
> +
> + snprintf(filename, 40, "/proc/%d/task", pid);
> + dir = opendir(filename);
> +
> + while ((entry = readdir(dir))) {

This simple loop is not enough: threads may appear and disappear while
you do the readdir and seize, so you should scan the directory several
times to make sure you have caught all the threads.

You can look at how this is done in CRIU, in cr-dump.c:collect_threads().
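
To illustrate the rescan idea, here is a minimal sketch. It is not the
CRIU code and not a drop-in replacement for seize_threads(). The names
tid_set, collect_threads_sketch(), seize_and_interrupt() and record_only()
are hypothetical, made up for this example. It assumes the directory is
rescanned until one full pass turns up no thread that has not already been
recorded, and that ESRCH from ptrace() means the thread exited in the
meantime.

```c
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ptrace.h>
#include <sys/types.h>

/* Hypothetical growable list of thread IDs (not part of the patch). */
struct tid_set {
	pid_t *tids;
	size_t count;
	size_t cap;
};

int tid_set_contains(const struct tid_set *s, pid_t tid)
{
	for (size_t i = 0; i < s->count; i++)
		if (s->tids[i] == tid)
			return 1;
	return 0;
}

int tid_set_add(struct tid_set *s, pid_t tid)
{
	if (s->count == s->cap) {
		size_t cap = s->cap ? s->cap * 2 : 16;
		pid_t *p = realloc(s->tids, cap * sizeof(*p));

		if (!p)
			return -1;
		s->tids = p;
		s->cap = cap;
	}
	s->tids[s->count++] = tid;
	return 0;
}

/* Returns 0 on success, 1 if the thread is already gone, -1 on error. */
int seize_and_interrupt(pid_t tid)
{
	if (ptrace(PTRACE_SEIZE, tid, NULL, NULL) == -1)
		return errno == ESRCH ? 1 : -1;
	if (ptrace(PTRACE_INTERRUPT, tid, NULL, NULL) == -1)
		return errno == ESRCH ? 1 : -1;
	return 0;
}

/* Stand-in hook that only records the TID, for trying the loop unprivileged. */
int record_only(pid_t tid)
{
	(void)tid;
	return 0;
}

/*
 * Rescan /proc/<pid>/task until a full pass adds no new thread.
 * seize_one() uses the return convention of seize_and_interrupt().
 */
int collect_threads_sketch(pid_t pid, struct tid_set *s,
			   int (*seize_one)(pid_t tid))
{
	char path[64];
	int found_new;

	snprintf(path, sizeof(path), "/proc/%d/task", (int)pid);
	do {
		DIR *dir = opendir(path);
		struct dirent *de;

		if (!dir)
			return -1;
		found_new = 0;
		while ((de = readdir(dir))) {
			char *end;
			long tid = strtol(de->d_name, &end, 10);
			int ret;

			/* Skip ".", ".." and anything that is not a number. */
			if (end == de->d_name || *end != '\0' || tid <= 0)
				continue;
			if (tid_set_contains(s, (pid_t)tid))
				continue;
			ret = seize_one((pid_t)tid);
			if (ret > 0)
				continue;	/* exited before we got to it */
			if (ret < 0 || tid_set_add(s, (pid_t)tid) < 0) {
				closedir(dir);
				return -1;
			}
			found_new = 1;
		}
		closedir(dir);
	} while (found_new);
	return 0;
}
```

A caller would start with a zeroed struct tid_set and pass
seize_and_interrupt as the hook. Growing the array on demand also avoids
the fixed-size cp.t_id overflow that the comment in the patch mentions.
The sketch leaves out waiting for each thread to actually stop after
PTRACE_INTERRUPT, as well as detaching already-seized threads on error.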

> + if (entry->d_type == DT_DIR && entry->d_name[0] != '.') {
> + tmp_tid = atoi(entry->d_name);
> + ret = ptrace(PTRACE_SEIZE, tmp_tid, 0, 0);
> + if (ret) {
> + state = get_thread_status(tmp_tid);
> + if (state == 'Z')
> + goto assign;
> + status = errno;
> + gencore_log("Could not seize thread: %d\n",
> + tmp_tid);
> + break;
> + }
> + ret = ptrace(PTRACE_INTERRUPT, tmp_tid, 0, 0);
> + if (ret) {
> + state = get_thread_status(tmp_tid);
> + if (state == 'Z')
> + goto assign;
> + status = errno;
> + gencore_log("Could not interrupt thread: %d\n",
> + tmp_tid);
> + break;
> + }
> +assign:
> + /* If a new thread, is created after we fetch the thread_count,
> + * we may encounter a buffer overflow situation in the cp_tid.
> + * Hence we check this case and re-allocate memory if required.
> + */
> + cp.t_id[ct++] = tmp_tid;
> + }
> + }
> +
> + /* Reassigning based on successful seizes */
> + cp.thread_count = ct;
> +
> + closedir(dir);
> +
> + /* Successful seize and interrupt on all threads makes ret = 0 */
> + return ret;
> +}