Channel: Intel® Cilk™ Plus

Why does the number of available workers change the execution time of a program with a single cilk_spawn?


While optimizing matrix manipulation code in C, I used Cilk Plus to run two data-independent, fairly compute-intensive functions in parallel. cilk_spawn is used in only one place in the code, as follows:

//(test_function declarations)

cilk_spawn highPrep(d, x, half);

d = temp_0;
r = malloc(sizeof(int)*(half));
temp_1 = r;
x = x_alloc + F_EXTPAD;
lowPrep(r, d, x, half);

cilk_sync;

//test_function return

According to the documentation I have read so far, cilk_spawn is expected to take the highPrep() function and execute it on a different hardware thread if one is available (only "maybe", since Cilk Plus does not enforce parallelism). Meanwhile, the calling thread continues executing the rest of the code, including the function lowPrep(), until cilk_sync is reached. At that point the threads synchronize before execution proceeds.
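To make the spawn, continuation, and sync steps concrete, here is a minimal, self-contained sketch of the same pattern. The names `high_prep`, `low_prep`, and `prep_both` and their bodies are hypothetical stand-ins, since the real highPrep() and lowPrep() are not shown. The sketch also assumes the compiler defines `__cilk` when Cilk Plus is enabled; without it, the keywords are defined away and the code runs serially.

```c
#include <stdlib.h>

#ifdef __cilk
#include <cilk/cilk.h>
#else
/* Assumption: no Cilk Plus support. Define the keywords away so the
 * same source compiles and runs serially. */
#define cilk_spawn
#define cilk_sync
#endif

/* Hypothetical stand-in for highPrep(): pairwise differences of x. */
static void high_prep(int *d, const int *x, int half) {
    for (int i = 0; i < half; i++)
        d[i] = x[2 * i + 1] - x[2 * i];
}

/* Hypothetical stand-in for lowPrep(): pairwise sums of x. */
static void low_prep(int *r, const int *x, int half) {
    for (int i = 0; i < half; i++)
        r[i] = x[2 * i + 1] + x[2 * i];
}

/* Fills two freshly allocated arrays of length half from x, which must
 * hold 2 * half elements. Returns 0 on success, -1 on bad input or
 * allocation failure. The caller frees *d_out and *r_out. */
int prep_both(const int *x, int half, int **d_out, int **r_out) {
    if (half <= 0)
        return -1;

    int *d = malloc(sizeof(int) * (size_t)half);
    int *r = malloc(sizeof(int) * (size_t)half);
    if (d == NULL || r == NULL) {
        free(d);
        free(r);
        return -1;
    }

    /* The spawned call is allowed, but not required, to run in
     * parallel with the code that follows it. */
    cilk_spawn high_prep(d, x, half);

    /* Continuation: writes only to r, so it is independent of high_prep. */
    low_prep(r, x, half);

    /* Both results are complete once execution passes this point. */
    cilk_sync;

    *d_out = d;
    *r_out = r;
    return 0;
}
```

Because the two functions write to separate output arrays and only read the shared input, the result is the same whether or not the spawn actually runs in parallel.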

The tests are run on a Xeon E5-2680 dedicated to these experiments. When I set the environment variable CILK_NWORKERS to values such as 2, 4, 8, and 16, the execution time of test_function increases as the number of available workers grows beyond 2.
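A measurement of this kind can be sketched as follows. This is not the harness actually used for the numbers above: `worker_count`, `min_time_seconds`, and `example_workload` are hypothetical helpers introduced for illustration. The sketch assumes a POSIX system for `clock_gettime`, and it calls `__cilkrts_get_nworkers()` from the Cilk Plus runtime API only when the compiler defines `__cilk`.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L /* exposes clock_gettime under strict C modes */
#endif
#include <time.h>

#ifdef __cilk
#include <cilk/cilk_api.h>
#endif

/* Number of workers the Cilk runtime reports (it honors CILK_NWORKERS).
 * Returns 1 when built without Cilk Plus. */
int worker_count(void) {
#ifdef __cilk
    return __cilkrts_get_nworkers();
#else
    return 1;
#endif
}

/* Current monotonic time in seconds. */
static double now_seconds(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Calls fn(arg) reps times and returns the fastest wall-clock time in
 * seconds, or -1.0 if reps < 1. Taking the minimum reduces the effect
 * of one-off noise such as runtime startup. */
double min_time_seconds(void (*fn)(void *), void *arg, int reps) {
    double best = -1.0;
    for (int i = 0; i < reps; i++) {
        double t0 = now_seconds();
        fn(arg);
        double dt = now_seconds() - t0;
        if (best < 0.0 || dt < best)
            best = dt;
    }
    return best;
}

/* Hypothetical workload standing in for test_function: adds the
 * integers 0..999999 to the accumulator that arg points to. */
void example_workload(void *arg) {
    volatile long long *acc = arg;
    for (long long i = 0; i < 1000000; i++)
        *acc += i;
}
```

Running the same binary with CILK_NWORKERS set to 1, 2, 4, 8, and 16, and printing `worker_count()` next to `min_time_seconds(...)` for each run, would confirm that the runtime picked up the setting and show how the timing changes with it.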

I would expect the number of available threads not to change anything in the execution of this code. If 2 threads are available, I would expect highPrep to be executed on a thread other than the main one, and any additional threads to remain idle.

Could anyone help me understand what is going wrong here?

Thank you in advance.

