This is not really a Cilk question, but I will try anyway because I think you at Intel must have a lot of experience with this. I have read the section at https://software.intel.com/en-us/node/522641 but found it a bit vague.
How do you get reliable timing results when benchmarking Cilk programs? Do you use a particular OS in a particular setup, for instance? My experience on Windows is that times can vary a lot, even for single-threaded runs, when the same program is run at different points in time. A 10% difference can easily be measurement error. Our experience seems to indicate that Linux is not much better.
By the way, I have disabled hyperthreading and am using a server, not a laptop. I am the only user, and I shut down unneeded applications before running my tests. Maybe I should compute averages and variances of the run times, apply statistical tests to the results, and use that to draw conclusions about performance.
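
To make the statistical idea concrete, here is a rough sketch of what I have in mind, written in Python only because it is convenient for the analysis. The function names (`time_runs`, `summarize`, `welch_t`) and the `repeats`/`warmup` parameters are my own placeholders, not anything from the Cilk tools, and `func` stands in for whatever launches the program under test. I picked Welch's t statistic as one possible test for comparing two sets of run times; other tests may be more appropriate.

```python
import math
import statistics
import time


def time_runs(func, repeats=10, warmup=2):
    """Return the wall-clock durations (seconds) of `repeats` calls to func.

    `func` is a hypothetical placeholder for the workload being measured.
    """
    for _ in range(warmup):
        func()  # warm-up calls are executed but not recorded
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Mean, sample standard deviation, minimum, and coefficient of variation."""
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)  # needs at least two samples
    return {"mean": mean, "stdev": stdev, "min": min(samples), "cv": stdev / mean}


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for samples a, b.

    Assumes each sample has at least two values and that the run times are
    not all identical (otherwise the denominator is zero).
    """
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df
```

The idea would be to collect one list of run times per program variant, look at the coefficient of variation to see how noisy the machine is, and then compare the t statistic against the critical value from a t table for the computed degrees of freedom before claiming that one variant is faster. This assumes the run times are roughly independent and not too far from normally distributed, which I am not sure holds in practice.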