Hi,
I wonder how reducers work internally. When a value is written to a reducer, does each write block the other threads?
I ask because I normally create a local 'reducer' myself, e.g. a local histogram for an image tile, and when the thread finishes, all of its data is pushed into the global reducer at once. This is similar to how local memory is used in OpenCL.
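
To illustrate the pattern I mean, here is a minimal sketch in plain C++ using std::thread and a mutex rather than any reducer library. The names `Histogram` and `histogram_tiled` are made up for this example, and the "image" is assumed to be a flat array of 8-bit pixels that is split into one contiguous tile per thread.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical type for this sketch: one bin per possible 8-bit pixel value.
using Histogram = std::array<std::uint64_t, 256>;

// Hypothetical helper: each thread fills a private histogram for its tile
// and merges it into the shared histogram once, when the tile is finished.
Histogram histogram_tiled(const std::vector<std::uint8_t>& pixels,
                          std::size_t num_threads) {
    assert(num_threads > 0);

    Histogram global{};       // zero-initialized shared result
    std::mutex global_mutex;  // guards `global` during the merge step only

    // Tile size, rounded up so that every pixel is covered.
    const std::size_t tile = (pixels.size() + num_threads - 1) / num_threads;

    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(t * tile, pixels.size());
        const std::size_t end = std::min(begin + tile, pixels.size());

        workers.emplace_back([&pixels, &global, &global_mutex, begin, end] {
            Histogram local{};  // private to this thread, no locking needed
            for (std::size_t i = begin; i < end; ++i) {
                ++local[pixels[i]];
            }

            // Single synchronized push of the whole local result.
            std::lock_guard<std::mutex> lock(global_mutex);
            for (std::size_t b = 0; b < local.size(); ++b) {
                global[b] += local[b];
            }
        });
    }

    for (auto& w : workers) {
        w.join();
    }
    return global;
}
```

With this approach each thread takes the lock exactly once, regardless of how many pixels its tile contains. What I would like to know is whether a reducer gives me something comparable on its own, or whether every single update is synchronized.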