Second try.
This pass replaces emulated atomic instructions that write to the same memory location as follows:
Before:
    atomic_add(mem, value);
After:
    rv = reduce_subgroup(value);
    if (sg_local_id == 0)
        atomic_add(mem, rv);
Added new optimization for atomic instructions.
This pass replaces atomic operations that write to the same memory location as follows:
Before:
    atomic_add(mem, value);
After:
    rv = reduce_subgroup(value);
    if (sg_local_id == 0)
        atomic_add(mem, rv);
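
The rewrite above can be illustrated with a small host-side simulation in C. This is only a sketch, not the pass itself: the subgroup is modeled as a loop over lanes, SUBGROUP_SIZE, add_per_lane, add_reduced and check_equivalence are hypothetical names, and reduce_subgroup is a plain sum standing in for the real subgroup reduction. It assumes that every lane targets the same address and that the value returned by atomic_add is not used.

```c
#include <assert.h>
#include <stdatomic.h>

#define SUBGROUP_SIZE 8 /* assumed subgroup width, for illustration only */

/* Before: every lane issues its own atomic add to the same address.
   Returns the number of atomic operations performed. */
static int add_per_lane(atomic_int *mem, const int value[SUBGROUP_SIZE])
{
    int atomics = 0;
    for (int lane = 0; lane < SUBGROUP_SIZE; ++lane) {
        atomic_fetch_add(mem, value[lane]);
        ++atomics;
    }
    return atomics;
}

/* Stand-in for reduce_subgroup(): sums the values held by all lanes. */
static int reduce_subgroup(const int value[SUBGROUP_SIZE])
{
    int sum = 0;
    for (int lane = 0; lane < SUBGROUP_SIZE; ++lane)
        sum += value[lane];
    return sum;
}

/* After: the lanes' values are reduced first, then only lane 0
   issues a single atomic add with the reduced value. */
static int add_reduced(atomic_int *mem, const int value[SUBGROUP_SIZE])
{
    int atomics = 0;
    int rv = reduce_subgroup(value);
    for (int sg_local_id = 0; sg_local_id < SUBGROUP_SIZE; ++sg_local_id) {
        if (sg_local_id == 0) {
            atomic_fetch_add(mem, rv);
            ++atomics;
        }
    }
    return atomics;
}

/* Checks that both forms leave the same value in memory. */
static void check_equivalence(void)
{
    const int value[SUBGROUP_SIZE] = {1, 2, 3, 4, 5, 6, 7, 8};
    atomic_int before, after;
    atomic_init(&before, 0);
    atomic_init(&after, 0);

    int n_before = add_per_lane(&before, value);
    int n_after = add_reduced(&after, value);

    assert(atomic_load(&before) == atomic_load(&after));
    assert(n_before == SUBGROUP_SIZE);
    assert(n_after == 1);
}
```

Calling check_equivalence() from a main function exercises both forms. In this model the final value in memory is identical, while the number of atomic operations drops from one per lane to one per subgroup.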