Intel® DPC++ Compatibility Tool Developer Guide and Reference
A newer version of this document is available. Customers should click here to go to the newest version.
DPCT1065
Message
Consider replacing sycl::<...>::barrier() with sycl::<...>::barrier(sycl::access::fence_space::local_space) for better performance if there is no access to global memory.
Detailed Help
The function sycl::<...>::barrier() ensures correct memory access ordering in the global and local address space. If the kernel function has no memory accesses in the global memory, it is safe to replace sycl::<...>::barrier() with sycl::<...>::barrier(sycl::access::fence_space::local_space) for better performance.
Suggestions to Fix
Replace sycl::<...>::barrier() with sycl::<...>::barrier(sycl::access::fence_space::local_space).
For example, this original CUDA* code:
struct Data_t {
  float *host_data;
  float *device_data;
};
__global__ void k(Data_t *data) {
  auto tid = threadIdx.x + blockDim.x * blockIdx.x;
  only_read_data(data[tid].device_data);
  __syncthreads();
  only_read_data(data[tid].device_data);
}results in the following migrated SYCL* code:
struct Data_t {
  float *host_data;
  float *device_data;
};
void k(Data_t *data, const sycl::nd_item<3> &item_ct1) {
  auto tid = item_ct1.get_local_id(2) +
             item_ct1.get_local_range(2) * item_ct1.get_group(2);
  only_read_data(data[tid].device_data);
  /*
  DPCT1065:0: Consider replacing sycl::nd_item::barrier() with
  sycl::nd_item::barrier(sycl::access::fence_space::local_space) for better
  performance if there is no access to global memory.
  */
  item_ct1.barrier();
  only_read_data(data[tid].device_data);
}which is rewritten to:
struct Data_t {
  float *host_data;
  float *device_data;
};
void k(Data_t *data, const sycl::nd_item<3> &item_ct1) {
  auto tid = item_ct1.get_local_id(2) +
             item_ct1.get_local_range(2) * item_ct1.get_group(2);
  only_read_data(data[tid].device_data);
  // global_local_space can be replaced with local_space if the access
  // of the global memory after the barrier does not depend on (read-after-write or
  // write-after-read or write-after-write) the access of the same global memory
  // before the barrier among work-items in the current work-group.
  item_ct1.barrier(sycl::access::fence_space::local_space);
  only_read_data(data[tid].device_data);
}