DPCT1086
Message
__activemask() is migrated to 0xffffffff. You may need to adjust the code.
Detailed Help
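There is currently no functional equivalent of __activemask() in SYCL*. If control flow in your code can make some threads inactive (for example, an early return before a sub-group operation), you need to rewrite the thread logic so that every work-item in the sub-group reaches that operation.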
For example, this original CUDA* code:
__device__ inline int SHFL_SYNC(unsigned mask, int val, unsigned offset,
                                unsigned w = warpSize) {
  return __shfl_down_sync(mask, val, offset, w);
}

__global__ void kernel(int *array) {
  unsigned int tid = threadIdx.x;
  if (tid >= 8)
    return;
  unsigned mask = __activemask();
  array[tid] = SHFL_SYNC(mask, array[tid], 4);
}
results in the following migrated SYCL code:
inline int SHFL_SYNC(unsigned mask, int val, unsigned offset,
                     sycl::nd_item<3> item_ct1, unsigned w = 0) {
  if (!w)
    w = item_ct1.get_sub_group().get_local_range().get(0);
  // This call will wait for all work-items to arrive which will never happen
  // since only work-items with tid < 8 will encounter this call.
  return sycl::shift_group_left(item_ct1.get_sub_group(), val, offset);
}

void kernel(int *array, sycl::nd_item<3> item_ct1) {
  unsigned int tid = item_ct1.get_local_id(2);
  if (tid >= 8)
    return;
  /*
  DPCT1086
  */
  unsigned mask = 0xffffffff;
  array[tid] = SHFL_SYNC(mask, array[tid], 4, item_ct1);
}
which is rewritten to:
// remove mask parameter, as it is not used
inline int SHFL_SYNC(int val, unsigned offset, sycl::nd_item<3> item_ct1,
                     unsigned w = 0) {
  if (!w)
    w = item_ct1.get_sub_group().get_local_range().get(0);
  unsigned int tid = item_ct1.get_local_id(2);
  // Use a temporary variable to save the result of sycl::shift_group_left()
  // to make sure all work-items can encounter this call.
  int v_tmp = sycl::shift_group_left(item_ct1.get_sub_group(), val, offset);
  return (tid < 8) ? v_tmp : val;
}

void kernel(int *array, sycl::nd_item<3> item_ct1) {
  unsigned int tid = item_ct1.get_local_id(2);
  // remove mask parameter, as it is not used
  array[tid] = SHFL_SYNC(array[tid], 4, item_ct1);
}
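The rewrite above follows a "shift in every lane, then select per lane" pattern. The following is a minimal host-side sketch of that pattern in plain C++, intended only to make the data flow easy to inspect without a SYCL compiler. The names model_shift_left and model_shfl_select are hypothetical and are not part of SYCL or the migration tool. The handling of source lanes outside the group (a lane keeps its own value) is a simplifying assumption of this model, not a statement about sycl::shift_group_left().

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side model of a sub-group left shift: lane i receives
// the value held by lane i + offset. Assumption of this model only: when
// i + offset falls outside the group, the lane keeps its own value.
std::vector<int> model_shift_left(const std::vector<int>& vals,
                                  std::size_t offset) {
  assert(!vals.empty());
  std::vector<int> out(vals.size());
  for (std::size_t lane = 0; lane < vals.size(); ++lane) {
    std::size_t src = lane + offset;
    out[lane] = (src < vals.size()) ? vals[src] : vals[lane];
  }
  return out;
}

// Mirrors the rewritten SHFL_SYNC: every lane takes part in the shift, and
// only lanes with an index below active_limit keep the shifted value.
std::vector<int> model_shfl_select(const std::vector<int>& vals,
                                   std::size_t offset,
                                   std::size_t active_limit) {
  std::vector<int> shifted = model_shift_left(vals, offset);
  std::vector<int> out(vals);
  for (std::size_t lane = 0; lane < vals.size() && lane < active_limit; ++lane)
    out[lane] = shifted[lane];
  return out;
}
```

With 16 lanes holding the values 0 through 15, an offset of 4, and an active limit of 8, lanes 0 to 7 end up with 4 through 11 and lanes 8 to 15 keep their original values. This matches the intent of the (tid < 8) ? v_tmp : val selection in the rewritten code.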
Suggestions to Fix
Check if 0xffffffff can be used instead of __activemask(). If it cannot be used, redesign the thread logic.