Intel® FPGA SDK for OpenCL™ Pro Edition: Programming Guide

ID 683846
Date 6/21/2022
Public

5.9.1. Matching Data Layouts of Host and Kernel Structure Data Types

If you use structure data types (struct) as arguments in OpenCL™ kernels, match the member data types and align the data members between the host application and the kernel code.

To match member data types, use the cl_ version of the data type in your host application that corresponds to the data type in the kernel code. The cl_ version of the data type is available in the opencl.h header file. For example, if you have a data member of type float4 in your kernel code, the corresponding data member you declare in the host application is cl_float4.
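
The following minimal sketch illustrates why the fixed-width cl_ types matter. In OpenCL C, int is always 32 bits and long is always 64 bits, whereas the size of long on the host depends on the platform and compiler. The sketch assumes that cl_int and cl_long are 32-bit and 64-bit integers, and it uses the hypothetical stand-ins sketch_cl_int and sketch_cl_long so that it compiles without the OpenCL headers. The struct and function names are also hypothetical. In real host code, include the OpenCL header and use cl_int and cl_long directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for cl_int and cl_long (assumption: the real
 * types are fixed-width 32-bit and 64-bit integers). */
typedef int32_t sketch_cl_int;
typedef int64_t sketch_cl_long;

/* Kernel side (OpenCL C), shown here only as a comment:
 *
 *     struct counters { int hits; long total; };
 */

/* Host-side mirror: each member uses the fixed-width counterpart of the
 * kernel type instead of the host's plain int or long. */
struct host_counters {
    sketch_cl_int  hits;
    sketch_cl_long total;
};

/* Compile-time checks that the host types have the kernel's sizes. */
static_assert(sizeof(sketch_cl_int) == 4, "kernel int is 32 bits");
static_assert(sizeof(sketch_cl_long) == 8, "kernel long is 64 bits");

/* Returns 1 when the host member sizes equal the kernel member sizes. */
int host_sizes_match_kernel(void)
{
    struct host_counters c = {0, 0};
    return sizeof(c.hits) == 4 && sizeof(c.total) == 8;
}
```

Declaring total as a plain host long instead would produce a 4-byte member on platforms where long is 32 bits, and the kernel would then read the wrong bytes.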

Align both the structures and their data members identically in the host application and the kernel code. Manage the alignments carefully because host compilers differ in their default alignment and packing rules.

For example, if you have float4 OpenCL data types in the struct, the alignments of these data items must satisfy the OpenCL specification (that is, 16-byte alignment for float4).
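
As an illustration of how a mismatch can arise, the following sketch compares two host-side declarations of a four-float member. The types naive_float4 and aligned_float4 are hypothetical stand-ins written for this example only (real host code would use cl_float4), and the sketch assumes a C11 host compiler that supports _Alignas.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in without an alignment request: it only has the
 * alignment of float, which is 4 bytes on common host platforms. */
typedef struct { float s[4]; } naive_float4;

/* Hypothetical stand-in that requests the 16-byte alignment that the
 * OpenCL specification requires for float4. */
typedef struct { _Alignas(16) float s[4]; } aligned_float4;

struct naive_host   { char data[3]; naive_float4   f4; };
struct aligned_host { char data[3]; aligned_float4 f4; };

static_assert(_Alignof(aligned_float4) == 16, "float4 needs 16-byte alignment");

/* Byte offset of f4 when the host type carries no alignment request. */
size_t naive_f4_offset(void)
{
    return offsetof(struct naive_host, f4);
}

/* Byte offset of f4 when the host type is 16-byte aligned. */
size_t aligned_f4_offset(void)
{
    return offsetof(struct aligned_host, f4);
}
```

With the unaligned stand-in, the host places f4 closer to the start of the struct than offset 16, so the host and kernel views of the same memory would disagree.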

The following rules apply when the Intel® FPGA SDK for OpenCL™ Offline Compiler compiles your OpenCL kernels:

  1. Alignment of built-in scalar and vector types follows the rules outlined in Section 6.1.5 of the OpenCL Specification version 1.0.

    The offline compiler usually aligns a data type based on its size. However, the compiler aligns a value of a three-element vector the same way it aligns a four-element vector.

  2. An array has the same alignment as one of its elements.
  3. A struct (or a union) has the same alignment as the maximum alignment necessary for any of its data members.

    Consider the following example:

    struct my_struct
    {
        char data[3];
        float4 f4;
        int index;
    };

    The offline compiler aligns the struct above at a 16-byte boundary because of the float4 data member. With the required padding, data starts at offset 0, f4 at offset 16, and index at offset 32. As a result, both data and index also fall on 16-byte boundaries.

  4. The offline compiler does not reorder data members of a struct.
  5. Normally, the offline compiler inserts a minimum amount of data structure padding between data members of a struct to satisfy the alignment requirements for each data member.
    1. In your OpenCL kernel code, you may specify data packing (that is, no insertion of data structure padding) by applying the packed attribute to the struct declaration. If you impose data packing, ensure that the alignment of data members satisfies the OpenCL alignment requirements. The Intel® FPGA SDK for OpenCL™ does not enforce these alignment requirements. Ensure that your host compiler respects the kernel attribute and sets the appropriate alignments.
    2. In your OpenCL kernel code, you may control the data structure padding by applying the aligned(N) attribute to a data member, where N is the required alignment in bytes. The compiler pads the struct so that the member starts on an N-byte boundary. The SDK does not enforce these alignment requirements. Ensure that your host compiler respects the kernel attribute and sets the appropriate alignments.

      For Windows systems, some versions of the Microsoft Visual Studio compiler pack structure data types by default. If you do not want to apply data packing, specify an amount of data structure padding as shown below:

      struct my_struct
      {
          /* Microsoft-specific syntax: align each member to a 16-byte boundary */
          __declspec(align(16)) char data[3];

          /* Note that cl_float4 is the only known float4 definition on the host */
          __declspec(align(16)) cl_float4 f4;

          __declspec(align(16)) int index;
      };
      
      Tip: An alternative way of adding data structure padding is to insert dummy struct members of type char or array of char.
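
The layout rules above can be checked on the host at compile time. The following sketch is one possible way to do so, not the SDK's own method. It assumes a C11 host compiler and uses _Alignas in place of the Microsoft-specific __declspec(align(16)). The type sketch_float4 is a hypothetical stand-in for cl_float4, and the struct names, pad0, pad1, and layouts_match are hypothetical names chosen for this example. The second struct follows the tip about dummy char members.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for cl_float4: four floats, 16-byte aligned. */
typedef struct { _Alignas(16) float s[4]; } sketch_float4;

/* Variant 1: per-member alignment, written with C11 _Alignas. */
struct my_struct_aligned {
    _Alignas(16) char data[3];
    _Alignas(16) sketch_float4 f4;
    _Alignas(16) int index;
};

/* Variant 2: explicit padding with dummy char arrays. This fixes the
 * member offsets and the struct size, but not the alignment of the
 * struct itself, which the host allocation must still guarantee. */
struct my_struct_padded {
    char  data[3];
    char  pad0[13];   /* 3 + 13 = 16, so f4 starts at offset 16 */
    float f4[4];
    int   index;      /* starts at offset 32 */
    char  pad1[12];   /* 36 + 12 = 48, the size of the aligned variant */
};

/* Offsets expected from the kernel-side rules for my_struct. */
static_assert(offsetof(struct my_struct_aligned, data) == 0, "data offset");
static_assert(offsetof(struct my_struct_aligned, f4) == 16, "f4 offset");
static_assert(offsetof(struct my_struct_aligned, index) == 32, "index offset");
static_assert(sizeof(struct my_struct_aligned) == 48, "struct size");

/* Returns 1 when both variants place every member at the same offset
 * and have the same total size. */
int layouts_match(void)
{
    return offsetof(struct my_struct_padded, f4) ==
               offsetof(struct my_struct_aligned, f4)
        && offsetof(struct my_struct_padded, index) ==
               offsetof(struct my_struct_aligned, index)
        && sizeof(struct my_struct_padded) ==
               sizeof(struct my_struct_aligned);
}
```

If a host compiler lays out the struct differently, the static_assert lines fail at compile time, which is easier to diagnose than corrupted kernel arguments at run time.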