Visible to Intel only — GUID: GUID-F551B918-87BF-48F3-8A38-4B07165E297E
Why is FPGA Compilation Different?
Types of SYCL* FPGA Compilation
FPGA Compilation Flags
Emulate and Debug Your Design
Evaluate Your Kernel Through Simulation
Device Selectors for FPGA
FPGA IP Authoring Flow
Fast Recompile for FPGA
Generate Multiple FPGA Images (Linux only)
FPGA BSPs and Boards
Targeting Multiple Homogeneous FPGA Devices
Targeting Multiple Platforms
FPGA-CPU Interaction
FPGA Performance Optimization
Use of RTL Libraries for FPGA
Use SYCL Shared Library With Third-Party Applications
FPGA Workflows in IDEs
Intel oneAPI DPC++ Library (oneDPL)
Intel oneAPI Math Kernel Library (oneMKL)
Intel oneAPI Threading Building Blocks (oneTBB)
Intel oneAPI Data Analytics Library (oneDAL)
Intel oneAPI Collective Communications Library (oneCCL)
Intel oneAPI Deep Neural Network Library (oneDNN)
Intel oneAPI Video Processing Library (oneVPL)
Other Libraries
Visible to Intel only — GUID: GUID-F551B918-87BF-48F3-8A38-4B07165E297E
Suggested Coding Styles
For creating your IP, use one of the following recommended general coding styles:
Lambda Coding Style Example: The lambda coding style is typically used in most full-system SYCL programs.
Functor Coding Style Example: You can write your IP component (kernel) code out-of-line from the host code with the functor coding style.
Lambda Coding Style Example
#include <sycl/sycl.hpp> #include <iostream> #include <sycl/ext/intel/fpga_extensions.hpp> #include <vector> using namespace sycl; // Forward declare the kernel name in the global scope. // This is an FPGA best practice that reduces name mangling in the // optimization reports. class SimpleVAdd; #define VECT_SIZE 4 int main() { #ifdef FPGA_EMULATOR sycl::ext::intel::fpga_emulator_selector my_selector; #else sycl::ext::intel::fpga_selector my_selector; #endif queue q(my_selector); int count = VECT_SIZE; // pass array size by value // declare arrays and fill them std::vector<int> VA; std::vector<int> VB; std::vector<int> VC(count); for (int i = 0; i < count; i++) { VA.push_back(i); VB.push_back(count - i); } std::cout << "add two vectors of size " << count << std::endl; // Copy the input arrays into the USM so the kernel can see them int *A = malloc_shared<int>(count, q); int *B = malloc_shared<int>(count, q); int *C = malloc_shared<int>(count, q); std::copy_n(VA.begin(), count, A); std::copy_n(VB.begin(), count, B); // The code inside the lambda expression describes your IP. Inputs // and outputs are inferred from the lambda capture list. q.single_task<SimpleVAdd>([=]() [[intel::kernel_args_restrict]] { [[intel::speculated_iterations(0)]] [[intel::initiation_interval(1)]] for (int i = 0; i < count; i++) { C[i] = A[i] + B[i]; } }) .wait(); // Copy the result back to host memory std::copy_n(C, count, VC.begin()); free(A, q); free(B, q); free(C, q); // verify that VC is correct bool passed = true; for (int i = 0; i < count; i++) { int expected = VA[i] + VB[i]; std::cout << "idx=" << i << ": result " << VC[i] << ", expected (" << expected << ") VA=" << VA[i] << " + VB=" << VB[i] << std::endl; if (VC[i] != expected) { passed = false; } } std::cout << (passed ? "PASSED" : "FAILED") << std::endl; return passed ? EXIT_SUCCESS : EXIT_FAILURE; }
Functor Coding Style Example
With this style, you can specify all the interfaces in one location and make a call to your IP component from your SYCL* host program.
#include <sycl/sycl.hpp> #include <iostream> #include <sycl/ext/intel/fpga_extensions.hpp> #include <vector> using namespace sycl; // Forward declare the kernel name in the global scope. // This is an FPGA best practice that reduces name mangling in the // optimization reports. class SimpleVAdd; // The members of the functor serve as inputs and outputs to your IP. // The code inside the operator()() function describes your IP. class SimpleVAddKernel { int *A, *B, *C; int count; public: SimpleVAddKernel(int *A_in, int *B_in, int *C_out, int count_in) : A(A_in), B(B_in), C(C_out), count(count_in) {} void operator()() const { [[intel::speculated_iterations(0)]] [[intel::initiation_interval(1)]] for (int i = 0; i < count; i++) { C[i] = A[i] + B[i]; } } }; #define VECT_SIZE 4 int main() { #ifdef FPGA_EMULATOR sycl::ext::intel::fpga_emulator_selector my_selector; #else sycl::ext::intel::fpga_selector my_selector; #endif queue q(my_selector); int count = VECT_SIZE; // pass array size by value // declare arrays and fill them std::vector<int> VA; std::vector<int> VB; std::vector<int> VC(count); for (int i = 0; i < count; i++) { VA.push_back(i); VB.push_back(count - i); } std::cout << "add two vectors of size " << count << std::endl; // Copy the input arrays into the USM so the kernel can see them int *A = malloc_shared<int>(count, q); int *B = malloc_shared<int>(count, q); int *C = malloc_shared<int>(count, q); std::copy_n(VA.begin(), count, A); std::copy_n(VB.begin(), count, B); q.single_task<SimpleVAdd>(SimpleVAddKernel{A, B, C, count}).wait(); // Copy the result back to host memory std::copy_n(C, count, VC.begin()); free(A, q); free(B, q); free(C, q); // verify that VC is correct bool passed = true; for (int i = 0; i < count; i++) { int expected = VA[i] + VB[i]; std::cout << "idx=" << i << ": result " << VC[i] << ", expected (" << expected << ") VA=" << VA[i] << " + VB=" << VB[i] << std::endl; if (VC[i] != expected) { passed = false; } } std::cout << (passed ? "PASSED" : "FAILED") << std::endl; return passed ? EXIT_SUCCESS : EXIT_FAILURE; }