Generate Multiple FPGA Images (Linux only)
Use this feature of the Intel® oneAPI DPC++/C++ Compiler when you want
to split your FPGA compilation into different FPGA images. This feature
is particularly useful when your design does not fit on a single FPGA.
You can use it to split your very large design into multiple smaller
images, which you can use to partially reconfigure your FPGA device.
You can split your design using one of the following approaches, each
giving you different benefits:
- Dynamic Linking Flow
- Dynamic Loading Flow
Of the two flows, dynamic linking is easier to implement than
dynamic loading. However, dynamic linking can require more memory on the
host because all of the device images must be loaded into memory.
Dynamic loading addresses this limitation but requires
some extra source-level changes. The following comparison table
highlights the differences between the flows:
Feature | Dynamic Linking | Dynamic Loading |
---|---|---|
Can dynamically change FPGA Image at runtime? | Yes | Yes |
Defining the type and number of FPGA images | At compile time | At runtime |
Host-program memory footprint | All FPGA images are stored in memory at runtime. | Only explicitly loaded FPGA images are stored in memory. |
Calling host code | Call function in the dynamic library directly. | Explicitly load the dynamic library and functions to call. |
Dynamic Linking Flow
This flow allows you to split your design into different source files and
map each file to a separate FPGA image. Intel® recommends this flow
for designs with a small number of FPGA images.
To use this flow, perform the following steps:
- Split your source code so that, for each FPGA image you want, you create a separate .cpp file that submits various kernels. Separate the host code into one or more .cpp files that can then interface with functions in the kernel files. Consider that you now have the following three files:
  - main.cpp containing your host code. For example:

        // main.cpp
        #include <sycl/sycl.hpp>
        using namespace sycl;
        // Defined in vector_add.cpp and vector_mul.cpp
        extern "C" void add(queue queueA);
        extern "C" void mul(queue queueA);

        int main() {
          queue queueA;
          add(queueA);
          mul(queueA);
        }

  - vector_add.cpp containing a function that submits the vector_add kernel. For example:

        // vector_add.cpp
        #include <sycl/sycl.hpp>
        using namespace sycl;

        extern "C" {
        void add(queue queueA) {
          queueA.submit(
              // Kernel code
          );
        }
        }

  - vector_mul.cpp containing a function that submits the vector_mul kernel. For example:

        // vector_mul.cpp
        #include <sycl/sycl.hpp>
        using namespace sycl;

        extern "C" {
        void mul(queue queueA) {
          queueA.submit(
              // Kernel code
          );
        }
        }

- Compile the source files using the following commands:

      dpcpp -fPIC -fintelfpga -c vector_add.cpp -o vector_add.o
      dpcpp -fPIC -fintelfpga -c vector_mul.cpp -o vector_mul.o
      # FPGA image compiles take a long time to complete
      dpcpp -fPIC -shared -fintelfpga vector_add.o -o vector_add.so -Xshardware -Xsboard=pac_a10
      dpcpp -fPIC -shared -fintelfpga vector_mul.o -o vector_mul.so -Xshardware -Xsboard=pac_a10
      # Final link step
      dpcpp -o main.exe main.cpp vector_add.so vector_mul.so

With this flow, the long FPGA compile steps are split into separate
commands that you can potentially run on different systems or only when
you change the files.
Dynamic Loading Flow
Use this flow to avoid loading all of the different FPGA
images into memory at once. Similar to the dynamic linking flow, this flow
also requires you to split your code. However, for this flow, you must
load the .so (shared object) files explicitly in the host program. The
advantage of this flow is that you can load large FPGA image files
dynamically as necessary instead of linking all image files at
compile time.
To use this flow, perform the following steps:
- Split your source code in the same manner as done in step 1 of the dynamic linking flow.
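- Modify the main.cpp file to appear as follows:

      // main.cpp
      #include <dlfcn.h>
      #include <sycl/sycl.hpp>
      using namespace sycl;

      // Signature shared by the add and mul functions
      using kernel_fn = void (*)(queue);

      int main() {
        queue queueA;
        bool runAdd, runMul;
        // Assuming runAdd and runMul are set dynamically at runtime
        if (runAdd) {
          auto add_lib = dlopen("./vector_add.so", RTLD_NOW);
          // dlsym returns void*, so cast it to the function type before calling
          auto add = reinterpret_cast<kernel_fn>(dlsym(add_lib, "add"));
          add(queueA);
        }
        if (runMul) {
          auto mul_lib = dlopen("./vector_mul.so", RTLD_NOW);
          auto mul = reinterpret_cast<kernel_fn>(dlsym(mul_lib, "mul"));
          mul(queueA);
        }
      }
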
- Compile the source files using the following commands. You do not have to link the .so files at compile time since they are loaded dynamically at runtime.

      dpcpp -fPIC -fintelfpga -c vector_add.cpp -o vector_add.o
      dpcpp -fPIC -fintelfpga -c vector_mul.cpp -o vector_mul.o
      # FPGA image compiles take a long time to complete
      dpcpp -fPIC -shared -fintelfpga vector_add.o -o vector_add.so -Xshardware -Xsboard=pac_a10
      dpcpp -fPIC -shared -fintelfpga vector_mul.o -o vector_mul.so -Xshardware -Xsboard=pac_a10
      # Final host build. The .so files are not linked here; main.exe loads them with dlopen at runtime.
      # -ldl is needed for dlopen and dlsym on systems with glibc older than 2.34.
      dpcpp -o main.exe main.cpp -ldl
      # If the host passes a bare file name (no slash) to dlopen, add the directory
      # containing the .so files to LD_LIBRARY_PATH when you run main.exe:
      LD_LIBRARY_PATH=./:$LD_LIBRARY_PATH ./main.exe

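Both dlopen and dlsym return a null pointer on failure, for example when an image file is missing or a function name is misspelled, and calling through a null pointer crashes the host. The following is a minimal sketch of error-checked loading using only the POSIX dlfcn.h interface. The names open_library and load_function are hypothetical helpers introduced for illustration; they are not part of the compiler or of any Intel library.

```cpp
#include <dlfcn.h>

#include <stdexcept>
#include <string>

// Hypothetical helper: opens a shared object and returns its handle.
// Throws std::runtime_error carrying the dlerror() text if the load fails.
inline void* open_library(const std::string& path) {
  void* handle = dlopen(path.c_str(), RTLD_NOW);
  if (handle == nullptr) {
    const char* msg = dlerror();
    throw std::runtime_error("dlopen failed for " + path + ": " +
                             (msg ? msg : "unknown error"));
  }
  return handle;
}

// Hypothetical helper: looks up `name` in an open library and casts the
// result to the function-pointer type Fn. Throws if the symbol is missing.
template <typename Fn>
Fn load_function(void* handle, const std::string& name) {
  dlerror();  // Clear any stale error before the lookup
  void* symbol = dlsym(handle, name.c_str());
  const char* msg = dlerror();
  if (msg != nullptr || symbol == nullptr) {
    throw std::runtime_error("dlsym failed for " + name + ": " +
                             (msg ? msg : "symbol is null"));
  }
  return reinterpret_cast<Fn>(symbol);
}
```

With these helpers, the host could obtain the add function as `load_function<void (*)(queue)>(open_library("./vector_add.so"), "add")`, assuming the image was built as vector_add.so as in the compile commands.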
With the dynamic loading flow, you can load an arbitrary number of .so files at
runtime. This is useful when you have a large library of
FPGA images and you want to select a subset of files from it.
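When the host selects images from a large library, it can help to load each .so file only on first use and to release all of them in one place. The sketch below shows one possible way to do that with a small cache built on dlopen, dlsym, and dlclose. The ImageLibrary class and its member names are hypothetical and shown only for illustration; the sketch is independent of SYCL, so the function-pointer type is supplied by the caller.

```cpp
#include <dlfcn.h>

#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical cache: loads each shared object the first time it is
// requested and closes every loaded library when the cache is destroyed.
class ImageLibrary {
 public:
  ImageLibrary() = default;
  ImageLibrary(const ImageLibrary&) = delete;
  ImageLibrary& operator=(const ImageLibrary&) = delete;

  ~ImageLibrary() {
    for (auto& entry : handles_) dlclose(entry.second);
  }

  // Returns the function `symbol` from the library at `path`, loading the
  // library only if this cache has not loaded it before.
  template <typename Fn>
  Fn get(const std::string& path, const std::string& symbol) {
    auto it = handles_.find(path);
    if (it == handles_.end()) {
      void* handle = dlopen(path.c_str(), RTLD_NOW);
      if (handle == nullptr) throw std::runtime_error("cannot load " + path);
      it = handles_.emplace(path, handle).first;
    }
    void* address = dlsym(it->second, symbol.c_str());
    if (address == nullptr) {
      throw std::runtime_error("cannot find " + symbol + " in " + path);
    }
    return reinterpret_cast<Fn>(address);
  }

  // Number of libraries currently held open by this cache.
  std::size_t loaded_count() const { return handles_.size(); }

 private:
  std::map<std::string, void*> handles_;
};
```

A host program could keep one ImageLibrary object alive for as long as it submits work and call, for example, `images.get<void (*)(queue)>("./vector_mul.so", "mul")` only when the multiply image is needed. Because the destructor unloads the libraries, work submitted through a loaded function should have completed before the cache goes out of scope.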