Intel FPGA SDK for OpenCL Intel Cyclone V SoC Development Kit Reference Platform Porting Guide
Before you begin, Intel® strongly recommends that you familiarize yourself with the contents of the following documents:
- Intel® FPGA SDK for OpenCL™ Intel® Cyclone® V SoC Getting Started Guide
- Intel® FPGA SDK for OpenCL™ Custom Platform Toolkit User Guide
- Cyclone V Hard Processor System Technical Reference Manual
In addition, refer to the Cyclone V SoC Development Kit and SoC Embedded Design Suite page for more information.
Overview of the Cyclone V SoC Development Kit Reference Platform
Intel® FPGA SDK for OpenCL™ support for the Cyclone V SoC Development Kit takes advantage of the following board features to maximize the performance of the Cyclone V SoC FPGA:
- FPGA device that contains the FPGA core logic.
- Hard processor system (HPS) with dual core ARM® Cortex®-A9 CPU.
- Shared physical memory between the CPU and the FPGA core fabric.
Cyclone V SoC Development Kit Reference Platform Board Variants
- c5soc board
This default board provides access to two DDR memory banks. The HPS DDR is accessible by both the FPGA and the CPU. The FPGA DDR is only accessible by the FPGA.
- c5soc_sharedonly board
This board variant contains only HPS DDR connectivity. The FPGA DDR is not accessible. This board variant is more area efficient because less hardware is necessary to support one DDR memory bank. The c5soc_sharedonly board is also a good prototyping platform for a final production board with a single DDR memory bank.
To target this board variant when compiling your OpenCL kernel, include the -board=c5soc_sharedonly option in your aoc command.
For more information about the -board=<board_name> option of the aoc command, refer to the Intel® FPGA SDK for OpenCL™ Programming Guide.
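As a rough illustration of the option described above, the sketch below builds the aoc command line for either board variant. The helper name aoc_command and the kernel file name vector_add.cl are placeholders and are not part of the SDK; only the -board=<board_name> option comes from this guide.

```shell
#!/bin/sh
# Print (do not run) the aoc command line for a given board variant.
# aoc_command is a hypothetical helper; vector_add.cl is a placeholder kernel file.
aoc_command() {
    board="$1"
    kernel="$2"
    case "$board" in
        c5soc|c5soc_sharedonly)
            printf 'aoc -board=%s %s\n' "$board" "$kernel"
            ;;
        *)
            echo "unknown board variant: $board" >&2
            return 1
            ;;
    esac
}

# Example: show the command for the shared-memory-only variant.
aoc_command c5soc_sharedonly vector_add.cl
```

Printing the command first makes it easy to confirm the board name before starting a long compilation.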
Content of the Cyclone V SoC Development Kit Reference Platform
The Cyclone® V SoC Development Kit Reference Platform consists of the following files and directories:
File or Directory | Description |
---|---|
board_env.xml | eXtensible Markup Language (XML) file that describes c5soc to the Intel® FPGA SDK for OpenCL™. |
linux_sd_card_image.tgz | Compressed SD flash card image file that contains everything an SDK user needs to use the Cyclone V SoC Development Kit with the SDK. |
arm32 | Directory that contains the following:
|
c5soc | Directory that contains the hardware template for the board variant that includes two DDR SDRAM. |
c5soc_sharedonly | Directory that contains the hardware template for the board variant that includes one DDR SDRAM. |
driver | Directory that contains the source codes for the Linux kernel driver, and the program and diagnose utilities. |
Relevant Features of the Cyclone V SoC Development Kit
The following list highlights the Cyclone® V SoC Development Kit components and features that are relevant to the Intel® FPGA SDK for OpenCL™:
- Dual-core ARM® Cortex®-A9 CPU running 32-bit Linux.
- Advanced eXtensible Interface (AXI) bus between the HPS and the FPGA core fabric.
- Two hardened DDR memory controllers, each connecting to a 1 gigabyte (GB) DDR3 SDRAM.
- One DDR controller is accessible to the FPGA core only (that is, FPGA DDR).
- The other DDR controller is accessible to both the HPS and the FPGA (that is, HPS DDR). This shared controller allows free memory sharing between the CPU and the FPGA core.
- The CPU can reconfigure the FPGA core fabric.
Cyclone V SoC Development Kit Reference Platform Design Goals and Decisions
Below are the c5soc design goals:
- Provide the highest possible bandwidth between kernels on the FPGA and the DDR memory system(s).
- Ensure that computations on the FPGA (that is, OpenCL™ kernels) do not interfere with other CPU tasks that might include servicing peripherals.
- Leave as many FPGA resources as possible for kernel computations instead of interface components.
Below are the high-level design decisions that are the direct consequences of Intel®'s design goals:
- The Reference Platform only uses hard DDR memory controllers with the widest-possible configuration (256 bits).
- The FPGA communicates with the HPS DDR memory controller directly, without involving the AXI bus and the L3 switch inside the HPS. The direct communication provides the best possible bandwidth to DDR, and keeps FPGA computations from interfering with communications between the CPU and its periphery.
- Scatter-gather direct memory access (SG-DMA) is not part of the FPGA interface logic. Instead of transferring large amounts of data between DDR memory systems, store the data in the shared HPS DDR. Direct access to CPU memory by the FPGA is more efficient than DMA. It saves hardware resources (that is, FPGA area) and simplifies the Linux kernel driver.
Warning: Memory transfer between the shared HPS DDR system and the DDR system that is accessible only to the FPGA is very slow. If you choose to transfer memory in this manner, use it for very small amounts of data only.
- The host and the device perform non-DMA data transfer between each other via the HPS-to-FPGA (H2F) bridge, using only a single 32-bit port. The reason is, without DMA, the Linux kernel can only issue a single 32-bit read or write request, so it is unnecessary to have a wider connection.
- The host sends control signals to the device via a lightweight H2F (LH2F) bridge. Because control signals from the host to the device are low-bandwidth signals, an LH2F bridge is ideal for the task.
Porting the Reference Platform to Your SoC FPGA Board
To port the Cyclone® V SoC Development Kit Reference Platform to your SoC FPGA board, perform the following tasks:
- Select either the one-DDR or the two-DDR version of the c5soc Reference Platform as the starting point of your design.
- Update the pin locations in the INTELFPGAOCLSDKROOT/board/c5soc/<board_variant>/top.qsf file, where INTELFPGAOCLSDKROOT is the path to the location of the Intel® FPGA SDK for OpenCL™ installation, and <board_variant> is the directory name of the board variant. The c5soc_sharedonly directory is for the board variant with one DDR memory system. The c5soc directory is for the board variant with two DDR memory systems.
- Update the DDR settings for the HPS and/or FPGA SDRAM blocks in the INTELFPGAOCLSDKROOT/board/c5soc/<board_variant>/system.qsys file.
- All Intel® FPGA SDK for OpenCL™ preferred board designs must achieve guaranteed timing closure. As such, the placement of the design must be timing clean. To port the c5soc board partition (acl_iface_partition.qxp) to your SoC FPGA board, perform the following tasks:
For detailed instructions on modifying and preserving the board partition, refer to the Intel® Quartus® Prime Incremental Compilation for Hierarchical and Team-Based Design chapter of the Intel® Quartus® Prime Standard Edition Handbook.
- Remove the acl_iface_partition.qxp from the INTELFPGAOCLSDKROOT/board/c5soc/c5soc directory.
- Enable the acl_iface_region Logic Lock region by changing the Tcl command
set_global_assignment -name LL_ENABLED OFF -section_id acl_iface_region
to
set_global_assignment -name LL_ENABLED ON -section_id acl_iface_region
- Compile an OpenCL kernel for your board.
- If necessary, adjust the size and location of the Logic Lock region.
- When you are satisfied that the placement of your design is timing clean, export that partition as the acl_iface_partition.qxp Intel® Quartus® Prime Exported Partition File.
As described in the Establishing Guaranteed Timing Flow section of the Intel® FPGA SDK for OpenCL™ Custom Platform Toolkit User Guide, by importing this .qxp file into the top-level design, you fulfill the requirement of providing a board design with a guaranteed timing closure flow.
For factors that might impact the quality of results (QoR) of your exported partition, refer to the General Quality of Results Considerations for the Exported Board Partition section in the Intel® FPGA SDK for OpenCL™ Custom Platform Toolkit User Guide.
- Disable the acl_iface_region Logic Lock region by reverting the command in Step 2 back to set_global_assignment -name LL_ENABLED OFF -section_id acl_iface_region.
- If your SoC FPGA board uses different pins and peripherals of the HPS block, regenerate the preloader and the device tree source (DTS) file. If you change the HPS DDR memory controller settings, regenerate the preloader.
- Create the SD flash card image.
- Create your Custom Platform, which includes the SD flash card image.
Consider creating a runtime environment version of your Custom Platform for use with the Intel® FPGA Runtime Environment (RTE) for OpenCL. The RTE version of your Custom Platform does not include hardware directories and the SD flash card image. This Custom Platform loads onto the SoC FPGA system to allow host applications to run. In contrast, the SDK version of the Custom Platform is necessary for the SDK to compile OpenCL kernels.
Tip: You may use the SDK version of your Custom Platform for the RTE. To save space, remove the SD flash card image from the RTE version of your Custom Platform.
- Test your Custom Platform.
Refer to the Testing the Hardware Design section of the Intel® FPGA SDK for OpenCL™ Custom Platform Toolkit User Guide for more information.
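The enable and disable steps of the board-partition procedure above both edit one assignment in top.qsf. A minimal sketch of that edit follows, assuming the assignment sits on a single line exactly as shown in this guide. The helper name set_ll_enabled is hypothetical and is not part of the SDK or of the Intel® Quartus® Prime software.

```shell
#!/bin/sh
# Set LL_ENABLED for the acl_iface_region Logic Lock region in a .qsf file.
# set_ll_enabled is a hypothetical helper; it rewrites the file in place.
set_ll_enabled() {
    state="$1"   # ON or OFF
    qsf="$2"     # path to top.qsf
    case "$state" in
        ON|OFF) ;;
        *) echo "state must be ON or OFF" >&2; return 1 ;;
    esac
    tmp="$qsf.tmp.$$"
    # Only the value between LL_ENABLED and -section_id is replaced.
    sed "s/^\(set_global_assignment -name LL_ENABLED \)[A-Z]*\( -section_id acl_iface_region\)[[:space:]]*\$/\1$state\2/" \
        "$qsf" > "$tmp" && mv "$tmp" "$qsf"
}
```

For example, set_ll_enabled ON "$INTELFPGAOCLSDKROOT/board/c5soc/c5soc/top.qsf" before compiling, and the same call with OFF after exporting the partition. Keep a backup of top.qsf, because the sketch does not create one.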
Updating a Ported Reference Platform
- To implement the QXP preservation flow in a Cyclone V SoC FPGA hardware design that is ported from a previous version of c5soc, perform the following steps to create a subpartition to exclude the HPS from the .qxp file:
- Before creating a partition around the nonkernel logic, create a partition around the HPS in the .qsf Intel® Quartus® Prime Settings File.
For example:
# Manually partition the instance that models the HPS-dedicated I/O
set_instance_assignment -name PARTITION_HIERARCHY borde_18261 -to "system:the_system|system_acl_iface:acl_iface|system_acl_iface_hps_0:hps_0|system_acl_iface_hps_0_hps_io:hps_io|system_acl_iface_hps_0_hps_io_border:border" -section_id "system_acl_iface_hps_0_hps_io_border:border"
# Set partition to be an HPS_PARTITION type to be processed correctly by the rest of Quartus
set_global_assignment -name PARTITION_TYPE HPS_PARTITION -section_id "system_acl_iface_hps_0_hps_io_border:border"
Modify the setting accordingly because your design hierarchy might be different from the example.
- When exporting the partition for acl_iface_partition, include the --incremental_compilation_export_flatten=off option to leave the HPS partition as a blackbox.
quartus_cdb top -c top --incremental_compilation_export=acl_iface_partition.qxp --incremental_compilation_export_partition_name=acl_iface_partition --incremental_compilation_export_post_synth=on --incremental_compilation_export_post_fit=on --incremental_compilation_export_routing=on --incremental_compilation_export_flatten=off
After you exclude the HPS from the partition, you may import the .qxp file and compile your design.
- Update the SD flash card image with the current version of the Intel® FPGA RTE for OpenCL by performing the following tasks:
- Mount the file allocation table (fat32) and extended file system (ext3) partitions in the existing image as loop-back devices. For detailed instructions, refer to Step 2 in Building an SD Flash Card Image.
- In the /home/root/opencl_arm32_rte directory, remove the files from the previous version of the RTE.
- Download and unpack the current version of the RTE into the /home/root/opencl_arm32_rte directory.
- In the <path_Custom_Platform>/driver/version.h file of your Custom Platform, update the ACL_DRIVER_VERSION assignment to <SDK_version>.<driver_version> (for example, 16.1.x, where 16.1 is the SDK version, and x is the driver version that you set).
- Rebuild the driver.
- Delete the hardware folder(s) of your Custom Platform. Copy the Custom Platform, along with the updated driver, to the /home/root/opencl_arm32_rte/board directory.
- Copy the Altera.icd file from the /home/root/opencl_arm32_rte directory and add it to the /etc/OpenCL/vendors directory.
- Unmount and test the new image. For detailed instructions, refer to Steps 8 to 11 in Building an SD Flash Card Image.
Software Support for Shared Memory
With respect to the hardware, OpenCL kernels access shared physical memory through direct connection to the HPS DDR hard memory controller. With respect to the software, support for shared physical memory involves the following considerations:
- Typical software implementations for allocating memory on the CPU (for example, the malloc() function) cannot allocate a memory region that the FPGA may use. Memory that the malloc() function allocates is contiguous in the virtual memory address space, but any underlying physical pages are unlikely to be contiguous physically. As such, the host must be able to allocate physically-contiguous memory regions. However, this ability does not exist in user-space applications on Linux. Therefore, the Linux kernel driver must perform the allocation.
- The OpenCL SoC FPGA Linux kernel driver includes the mmap() function to allocate shared physical memory and map it into the user space. The mmap() function uses the standard Linux kernel call dma_alloc_coherent() to request physically-contiguous memory regions for sharing with a device.
- In the default Linux kernel, dma_alloc_coherent() does not allocate physically-contiguous memory more than 0.5 megabytes (MB) in size. To allow dma_alloc_coherent() to allocate large amounts of physically-contiguous memory, enable the contiguous memory allocator (CMA) feature of the Linux kernel and then recompile the Linux kernel.
For the Cyclone V SoC Development Kit Reference Platform, CMA manages 512 MB out of 1 GB of physical memory. You may increase or decrease this value, depending on the amount of shared memory that the application requires. The dma_alloc_coherent() call might not be able to allocate the full 512 MB of physically-contiguous memory; however, it can routinely obtain approximately 450 MB of memory.
- The CPU can cache memory that the dma_alloc_coherent() call allocates. When it does, write operations from the host application can remain in the CPU cache, where they are not visible to the OpenCL kernels. To prevent this, the mmap() function in the OpenCL SoC FPGA Linux kernel driver also contains calls to the pgprot_noncached() or remap_pfn_range() function to disable caching for this region of memory explicitly.
- After the dma_alloc_coherent() function allocates the physically-contiguous memory, the mmap() function returns the virtual address of the beginning of the range, which is the address span of the memory you allocate. The host application requires this virtual address to access the memory. However, the OpenCL kernels require physical addresses. The Linux kernel driver keeps track of the virtual-to-physical address mapping. You can map the virtual addresses that mmap() returns to actual physical addresses by adding a query to the driver.
The aocl_mmd_shared_mem_alloc() MMD application programming interface (API) call incorporates the following queries:
- The mmap() function that allocates memory and returns the virtual address.
- The extra query that maps the returned virtual address to physical address.
The aocl_mmd_shared_mem_alloc() MMD API call then returns two addresses—the actual returned address is the virtual address, and the physical address goes to device_ptr_out.
Note: The driver can only map the virtual addresses that the mmap() function returns to physical addresses. If you request the physical address of any other virtual pointer, the driver returns a NULL value.
With respect to the runtime library, use the clCreateBuffer() call to allocate the shared memory as a device buffer in the following manner:
- For the two-DDR board variant with both shared and nonshared memory, clCreateBuffer() allocates shared memory if you specify the CL_MEM_USE_HOST_PTR flag. Using other flags causes clCreateBuffer() to allocate the buffer in nonshared memory.
- For the one-DDR board variant with only shared memory, clCreateBuffer() allocates shared memory regardless of which flag you specify.
Currently, 32-bit Linux support on ARM® CPU governs the extent of shared memory support in the SDK runtime libraries. In other words, runtime libraries compiled for other environments (for example, x86_64 Linux or 64-bit Windows) do not support shared memory.
C5soc did not implement heterogeneous memory to distinguish between shared and nonshared memory for the following reasons:
- History—Heterogeneous memory support was not available when shared memory support was originally created.
- Uniform interface—Because OpenCL is an open standard, Intel® maintains consistency between heterogeneous computing platform vendors. Therefore, the same interface as other board vendors' architectures is used to allocate and use shared memory.
FPGA Reconfiguration
- To view the status of the FPGA core, invoke the cat /sys/class/fpga/fpga0/status command.
The Intel® FPGA SDK for OpenCL™ program utility available with the Cyclone® V SoC Development Kit Reference Platform uses this interface to program the FPGA. When reprogramming an FPGA core with a running CPU, the program utility performs all of the following tasks:
- Prior to reprogramming, disable all communication bridges between the FPGA and the HPS, both H2F and LH2F bridges. Reenable these bridges after reprogramming completes.
Attention: The OpenCL system does not use the FPGA-to-HPS (F2H) bridge. Refer to the HPS-FPGA Memory-Mapped Interfaces section in the Cyclone V Hard Processor System Technical Reference Manual for more information.
- Ensure that the link between the FPGA and the HPS DDR controller is disabled during reprogramming.
- Ensure that interrupts from the FPGA are disabled during reprogramming. Also, notify the driver to reject any interrupts from the FPGA during reprogramming.
Consult the source code of the program utility for details on the actual implementation.
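The status query mentioned at the start of this section can be wrapped in a small helper, for example to check the FPGA core before and after running the program utility. This is only a sketch: fpga_status is a hypothetical name, and the text the status file contains is not specified in this guide.

```shell
#!/bin/sh
# Report the FPGA core status from the sysfs interface named in this guide.
# fpga_status is a hypothetical helper; the optional argument overrides the path.
fpga_status() {
    status_file="${1:-/sys/class/fpga/fpga0/status}"
    if [ -r "$status_file" ]; then
        cat "$status_file"
    else
        echo "cannot read $status_file" >&2
        return 1
    fi
}

# Typical use on the board:
#   fpga_status
```

The path argument exists mainly so the helper can be exercised on a development machine that has no /sys/class/fpga entry.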
Do not change the configuration of the HPS DDR controller when the CPU is running. Doing so might cause a fatal system error because you might change the DDR controller configuration when there are outstanding memory transactions from the CPU. This means that when the CPU is running, you may not reprogram the FPGA core with an image that uses HPS DDR in a different configuration.
Remember that the OpenCL system, and the Golden Hardware reference design available with the Intel® SoC FPGA Embedded Design Suite (EDS), set the HPS DDR into a single 256-bit mode.
CPU system parts such as the branch predictor or the page table prefetcher might issue DDR commands even when it appears that nothing is running on the CPU. Therefore, boot time is the only safe time to set the HPS DDR controller configuration. This also implies that U-Boot must have a raw binary file (.rbf) image to load into memory. Otherwise, you might be enabling the HPS DDR with unused ports on the FPGA and then potentially changing the port configurations afterwards. For this reason, the OpenCL Linux kernel driver no longer includes the logic necessary to set the HPS DDR controller configuration.
The SW3 dual in-line package (DIP) switches on the Cyclone V SoC Development Kit control the expected form of the .rbf image (that is, whether the file is compressed and/or encrypted). C5soc, and the Golden Hardware Reference Design available with the SoC EDS, include compressed but unencrypted .rbf images. The SW3 DIP switch settings described in the Intel® FPGA SDK for OpenCL™ Cyclone V SoC Getting Started Guide match this .rbf image configuration.
FPGA System Architecture Details
The following FPGA core components are the same in both c5soc and s5_ref:
- VERSION_ID block
- Reset mechanism
- Memory bank divider
- Cache snoop interface
- Kernel clock
- Control register access (CRA) blocks
Building an SD Flash Card Image
Modifying an Existing SD Flash Card Image
The c5soc linux_sd_card_image.tgz image file is available in the INTELFPGAOCLSDKROOT/board/c5soc directory, where INTELFPGAOCLSDKROOT points to the path of the Intel® FPGA SDK for OpenCL™ installation directory.
- To decompress the $INTELFPGAOCLSDKROOT/board/c5soc/linux_sd_card_image.tgz file, run the tar xvfz linux_sd_card_image.tgz command.
- Compile the hello_world OpenCL example design using your Custom Platform support. Rename the .rbf file that the Intel® FPGA SDK for OpenCL™ Offline Compiler generates as opencl.rbf, and place it on the fat32 partition within the SD flash card image.
You can download the hello_world example design from the OpenCL Design Examples page on the Altera website.
- Place the .rbf file into the fat32 partition of the flash card image.
Attention: The fat32 partition must contain both the zImage file and the .rbf file. Without a .rbf file, a fatal error will occur when you insert the driver.
- After you create the SD card image, write it to a micro SD card by invoking the following command:
sudo dd if=/path/to/sdcard/image.bin of=/dev/sdcard
- To test your SD flash card image, perform the following tasks:
- Insert the micro SD flash card into the SoC FPGA board.
- Power up the board.
- Invoke the aocl diagnose utility command.
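Because dd overwrites its target without confirmation, it can help to wrap the write step above in a few checks. The sketch below is one possible wrapper, not part of the SDK: write_sd_image and the DRY_RUN variable are hypothetical names, and the device path is whatever your card reader enumerates as.

```shell
#!/bin/sh
# Write an SD card image to a device, with basic sanity checks first.
# write_sd_image is a hypothetical helper. With DRY_RUN=1 it only prints the
# dd command, which is a convenient way to double-check the target device.
write_sd_image() {
    image="$1"
    device="$2"
    if [ ! -f "$image" ]; then
        echo "image not found: $image" >&2
        return 1
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "sudo dd if=$image of=$device"
        return 0
    fi
    if [ ! -b "$device" ]; then
        echo "not a block device: $device" >&2
        return 1
    fi
    # Flush buffers after the copy so the card is safe to remove.
    sudo dd if="$image" of="$device" && sync
}
```

Running it once with DRY_RUN=1 and comparing the printed device against the output of a tool such as lsblk is a cheap guard against overwriting the wrong disk.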
Creating an SD Flash Card Image
The steps below describe the procedure for creating the linux_sd_card_image.tgz image from the Golden System Reference Design (GSRD) SD flash card image:
- Download and unpack the GSRD SD flash card image version 17.0 from Rocketboards.org.
- Mount the file allocation table (fat32) and extended file system (ext3) partitions in this image as loop-back devices. To mount a partition, perform the following steps:
- Determine the byte start of the partition within the image by invoking the /sbin/fdisk -lu image_file command.
For example, partition number 1 of type W95 FAT has a block offset of 2121728. With 512 bytes per block, the byte offset is 512 bytes x 2121728 = 1086324736 bytes.
- Identify a free loop device (for example, /dev/loop0) by typing the losetup -f command.
- Assuming /dev/loop0 is the free loop device, assign your flash card image to the loop block device by invoking the losetup /dev/loop0 image_file -o 1086324736 command.
- Mount the loop device by invoking the mount /dev/loop0 /media/disk1 command.
Within the image file, /media/disk1 is now a mounted fat32 partition.
- Repeat steps a to d for the ext3 partition.
- Download the Cyclone V SoC FPGA version of the Intel® FPGA Runtime Environment for OpenCL package from the Download Center on the Altera website.
- Click the Download button beside Intel® Quartus® Prime software edition.
- Specify the release version, the operating system, and the download method.
- Click the Additional Software tab, and select to download Intel® FPGA Runtime Environment for OpenCL Linux Cyclone V SoC TGZ.
- After you download the aocl-rte-<version>.arm32.tgz file, unpack it to a directory that you own.
- Place the unpacked aocl-rte-<version>.arm32 directory into the /home/root/opencl_arm32_rte directory on the ext3 partition of the image file.
- Delete the hardware folder(s) of your Custom Platform, and then place the Custom Platform into the board subdirectory of /home/root/opencl_arm32_rte.
- Create the init_opencl.sh file in the /home/root directory with the following content:
export INTELFPGAOCLSDKROOT=/home/root/opencl_arm32_rte
export AOCL_BOARD_PACKAGE_ROOT=$INTELFPGAOCLSDKROOT/board/<board_name>
export PATH=$INTELFPGAOCLSDKROOT/bin:$PATH
export LD_LIBRARY_PATH=$INTELFPGAOCLSDKROOT/host/arm32/lib:$LD_LIBRARY_PATH
insmod $AOCL_BOARD_PACKAGE_ROOT/driver/aclsoc_drv.ko
The SDK user runs the source ./init_opencl.sh command to load the environment variables and the OpenCL Linux kernel driver.
- If you need to update the preloader, the DTS files, or the Linux kernel, you need the arm-linux-gnueabihf-gcc compiler from the SoC EDS. Follow the instructions outlined in the Intel® SoC FPGA Embedded Design Suite User Guide to acquire the software, recompile them, and update the relevant files on the mounted fat32 partition.
Attention: It is most likely that you need to update the preloader if your Custom Platform has different pin usages than those in c5soc.
Remember: If you recompile the Linux kernel, recompile the Linux kernel driver with the same Linux kernel source files. If there is a mismatch between the Linux kernel driver and the Linux kernel, the driver will not load. Also, you must enable the CMA.
Refer to Recompiling the Linux Kernel for more information.
- Compile the hello_world OpenCL example design using your Custom Platform support. Rename the .rbf file that the Intel® FPGA SDK for OpenCL™ Offline Compiler generates as opencl.rbf, and place it on the fat32 partition within the SD flash card image.
You can download the hello_world example design from the OpenCL Design Examples page on the Altera website.
- After you store all the necessary files onto the flash card image, invoke the following commands:
- sync
- umount /media/disk1
- umount <ext3_partition_directory>
where <ext3_partition_directory> is the directory name you use for mounting the ext3 partition in Step 2 (for example, /media/disk2).
- losetup -d /dev/loop0
- losetup -d /dev/loop1
- Compress the SD flash card image by invoking the following command:
tar cvfz <my_linux_sd_card_image>.tgz linux_sd_card_image
- Deliver the <my_linux_sd_card_image>.tgz file inside the root directory of your Custom Platform.
- To test your SD flash card image, perform the following tasks:
- Write the resulting uncompressed image onto a micro SD flash card.
- Insert the micro SD flash card into the SoC FPGA board.
- Power up the board.
- Invoke the aocl diagnose utility command.
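The offset arithmetic and loop-device commands from Step 2 of this procedure can be sketched as follows. The helper names are hypothetical, and the sketch assumes 512-byte blocks, as in the worked example in that step. It only prints the commands so that you can review them before running them as root.

```shell
#!/bin/sh
# Convert a partition's start block (from "fdisk -lu") to a byte offset.
# Assumes 512-byte blocks, as in the example in Step 2.
partition_byte_offset() {
    start_block="$1"
    echo $(( start_block * 512 ))
}

# Print the losetup and mount commands for one partition (hypothetical helper).
# Nothing is executed here.
loop_mount_commands() {
    image="$1"
    start_block="$2"
    loop_dev="$3"
    mount_point="$4"
    offset=$(partition_byte_offset "$start_block")
    echo "losetup -o $offset $loop_dev $image"
    echo "mount $loop_dev $mount_point"
}

partition_byte_offset 2121728   # prints 1086324736
```

Calling loop_mount_commands again with the ext3 partition's start block, a second loop device, and a second mount point gives the commands for the repeat pass.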
Compiling the Linux Kernel for Cyclone V SoC FPGA
Recompiling the Linux Kernel
- Click the GSRD v14.0 - Compiling Linux link on the Resources page of the RocketBoards.org website to access instructions on downloading and rebuilding the Linux kernel source code.
For use with the Intel® FPGA SDK for OpenCL™, specify socfpga-3.13-rel14.0 as the <test_branch_name>.
- Note: The building process creates the arch/arm/configs/socfpga_defconfig file. This file specifies the settings for the socfpga default configuration.
Add the following lines to the bottom of the arch/arm/configs/socfpga_defconfig file.
CONFIG_MEMORY_ISOLATION=y
CONFIG_CMA=y
CONFIG_DMA_CMA=y
CONFIG_CMA_DEBUG=y
CONFIG_CMA_SIZE_MBYTES=512
CONFIG_CMA_SIZE_SEL_MBYTES=y
CONFIG_CMA_ALIGNMENT=8
CONFIG_CMA_AREAS=7
The CONFIG_CMA_SIZE_MBYTES configuration value sets the upper limit on the total amount of physically contiguous memory available. You may increase this value if you require more memory.
Attention: The total amount of physical memory available to the ARM® processor on the SoC FPGA board is 1 GB. Intel® does not recommend that you set the CMA manager close to 1 GB.
- Run the make mrproper command to clean the current configuration.
- Run the make ARCH=arm socfpga_defconfig command.
ARCH=arm indicates that you want to configure the ARM architecture. socfpga_defconfig indicates that you want to use the default socfpga configuration.
- Run the export CROSS_COMPILE=arm-linux-gnueabihf- command.
This command sets the CROSS_COMPILE environment variable to specify the prefix of the desired tool chain.
- Run the make ARCH=arm zImage command. The resulting image is available in the arch/arm/boot/zImage file.
- Place the zImage file into the fat32 partition of the flash card image. For detailed instructions, refer to the Cyclone V SoC FPGA-specific GSRD User Manual on Rocketboards.org.
- Note: To correctly insert the OpenCL Linux kernel driver, first load an SDK-generated .rbf file onto the FPGA.
To create the .rbf file, compile an SDK design example with the Cyclone® V SoC Development Kit Reference Platform as the targeted Custom Platform.
- Place the .rbf file into the fat32 partition of the flash card image.
Attention: The fat32 partition must contain both the zImage file and the .rbf file. Without a .rbf file, a fatal error will occur when you insert the driver.
- Insert the programmed micro SD card, which contains the SD card image you modified or created earlier, into the Cyclone V SoC Development Kit and then power up the SoC FPGA board.
- Verify the version of the installed Linux kernel by running the uname -r command.
- To verify that you enabled the CMA successfully in the kernel, with the SoC FPGA board powered up, run the grep init_cma /proc/kallsyms command.
CMA is enabled if the output is non-empty.
- To use the recompiled Linux kernel with the SDK, compile and install the Linux kernel driver.
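The defconfig edit and the CMA check from the steps above can be scripted. The sketch below is one way to do it, under the assumption that the defconfig file already exists. The helper names append_cma_config and cma_enabled are hypothetical; the option names and the init_cma check are the ones listed in this section.

```shell
#!/bin/sh
# Append the CMA options listed above to a defconfig file, skipping any
# option line that is already present with exactly the same value.
# append_cma_config is a hypothetical helper; the size argument is in MB.
append_cma_config() {
    defconfig="$1"
    size_mb="${2:-512}"
    if [ ! -f "$defconfig" ]; then
        echo "defconfig not found: $defconfig" >&2
        return 1
    fi
    for line in \
        CONFIG_MEMORY_ISOLATION=y \
        CONFIG_CMA=y \
        CONFIG_DMA_CMA=y \
        CONFIG_CMA_DEBUG=y \
        "CONFIG_CMA_SIZE_MBYTES=$size_mb" \
        CONFIG_CMA_SIZE_SEL_MBYTES=y \
        CONFIG_CMA_ALIGNMENT=8 \
        CONFIG_CMA_AREAS=7
    do
        grep -qxF "$line" "$defconfig" || echo "$line" >> "$defconfig"
    done
}

# Report whether a kallsyms listing mentions init_cma (the check used above).
# The optional argument overrides the default /proc/kallsyms path.
cma_enabled() {
    grep -q init_cma "${1:-/proc/kallsyms}"
}
```

On the development machine, run append_cma_config arch/arm/configs/socfpga_defconfig from the kernel source tree before the make steps. On the board, cma_enabled && echo "CMA enabled" performs the same check as the grep command. Note that the sketch does not remove an existing line that sets the same option to a different value.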
Compiling and Installing the OpenCL Linux Kernel Driver
- Download the Cyclone V SoC FPGA version of the Intel® FPGA Runtime Environment for OpenCL package from the Download Center on the Altera website.
- Click the Download button beside Intel® Quartus® Prime software edition.
- Specify the release version, the operating system, and the download method.
- Click the Additional Software tab, and select to download Intel® FPGA Runtime Environment for OpenCL Linux Cyclone V SoC TGZ.
- After you download the aocl-rte-<version>.arm32.tgz file, unpack it to a directory that you own.
The driver source is in the aocl-rte-<version>.arm32/board/c5soc/driver directory.
- To recompile the OpenCL Linux kernel driver, set the KDIR value in the driver's Makefile to the directory containing the Linux kernel source files.
- Run the export CROSS_COMPILE=arm-linux-gnueabihf- command to indicate the prefix of your tool chain.
- Run the make clean command.
- Run the make command to create the aclsoc_drv.ko file.
- Transfer the opencl_arm32_rte directory to the Cyclone V SoC FPGA board.
Running the scp -r <path_to_opencl_arm32_rte> root@your-ip-address:<directory> command places the runtime environment in the /home/root directory.
- Run the init_opencl.sh script that you created when you built the SD card image.
- Invoke the aocl diagnose utility command. The diagnose utility will return a passing result after you run init_opencl.sh successfully.
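If the driver fails to load after these steps, one thing worth checking is whether the module was built against the kernel that is actually running. A minimal sketch of that comparison follows. The helper name vermagic_matches is hypothetical, and the usage line assumes that the modinfo tool is available on the board, which this guide does not state.

```shell
#!/bin/sh
# Compare a module's vermagic string with a kernel release string.
# The first word of vermagic is the kernel release the module was built for.
vermagic_matches() {
    vermagic="$1"
    release="$2"
    [ "${vermagic%% *}" = "$release" ]
}

# Typical use on the board (assumes modinfo is available there):
#   vermagic_matches "$(modinfo -F vermagic aclsoc_drv.ko)" "$(uname -r)" \
#       || echo "driver was built for a different kernel"
```

This is a partial check only: it compares the release field, and the remaining fields of the vermagic string may also need to match before the kernel accepts the module.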
Known Issues
- You cannot override the vendor and board names reported by the CL_DEVICE_VENDOR and CL_DEVICE_NAME strings of the clGetDeviceInfo() call.
- If the host allocates constant memory in the shared DDR system (that is, HPS DDR) and it modifies the constant memory after kernel execution, the data in memory might become outdated. This issue arises because the FPGA core cannot snoop on CPU-to-HPS DDR transactions.
To prevent subsequent kernel executions from accessing outdated data, implement one of the following workarounds:
- Do not modify constant memory after its initialization.
- If you require multiple __constant data sets, create multiple constant memory buffers.
- If available, allocate constant memory in the FPGA DDR on your accelerator board.
- The SDK utility on ARM® only supports the program and diagnose utility commands.
The flash, install and uninstall utility commands are not applicable to the Cyclone V SoC Development Kit for the following reasons:
- The install utility has to compile the aclsoc_drv Linux kernel driver and enable it on the SoC FPGA. The development machine has to perform the compilation; however, it already contains Linux kernel sources for the SoC FPGA. The Linux kernel sources for the development machine are different from those for the SoC FPGA. The location of the Linux kernel sources for the SoC FPGA is likely unknown to the SDK user. Similarly, the uninstall utility is also unavailable to the Cyclone V SoC Development Kit.
Also, delivering aclsoc_drv to the SoC board is challenging because the default distribution of the Cyclone V SoC Development Kit does not contain Linux kernel include files or the GNU Compiler Collection (GCC) compiler.
- The flash utility requires placing a .rbf file of an OpenCL design onto the FAT32 partition of the micro SD flash card. Currently, this partition is not mounted when the SDK user powers up the board. Therefore, the best way to update the partition is to use a flash card reader and the development machine.
- When switching between the Intel® FPGA SDK for OpenCL™ Offline Compiler executable files (.aocx) that correspond to different board variants (that is, c5soc and c5soc_sharedonly), you must use the SDK's program utility to load the .aocx file for the new board variant for the first time. If you simply run the host application using a new board variant but the FPGA contains the image from another board variant, a fatal error might occur.
- The .qxp file does not include the interface partition assignments because the Intel® Quartus® Prime software consistently meets timing requirements of this partition.
- When you power up the board, its media access control (MAC) address is set to a random number. If your LAN policy does not allow this behavior, set the MAC address by performing the following tasks:
- During U-Boot power-up, press any key to enter the U-Boot command prompt.
- Type setenv ethaddr 00:07:ed:00:00:03 at the command prompt.
You may choose any MAC address.
- Type the saveenv command.
- Reboot the board.
Document Revision History
Date | Version | Changes |
---|---|---|
November 2017 | 2017.11.03 | |
May 2017 | 2017.05.08 | |
October 2016 | 2016.10.31 | |
May 2016 | 2016.05.02 | |
November 2015 | 2015.11.02 | |
May 2015 | 15.0.0 | |
December 2014 | 14.1.0 | |
July 2014 | 14.0.0 | |