1. Hierarchical Partial Reconfiguration over PCI Express Reference Design for Intel Arria 10 Devices
The Hierarchical Partial Reconfiguration (HPR) over PCI Express (PCIe) reference
design demonstrates reconfiguring the FPGA fabric through the PCIe link for Intel
Arria® 10 devices.
This reference design runs on a Linux system with the Intel Arria® 10
GX FPGA development board. Adapt this reference design to your requirements by implementing the
PR region logic using the given template. Run your custom design on this fully functional system
that enables communication over PCI Express.
Intel Arria® 10 devices use the PR over PCI Express
solution to reconfigure the device, rather than
Configuration via Protocol (CvP) update. Partial reconfiguration allows you to reconfigure a
portion of the FPGA dynamically, while the remaining FPGA design continues to function. Create
multiple personas for a particular region in your design, without impacting operation in areas
outside this region. Partial reconfiguration enables the implementation of more complex FPGA
systems. You can also include multiple parent and child
partitions, or create multiple levels of partitions in your design. This hierarchical partial
reconfiguration (HPR) flow includes a static region that instantiates the parent PR region,
and the parent PR region instantiating the corresponding child PR region. You can perform the
same PR region reprogramming for either the child or the parent partition. Reprogramming a
child PR region does not affect the parent or the static region. Reprogramming the parent
region reprograms the associated child region with the default child persona, without
affecting the static region.
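The reprogramming rules above can be sketched as a small state model. This is only an illustration of the behavior the text describes, not part of the reference design: the class, the region names (child_0, child_1), and the persona names are hypothetical placeholders.

```python
from dataclasses import dataclass, field

# Hypothetical placeholder name for the default child persona.
DEFAULT_CHILD_PERSONA = "default_child"


def _default_children():
    # The reference design has two child PR regions in the parent PR region.
    return {"child_0": DEFAULT_CHILD_PERSONA, "child_1": DEFAULT_CHILD_PERSONA}


@dataclass
class HprDesign:
    """Toy model of which persona occupies each region."""
    static: str = "static_region"
    parent: str = "parent_default"
    children: dict = field(default_factory=_default_children)

    def reprogram_child(self, region: str, persona: str) -> None:
        # Only the named child region changes; parent and static are untouched.
        if region not in self.children:
            raise KeyError(f"unknown child region: {region}")
        self.children[region] = persona

    def reprogram_parent(self, persona: str) -> None:
        # Reprogramming the parent also returns every child region to the
        # default child persona; the static region is untouched.
        self.parent = persona
        for region in self.children:
            self.children[region] = DEFAULT_CHILD_PERSONA


design = HprDesign()
design.reprogram_child("child_0", "adder")
design.reprogram_parent("parent_v2")
print(design.children)
```

After the parent is reprogrammed, child_0 no longer holds the adder persona, which mirrors the statement that a parent reprogram restores the default child persona.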
Partial reconfiguration provides the following advancements to a flat design:
Allows run-time design reconfiguration
Increases scalability of the design through time-multiplexing
Lowers cost and power consumption through efficient use of board space
Supports dynamic time-multiplexing functions in the design
Improves initial programming time through smaller bitstreams
Reduces system down-time through line upgrades
Enables easy system update by allowing remote hardware change
a10_pcie_reference_design: top-level reference design wrapper connecting the
board support package (BSP) subsystem to the device pins.
bsp_top: design top level containing all subsystems. Includes the Intel
Arria® 10/Cyclone 10 Hard IP for PCI Express, the External Memory Interfaces
Intel® FPGA IP, and the design top module. This abstraction layer
allows simulation of the design top module through simulated
Avalon® memory-mapped transactions.
design_core: design core that handles generation of
the PR region and of the interface components, such as clock-crossing Avalon memory-mapped
logic and pipeline logic, the clocks, and the global reset.
The reference design creates a separate clock with the IOPLL
Intel® FPGA IP. This clock decouples the PR logic
clocking from both the PCI Express clocking domain, which
runs at 250 MHz, and the external memory interface (EMIF) clocking domain, which runs
at 330 MHz. The clock for PR logic is set at 250 MHz. To ease timing closure,
modify the parameterization of the IOPLL IP core to a lower clock frequency.
1.1.2. Memory Address Mapping
The Intel Arria 10/Cyclone 10 Hard IP for PCI Express IP connects to
the design core through two Avalon® memory-mapped master interfaces. These
interfaces correspond to two base address registers (BARs),
BAR 2 and BAR 4. BAR 2 connects the PR driver to the following components:
The Partial Reconfiguration Controller Intel Arria® 10/Cyclone 10 GX FPGA IP
The system description ROM
BAR 4 connects to the following components:
The freeze bridges
The Partial Reconfiguration Region Controller Intel® FPGA IP
Up to 8 kilobytes (KB) of memory in the PR region
The following table lists the memory address mapping for the Intel
Arria 10/Cyclone 10 Hard IP for PCI Express IP:
Table 1. Intel Arria 10/Cyclone 10 Hard IP for PCI Express IP Memory
System Description ROM
PR Region Controller
DDR4 Calibration Export
The External Memory Interfaces Intel
Arria® 10 FPGA IP provides status on DDR4 calibration. During
initialization, the External Memory Interfaces Intel
Arria® 10 FPGA IP performs training to reset the DDR4 interface.
The EMIF calibration flag reports the success or failure of the training to the host.
The host takes the necessary action in the event of a DDR4 training failure.
The following table lists the memory address mapping from the External Memory
Interfaces Intel Arria® 10 FPGA IP to the PR logic:
The PR logic accesses the 2 gigabyte (GB) DDR4 memory space using an
Avalon® memory mapped master interface.
The floorplan constraints in your partial
reconfiguration design physically partition the device. This partitioning
ensures that the resources available to the PR region are the same for any persona that you
implement. To maximize the amount of fabric available for the
PR region, the reference design constrains the static region to the smallest
possible area. This reference design contains two child PR regions of the parent PR
region.
Figure 2. Reference Design Floorplan
Note: The child
regions in the parent PR region can be of any size. The small sizing of the child PR
regions in the above figure is for demonstration purposes only.
1.3. Getting Started
This section describes the requirements and the procedure to run the reference
design.
1.3.1. Hardware and Software Requirements
The reference design requires the following hardware and software:
Intel Arria® 10 GX FPGA development board
with connection of the DDR4 module to the Hi-Lo interface
Linux Operating System - kernel version 3.10 or above
Super user access on the host machine
PCI Express slot to plug in the Intel Arria® 10 GX FPGA development board
Open source driver for this PR over PCI Express reference design
Intel Quartus® Prime Pro Edition software
Intel® FPGA Download Cable driver
Validation testing uses CentOS 7 to test the open source driver for this
PR over PCI Express reference design.
The Linux driver accompanying this reference design is not a production
driver. You must adapt this driver based on your design.
1.3.2. Installing the Intel Arria 10 GX FPGA Development Kit
For complete instructions on installing
and powering the Intel
Arria® 10 GX FPGA development board in
your Linux system, refer to the Intel
Arria® 10 FPGA Development Kit User Guide.
Note: Before powering the board, set
the switch 4 (FACTORY) of the DIP switch bank (SW6)
to ON. Setting this switch to ON loads the factory image area of the flash memory
at boot time. Program the reference design into this factory image area. For
complete instructions on flashing the reference design onto the board, refer to
Bringing Up the Reference Design.
To compile all the necessary driver modules, run the make command
in the driver source directory.
To enable verbose messaging, use the option VERBOSE=true with the make
command.
Ensure that the following three kernel object files are
present in the driver source directory after running this command:
To copy the modules to the right location and update the module
dependency database, run the following command:
sudo make install
To deploy an instance of the driver for each Intel FPGA device, run the following command:
sudo modprobe fpga-pcie-mod
To verify successful installation of the driver, run the
following command. On successful installation, the resulting output includes the following lines:
Kernel driver in use: fpga-pcie
Kernel modules: fpga_pcie_mod
Note: The above command functions only after you load the design onto the board
and power-cycle the computer.
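The driver check above can be automated once the command output is captured as text. The following is a minimal sketch that assumes the output contains a "Kernel driver in use:" line like the one shown; the function name and the sample text are illustrative, and how you capture the output (for example with lspci in verbose mode) is left as an assumption.

```python
from typing import Optional


def bound_driver(listing: str) -> Optional[str]:
    """Return the driver named on the 'Kernel driver in use:' line, or None."""
    for line in listing.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key == "Kernel driver in use":
            return value.strip()
    return None


# Sample text assembled from the lines quoted in this document.
sample = (
    "03:00.0 Class ea00: Intel FPGA Device 5052 (rev 01)\n"
    "\tKernel driver in use: fpga-pcie\n"
    "\tKernel modules: fpga_pcie_mod\n"
)

driver = bound_driver(sample)
print("driver bound:", driver)
```

A result of None suggests that the driver is not bound, for example because the design is not yet loaded onto the board or the computer has not been power-cycled.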
Uninstalling the Linux Driver
If you wish to uninstall the Linux driver, follow these steps:
Run the following command:
sudo modprobe -r fpga-pcie-mod
This command stops the driver from executing and deactivates the driver. However, at
this point, the driver still reloads when you reboot your machine.
To permanently delete the driver, run the following commands:
cd /lib/modules/$(uname -r)/extra
rm -rf fpga-pcie-mod.ko
1.4. Reference Design Components
The reference design contains the following design components.
1.4.1. BSP Top
This Platform Designer system contains all the
subsystems of this reference design. The system
comprises the following three main components:
The top-level design
Intel Arria 10/Cyclone 10 Hard IP for PCI Express IP
External Memory Interfaces Intel Arria 10 FPGA IP
The system connects to external pins through the a10_pcie_ref_design.sv wrapper.
1.4.1.1. Intel Arria 10/Cyclone 10 Hard IP for PCI Express IP
Instantiate the Intel Arria 10/Cyclone 10 Hard IP for PCI Express IP as part of
a Platform Designer system. The Intel Arria 10/Cyclone
10 Hard IP for PCI Express IP is Gen3 x8 with a 256-bit interface, running at 250 MHz.
The following table lists the
IP parameters that the reference design sets to values
other than the default settings:
Table 3. Intel Arria 10/Cyclone 10 Hard IP for PCI Express IP
Avalon-MM with DMA
Gen3:x8, Interface: 256-bit, 250 MHz
RX buffer credit
allocation for received requests vs
register access (CRA) Avalon-MM slave port
Base Address Registers
32-bit non-prefetchable memory
Base Address Registers
32-bit non-prefetchable memory
Capabilities - Device
and Extension Options
Arria® 10 GX FPGA Development
far-end TX preset
1.4.1.2. External Memory Interfaces Intel Arria 10 FPGA IP
The ddr4_emif logic includes the External Memory Interfaces Intel
Arria® 10 FPGA IP. This IP
core interfaces to the DDR4 external memory, with a 64-bit interface that runs at
1066.0 MHz. Also, the IP core provides 2 GB of DDR4 SDRAM memory space. The EMIF
Avalon®-MM slave runs at 300 MHz.
The following table lists the External Memory Interfaces Intel
Arria® 10 FPGA IP parameters that differ from the Intel
Arria® 10 GX FPGA Development Kit
with DDR4 HILO preset settings:
This component forms the core of the design, and includes the following:
Partial Reconfiguration Controller Intel Arria® 10/Cyclone 10 GX FPGA IP
Clock crossing and pipelining for Avalon® memory-mapped transactions
System description ROM
1.4.1.3.1. Global Reset Logic
The PLL generates the main clock for this design. All logic, excluding the
PCIe IP, PR IP, and DDR4 EMIF, runs on this 250 MHz
clock. The Intel Arria 10/Cyclone 10 Hard IP for PCI Express IP generates the global
reset, along with the PLL reset signal. On power up, a countdown timer, tcd2um, counts down using the internal 50 MHz
oscillator to produce an 830 μs delay. Until the timer reaches this delay, the PLL is held in
reset, which deasserts the locked signal and freezes the design. Because the
PLL locked signal is ORed with the PCIe
reset, the design is also held in reset. Once
the timer reaches 830 μs, the design functions normally, and enters a known
state.
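As a worked check of the timer figures, the 830 μs delay on the 50 MHz oscillator corresponds to a fixed number of clock cycles. The sketch below only does the arithmetic; the function name is illustrative, and the actual counter width inside tcd2um is not stated in this document.

```python
OSC_HZ = 50_000_000   # internal oscillator frequency from the text
DELAY_US = 830        # power-up delay from the text


def countdown_cycles(osc_hz: int, delay_us: int) -> int:
    """Number of oscillator cycles that make up the delay."""
    return osc_hz * delay_us // 1_000_000


cycles = countdown_cycles(OSC_HZ, DELAY_US)
print(cycles)               # 41500
print(cycles.bit_length())  # 16 (minimum counter width in bits)
```

A counter that covers this delay therefore needs at least 16 bits.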
1.4.1.3.2. PR Region Wrapper
The PR region wrapper contains the PR region controller, freeze bridges, and
the personas. The PR region controller interacts with the driver over the Avalon®
memory-mapped interface to initiate PR. The PR region controller acts as a bridge
to communicate with the PR region when initiating freeze and start requests.
1.4.1.3.3. Parent PR Region
The parent PR region contains two sets of PR region controllers and freeze bridges, because of
the two child personas within the parent PR region. Each set of PR region controller
and freeze bridge is dedicated to one child persona.
1.4.1.3.4. Partial Reconfiguration Region Controller Intel FPGA IP
Use the Partial Reconfiguration Region Controller
Intel® FPGA IP to initiate a freeze request to the PR region. The PR region
finalizes any actions on freeze request acknowledgment. The freeze bridges also
intercept the Avalon memory-mapped interfaces to the PR region, and correctly
respond to any transactions made to the PR region during partial reconfiguration.
Finally, on PR completion, the region controller issues a stop request, allowing the
region to acknowledge and act accordingly. The fpga-region-controller program that this reference design includes
performs these functions.
The reference design configures the Partial Reconfiguration Region Controller
Intel® FPGA IP to operate as an internal host. The
design connects this IP core to the Intel Arria 10/Cyclone 10 Hard IP for PCI
Express, via an instance of the Avalon memory-mapped interface. The Partial
Reconfiguration Controller Intel Arria® 10/Cyclone 10 GX
FPGA IP has a clock-to-data ratio of 1. Therefore, the Partial Reconfiguration
Controller Intel Arria® 10/Cyclone 10 GX FPGA IP cannot
handle encrypted or compressed PR data.
1.4.1.3.5. Partial Reconfiguration Logic
The reference design provides the following personas:
Table 5. Reference Design Personas
Performs a sweep across a memory span, first
writing, and then reading each address.
Provides access to a 27x27 DSP multiplier and demonstrates hardware
Includes a basic 32-bit unsigned adder and demonstrates hardware
Game of Life
Includes an 8x8 Conway's Game of Life and demonstrates hardware
A wrapper that instantiates two child partitions.
The parent persona also connects the two child personas to the
static region with their own PR region controller, BAR freeze
bridge, and DDR4 freeze bridge.
Each persona has an 8-bit persona_id
field in the pr_data register to indicate a unique
identification number. A 32-bit control register and 16 I/O registers follow the
8-bit persona_id. The 16 I/O registers are 32 bits
each, with 8 registers for device inputs and 8 registers for device outputs. Each persona
uses these registers in different ways. For more information, refer to the source
code for each of the personas.
Additionally, the reference design provides a template to implement
your custom persona. This template persona allows you to modify the RTL, create a
wrapper to interface with the register file, compile, and run your design.
1.5. Compiling the Reference Design
Ensure that the project's a10_pcie_devkit_cvp.qsf file includes the following
This assignment imports the .qdb file
representing the reference design static region logic into the subsequent PR
persona implementation compile. Each implementation revision also contains
one or two ENTITY_REBINDING assignments.
This assignment links the hierarchy of the static region and the hierarchy
of the PR persona. For example, a10_pcie_devkit_cvp_ddr4_access.qsf contains the following
assignment imports the .qdb file
representing the HPR parent region into the subsequent HPR child region
compilation. Because the HPR child revisions comprise two child regions, they
contain two ENTITY_REBINDING assignments.
To access the reference design, navigate to the ref_designs sub-folder. Copy the
a10_pcie_devkit_cvp_hpr folder to the home directory in your Linux system.
To bring up the reference design on the board:
Plug the Intel Arria® 10 GX FPGA development board into an available
PCI Express slot in your host machine.
Connect the host machine's ATX auxiliary power
connector to the 12 V ATX input J4 of the development board.
Power up the host machine.
Verify the micro-USB cable connection to the FPGA
development board. Ensure that no other applications that use the JTAG chain
are running.
Navigate to the
a10_pcie_devkit_cvp_hpr/software/installation folder in your
Linux system.
To overwrite the existing factory image on the board with
the reference design, execute the flash.pl script.
Pass the JTAG cable number of your connected device as an argument to the
script (for example, perl flash.pl 1).
Running this script configures the device with the
contents of the flash.pof file.
This Programmer Object File is generated directly from the a10_pcie_devkit_cvp.sof file present in the project
directory. The flash.pof file acts
as the base image for the reference design.
Note: Ensure successful compilation of the design before
running this script.
Power-cycle the host machine.
1.7. Testing the Reference Design
The reference design provides the following utilities for programming the FPGA
The design also includes an example_host_uio
application to communicate with the device and demonstrate each of the personas.
Use the program_fpga_jtag script to
program the entire device (full-chip programming) without requiring a reboot.
program_fpga_jtag performs the following functions:
Uses the Intel Quartus® Prime Programmer to
program the device.
Accepts an SRAM Object File (.sof)
and configures the target device over a JTAG interface.
Communicates with the driver to perform the following functions:
Disable upstream AER (advanced error reporting)
Table 7. program_fpga_jtag Command-Line
Specifies the .sof file
-c= ,--cable=[<cable number>]
Specifies the programmer cable.
Specifies the index of the target device in the JTAG chain.
Provides help for program_fpga_jtag script.
Note: Use the following command to
obtain the device index:
For example, consider that the command returns the following output:
03:00.0 Class ea00: Intel FPGA Device 5052 (rev 01)
The first value is the device index. Prepend 0000 to this value. In this case, your device index
is 0000:03:00.0.
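The device-index rule can be expressed as a small helper. This is a sketch under one assumption: that the prepended 0000 is the PCI domain and is joined to the bus:device.function value with a colon, as in the usual Linux PCI address form. The function name is illustrative.

```python
def device_index(listing_line: str, domain: str = "0000") -> str:
    """Build the device index from one line of the device listing.

    The first whitespace-separated field is taken as the
    bus:device.function value, and the domain is prepended to it.
    """
    fields = listing_line.split()
    if not fields:
        raise ValueError("empty listing line")
    return f"{domain}:{fields[0]}"


line = "03:00.0 Class ea00: Intel FPGA Device 5052 (rev 01)"
print(device_index(line))  # 0000:03:00.0
```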
Use the fpga-configure utility to perform
partial reconfiguration. The utility accepts a .rbf file
for a given persona and performs the following functions:
Communicates with the driver to remove device sub-drivers, if any
Communicates with the fpga-region-controller script to assert/de-assert freeze
Writes the .rbf to the Partial
Reconfiguration Controller IP core
Re-deploys the sub-drivers, if any, that are required upon successful
partial reconfiguration
To perform PR over PCI Express, run the following command:
Performs partial reconfiguration over PCI Express.
Disables the advanced error reporting on the PCI Express
link. Advanced error reporting generally
reports any critical errors along the PCI Express
link directly to the kernel. If the PCI Express
link is completely disabled, the kernel responds by crashing the system. You must
disable advanced error reporting during full chip configuration, as full chip
configuration brings down the PCI Express link.
Enables the advanced error reporting for the PCI Express
link. Use this option after full chip
configuration to ensure the integrity of the PCI Express link.
Prints the contents of a debug ROM within the reference design.
Use for debug purposes.
The example_host_uio module demonstrates the
FPGA device access. This application interacts with each persona, verifying the
contents and functionality of the personas.
The program requires a PCI Express
device number, followed by optional parameters
of the seed for generating test data, number of tests performed, and verbosity for
displaying extra information.
Table 9. example_host_uio Command-Line
Specifies the seed to use for number
generation. Default value is 1.
Allows you to print additional information
during execution. This option is disabled by default.
Allows you to specify the iterations for the
test you wish to perform. Default value is 3.
Provides help for example_host application.
Note: Running example_host_uio without any
command-line arguments uses a seed value of 1 and 3 iterations.
Signal Tap Debugging
The reference design supports signal tapping the PR regions through
hierarchical hubs. This feature facilitates the on-chip debugging of a specific
persona. The static region and the parent PR region contain the SLD agent, which
communicates with an SLD host. You must instantiate the SLD host in the persona
that you wish to analyze. You must include the .stp file in the synthesis-only revision of a given persona. If the .stp file is
specific to a persona, do not include it in the base revision or in the .qsf file of other personas.
Follow these guidelines when signal tapping the PR logic:
The reference design software applications are available in the software/util directory. Each application has a
respective sub-directory structure, with a corresponding Makefile.
To build the example application:
To compile the example_host
module, type the following from the Linux shell:
This command generates the executable within the
application's sub-directory. For example:
./example_host_uio -s 1 -n 100 -v
This command seeds the input generation with a value of 1, performs 100 iterations, and prints
more information on the current status.
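If you drive the example application from a test script, the command line above can be assembled programmatically. The following is a minimal sketch that uses only the -s, -n, and -v options shown in this section; the helper name and the executable path are assumptions, and the sketch only builds the command without running it.

```python
import shlex


def example_host_uio_cmd(seed=1, iterations=3, verbose=False,
                         exe="./example_host_uio"):
    """Build the argument list for one example_host_uio run."""
    cmd = [exe, "-s", str(seed), "-n", str(iterations)]
    if verbose:
        cmd.append("-v")
    return cmd


cmd = example_host_uio_cmd(seed=1, iterations=100, verbose=True)
print(shlex.join(cmd))  # ./example_host_uio -s 1 -n 100 -v
```

The resulting list can be passed to subprocess.run on the host machine where the executable was built.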
Programming the Design Using Example Applications
The following steps describe programming your design using the provided
Program the base revision .sof file using the programmer. Power cycle the host PC to allow
the PCI Express link to enumerate. To ensure that
the device appears as a PCI Express device,
type the following from the Linux shell:
To verify the functionality of the design, type the following
from the Linux shell:
To replace the parent PR partition in the design with any of
the following single-function PR personas, type the following from the Linux
fpga-configure -p <rbf file from list> 10000
<rbf file from list> is one of the following
Your custom top-level entity must match the ports for the custom_persona
that the source/templates/pr_logic_template.sv
defines. The following example shows interfacing with the
Avalon memory mapped interface via the PCIe register file:
module custom_persona #(
   parameter REG_FILE_IO_SIZE = 8
)(
   input wire clk,
   //Active-low reset, defined by hardware
   input wire rst_n,
   //Persona identification register, used by host in host program
   output wire [31:0] persona_id,
   //Host control register, used for control signals
   input wire [31:0] host_cntrl_register,
   // 8 registers for host -> PR logic communication
   input wire [31:0] host_pr [0:REG_FILE_IO_SIZE-1],
   // 8 registers for PR logic -> host communication
   output wire [31:0] pr_host [0:REG_FILE_IO_SIZE-1]
);
Use any of the parallel I/O port (PIO) register files
for customization. The host_pr registers
carry data from the host machine to the persona. The pr_host registers carry data from the persona to the host
machine.
In your top-level entity file, specify the persona ID as any
32-bit value:
assign persona_id = 32'h0000_aeed;
Note: The example template uses only 8 bits, but you can specify any value, up to
32 bits.
Set the unused output ports of the pr_host register to 0:
//Tying unused output ports to zero.
genvar i;
generate
   for (i = 2; i < REG_FILE_IO_SIZE; i = i + 1) begin : tie_unused
      assign pr_host[i] = 32'b0;
   end
endgenerate
Modify your persona_impl_revision_name.qsf to include the following