Sharing External Memory Bandwidth Using the Multi-Port Front-End Reference Design

This document describes the features and architecture of the Altera® Multi-Port Front-End (MPFE) reference design, details the design flow you should follow to integrate the MPFE block into your design, and illustrates the functionality of the MPFE block in an example system with multiple masters.

The MPFE reference design allows you to efficiently share access to external memory between multiple data masters in your design. Combined with the Altera DDR2 and DDR3 SDRAM Controller with UniPHY, the MPFE block allows efficient access to DDR2 or DDR3 SDRAM memories.

In this document, the terms master and slave are relative to the MPFE reference design, unless specifically referred to as a master in your system. The master port on the MPFE block connects to the slave port on the memory controller. Each slave port on the MPFE connects to the master port on the user logic block in your system that requires access to external memory.

Multi-Port Front-End Block

Multiple functional blocks in a system can share a single external memory interface. Sharing access to the same external memory interface requires an arbiter that manages memory read and write requests from multiple functional blocks. You can use Qsys or SOPC Builder system integration tools to create arbitration logic that uses a weighted round-robin scheme. However, the arbitration logic generated by these tools does not support prioritization of requests from one data master over another.

The MPFE reference design features a priority weighted round-robin arbitration scheme that allows you to control the traffic flow to and from the external memory interface. By defining which ports are critical, you can ensure that the time-critical masters in your system, such as video and audio blocks, have priority over other less time-sensitive blocks. It also allows the non-critical masters to use any available bandwidth in times when the critical masters are not requesting access. Systems that contain both high and low priority masters can use the MPFE component to efficiently share memory bandwidth. While the MPFE block is optimized for high data rate applications such as video processing designs, it also supports small, random address accesses such as from a processor.

The MPFE reference design has the following features:

- Multi-class, weighted round-robin arbiter
- Allows masters to be classified as critical and non-critical to protect your time-critical processing blocks
- Allows you to assign bandwidth weights to distribute the memory bandwidth between the masters in your system

Functional Description

Figure 1 shows a block diagram of the MPFE reference design. The block diagram shows the three different types of slave ports that are available: time-critical data slaves, non-critical data slaves, and a debug slave. You can parameterize each of the 16 available data slave ports to be a time-critical or non-critical port.

Figure 1. Block Diagram of the MPFE Reference Design

Each data slave port connects to a master in your system. You can configure the data width of 14 of the 16 ports to match the memory interface local data width. The remaining 2 ports are fixed 32-bit wide width-adapting ports. You can connect these two ports to 32-bit masters such as a Nios® II processor, and they support a burst adaptor that automatically merges multiple 32-bit bursts to a single larger burst on the memory controller interface.

Figure 2 shows an example of how you can connect the MPFE block to a video processing datapath, a direct memory access (DMA) for the on-screen display (OSD), and to a Nios II processor.

**Figure 2. MPFE Block In a Video Processing System**

The MPFE reference design has the following Avalon-MM ports:

- Master port that connects to the memory controller.
- Data slave ports that connect to the masters in the user logic.
- Debug slave port that allows access to internal counters used for tracking performance.

**Master Port**

The master port supports a maximum burst size of 64, which matches the maximum burst size supported by the Avalon-MM slave port of the DDR2 and DDR3 SDRAM controllers. You should always configure the MPFE reference design to have the same maximum burst size as your memory controller to prevent Qsys and SOPC Builder from inserting any burst adaptive logic between them.

**Data Slave Ports**

The data slave ports, with the exception of the width-adapting ports, have the same width as the master port that connects to the memory controller. The slave ports support data widths of 32, 64, 128, 256 and 512 bits. The slave ports do not accept posted writes. The reference design supports up to a maximum of 16 data slave ports.

If your design requires the memory to be shared with more than 16 masters, you can insert a pipeline bridge in the Qsys and SOPC Builder and use the bridge to connect multiple masters to one data slave port on the MPFE reference design.

**Width-Adapting Block**

Data slave ports 6 and 7 have a fixed data width of 32 bits to support fixed-width masters such as the Nios II data and instruction masters. The width-adapting block connects these ports to the shared data path, which is a power-of-two times wider than the 32-bit fixed-width port. For example, a 256-bit wide data path is 8 times wider than the 32-bit fixed-width port.

In a system where the shared slave is 32*n bits wide, each width-adapting port queues up n sequential read requests before presenting a single 32*n-bit wide read request to the arbiter. The read request then only requires a single transaction on the master port to the memory controller interface, instead of n separate requests. When the memory controller slave returns this read data, your system’s master receives the read data over n clock cycles, saving n–1 cycles of the shared memory controller slave’s bandwidth.

In a system with a 256-bit wide memory controller interface, the width-adapting slave queues 8 read requests from the 32-bit fixed-width port and issues a single read request to the memory controller. Without this aggregation, the MPFE component would have to issue 8 separate requests to the memory controller and discard 87.5% (224 of every 256 bits) of the returned data.
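
The arithmetic behind this aggregation can be sketched in a few lines of Python. This is only an illustration of the ratio described above; the function name `width_adapter_savings` and its parameters are hypothetical and are not part of the reference design.

```python
def width_adapter_savings(memory_width_bits, port_width_bits=32):
    """Return (n, wasted_fraction) for a width-adapting port.

    n is the number of narrow requests merged into one wide request.
    wasted_fraction is the share of each wide read that would be
    discarded if every narrow request were issued on its own.
    """
    if memory_width_bits % port_width_bits != 0:
        raise ValueError("memory width must be a multiple of the port width")
    n = memory_width_bits // port_width_bits
    if n & (n - 1) != 0:
        raise ValueError("width ratio must be a power of two")
    wasted_fraction = 1 - 1 / n
    return n, wasted_fraction


n, wasted = width_adapter_savings(256)
print(n, wasted)  # 8 0.875
```

For a 256-bit interface the sketch gives n = 8, so an unaggregated 32-bit read would use only one eighth of each returned word.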

For writes, a width-adapting port accepts n sequential write requests and allows the user data master to post up to n beats of write data into the port. The width-adapting port then issues the write request to the arbiter and only requires a single transaction on the master port to the memory controller interface.

The width-adapting port has a time-out mechanism to prevent the master from being locked out if it presents fewer than n read or write requests. The mechanism adds to the best-case latency for these accesses, but on average, the bandwidth savings to the system as a whole offset the effects of these accesses.

**Debug Port**

The debug port monitors the performance of the MPFE component and provides information that you can use to optimize your system. This port features an Avalon-MM register slave interface that provides access to transaction counters and latency timers that you can use to assess and tune the performance of your system. You must read the counters and timers in the debug port at least once every second to prevent them from saturating. You can clear these registers by writing to address 0x0 of the debug slave.

Each slave port has counters that record the following:
- Number of times the port gets access.
- Number of words of read or write data the port receives or sends.
- Worst case latency seen on an access since the last time the counter clears.
- Number of cycles of wait state for the port.

The master port has counters that record the following:
- Number of words of read or write data the port receives or sends.
- Number of cycles of wait state from the slave (memory controller).

**Clock Crossing**

All the ports in this design are synchronous to the shared memory controller. The slave ports do not have clock crossing logic. Clock crossing logic increases the complexity and the latency of the MPFE component. If your design requires this functionality, implement clock crossing FIFO buffers in your master components or use the Qsys or SOPC Builder tool to automatically insert these blocks.

**Arbitration**

The MPFE reference design shares access to external memory between the different accesses presented on its slave ports. The arbiter in the MPFE reference design performs the following functions:
- Divides traffic into two classes, critical and non-critical, to protect the time-critical accesses
- Shares the available bandwidth between the slave ports using user-specified bandwidth ratios

Figure 3 shows the arbitration scheme in a system with 10 masters and their corresponding bandwidth allocation settings.

**Figure 3. Arbitration Scheme**

In this arbitration scheme, each data master on the time-critical ring has a share of the available bandwidth based on the settings you specify. The arbiter cycles around the data ports, granting each the ability to issue a read or write burst of up to the largest supported burst size (maximum 64 beats) to the memory controller. The arbiter continues to go around this ring, servicing slave ports that have not exceeded their bandwidth allowance.

Whenever there are no outstanding time-critical requests, the arbiter accepts the next pending transaction from the non-critical ring. Once the non-critical transaction is serviced, the arbiter checks to see if there are any new requests on the time-critical ring and returns to service that request. If more than one request is present on the time-critical ring, the arbiter continues to service them. After the arbiter services all the pending time-critical requests, it switches back to servicing the non-critical ring.
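
The interaction between the two rings can be illustrated with a small Python model. It is a deliberately simplified sketch, not the arbiter RTL: it ignores the bandwidth weights, assumes all requests are known up front, and treats every grant as one burst. The function name `arbitrate` and the port names are hypothetical.

```python
from collections import deque


def arbitrate(critical, non_critical, pending):
    """Yield port names in grant order until no requests remain.

    critical / non_critical: lists of port names forming the two rings.
    pending: dict mapping port name -> number of outstanding bursts.
    """
    crit_ring = deque(critical)
    non_ring = deque(non_critical)
    pending = dict(pending)  # work on a copy

    def next_from(ring):
        # Walk once around the ring and grant the first port with work.
        for _ in range(len(ring)):
            port = ring[0]
            ring.rotate(-1)
            if pending.get(port, 0) > 0:
                pending[port] -= 1
                return port
        return None

    while True:
        # The time-critical ring is always checked first.
        port = next_from(crit_ring)
        if port is None:
            port = next_from(non_ring)
        if port is None:
            return
        yield port


order = list(arbitrate(["M1", "M2"], ["M7"], {"M1": 2, "M2": 1, "M7": 2}))
print(order)  # ['M1', 'M2', 'M1', 'M7', 'M7']
```

In this model the non-critical port M7 is granted only after both time-critical ports have drained their requests, which mirrors the behavior described above.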

**Sharing Bandwidth**

The MPFE arbiter uses an enhanced version of the weighted round-robin scheme which grants access to successive slave ports in medium sized blocks (up to 64 beats), but has a leaky bucket bandwidth allocation system to distribute the accesses more evenly. Granting access in medium sized blocks, ideally less than or equal to the size of the row or page in the SDRAM memory, reduces the amount of bank management that the memory controller has to do, and distributing accesses across slave ports reduces the worst-case latency. Reduced latency means you require less buffering in your design.

The ordering of grants in this system is more difficult to predict than a basic round-robin system. However, the MPFE component supports a debug register slave that enables you to observe the arbitration behavior during the operation of the system.

In the system illustrated in Figure 3, masters M1, M5, and M6 each get 8 shares of the total available bandwidth. The total number of bandwidth shares on the time-critical ring adds up to 31, and hence the M1 master is allocated approximately 26% (8/31) of the memory bandwidth. Master M2 gets 4 shares of the available bandwidth, or half of the bandwidth allocated to master M1. The masters M7 through M10 get access to the external memory when the first 6 masters are idle and not requesting memory access.
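
The share calculation for the time-critical ring can be reproduced with a short Python sketch. The weights for M1, M2, M5, and M6 and the total of 31 come from the text; the split of the remaining 3 shares between M3 and M4 (1 and 2) is an assumption made only so that the example adds up. The function name `bandwidth_shares` is hypothetical.

```python
def bandwidth_shares(weights):
    """Map each master to its fraction of the ring's total weight."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# M3 and M4 values are assumed; the other weights are taken from the text.
ring = {"M1": 8, "M2": 4, "M3": 1, "M4": 2, "M5": 8, "M6": 8}

for name, share in bandwidth_shares(ring).items():
    print(f"{name}: {share:.1%}")
```

With these weights M1 receives 8/31 of the time-critical bandwidth (about 25.8%) and M2 receives half of that.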

**Performance**

This section discusses the performance of the MPFE reference design in terms of the memory efficiency, latency, frequency of operation, and resource utilization.

**Memory Efficiency**

The memory interface efficiency you can achieve using the MPFE reference design depends on your traffic patterns. Use functional simulations or the information from the MPFE debug slave port to calculate the efficiency for your system.
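
One way to turn debug counter readings into an efficiency figure is sketched below in Python. It rests on two assumptions that the text does not state: that the master port can transfer at most one beat per clock cycle, and that you measure the elapsed cycle count yourself, because the debug register tables list no elapsed-cycle counter. The function name `memory_efficiency` and the sample values are hypothetical.

```python
def memory_efficiency(read_beats, write_beats, elapsed_cycles):
    """Fraction of clock cycles in which the master port moved data.

    Assumes at most one beat per clock cycle on the master port and an
    elapsed_cycles value measured by the caller (for example, by a timer).
    """
    if elapsed_cycles <= 0:
        raise ValueError("elapsed_cycles must be positive")
    return (read_beats + write_beats) / elapsed_cycles


# Hypothetical counter values, not measurements from a real system.
efficiency = memory_efficiency(read_beats=600, write_beats=320, elapsed_cycles=1000)
print(f"{efficiency:.0%}")  # 92%
```

The read and write beat counts correspond to the master read and write beats counters exposed by the debug slave.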

For example, the High Definition Video Reference Design (UDX3) attains an efficiency of more than 90%, as measured using the debug port. The High Definition Video Reference Design uses the MPFE block to share access to an external memory between more than 14 data masters.

For more information about the UDX3 reference design, refer to *AN 604: High Definition Video Reference Design (UDX3).*

**Latency**

Because the MPFE component adds one extra cycle of latency to any request that is serviced immediately, the minimum command latency through the component is 1 clock cycle. For read transactions, the MPFE component adds an extra three cycles of latency on the data return path; the data is transferred to the slave port through a FIFO buffer. This latency does not include the read latency of the memory controller itself.

The worst-case latency depends on the number of slave ports and the size of the bursts they request. There is one cycle of delay when the slave ports switch. You can measure the latency for a specific request in functional simulation by observing the delay between the time the slave read or write request signal asserts and the time the slave waitrequest signal deasserts or the read data valid signal asserts.

**Frequency of Operation**

The MPFE reference design operates up to 267 MHz, which matches the maximum core clock frequency of the half-rate 533-MHz DDR3 SDRAM Controller with UniPHY. The timing closure for the MPFE reference design was verified using the Quartus® II design software version 10.1 and targeting a Stratix IV C2 speed grade device (EP4SGX230KF40C2).

**Resource Utilization**

The MPFE reference design uses approximately 7,500 ALUTs, 5,000 registers, and 34 Kbits of memory when implementing all 16 slave ports with 256-bit wide data buses and the debug slave port. With the optional debug port disabled, the MPFE resource utilization drops to approximately 4,200 ALUTs and 2,500 registers.

Figure 4 shows a detailed breakdown of the resource utilization for the MPFE reference design.

Figure 4. MPFE Reference Design Resource Utilization

<table>
<thead>
<tr>
<th>Compilation Hierarchy Node</th>
<th>LC Combinational</th>
<th>LC Registers</th>
<th>Block Memory Bits</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>7528 (64)</td>
<td>4903 (298)</td>
<td>35200</td>
</tr>
<tr>
<td>25</td>
<td>1668 (1668)</td>
<td>525 (525)</td>
<td>0</td>
</tr>
<tr>
<td>26</td>
<td>3207 (3207)</td>
<td>2500 (2500)</td>
<td>0</td>
</tr>
<tr>
<td>4</td>
<td>30 (0)</td>
<td>23 (0)</td>
<td>2432</td>
</tr>
<tr>
<td>5</td>
<td>30 (9)</td>
<td>23 (0)</td>
<td>2432</td>
</tr>
<tr>
<td>6</td>
<td>30 (0)</td>
<td>23 (0)</td>
<td>2432</td>
</tr>
<tr>
<td>7</td>
<td>16 (9)</td>
<td>5 (0)</td>
<td>0</td>
</tr>
<tr>
<td>9</td>
<td>7 (7)</td>
<td>7 (7)</td>
<td>0</td>
</tr>
<tr>
<td>10</td>
<td>7 (7)</td>
<td>7 (7)</td>
<td>0</td>
</tr>
<tr>
<td>11</td>
<td>7 (7)</td>
<td>7 (7)</td>
<td>0</td>
</tr>
<tr>
<td>12</td>
<td>0 (0)</td>
<td>0 (0)</td>
<td>2432</td>
</tr>
<tr>
<td>13</td>
<td>0 (0)</td>
<td>0 (0)</td>
<td>2432</td>
</tr>
<tr>
<td>14</td>
<td>31 (9)</td>
<td>24 (0)</td>
<td>32768</td>
</tr>
<tr>
<td>15</td>
<td>31 (2)</td>
<td>24 (1)</td>
<td>32768</td>
</tr>
<tr>
<td>16</td>
<td>29 (1)</td>
<td>23 (0)</td>
<td>32768</td>
</tr>
<tr>
<td>17</td>
<td>14 (7)</td>
<td>1 (0)</td>
<td>32768</td>
</tr>
<tr>
<td>19</td>
<td>7 (7)</td>
<td>7 (7)</td>
<td>0</td>
</tr>
<tr>
<td>20</td>
<td>7 (7)</td>
<td>7 (7)</td>
<td>0</td>
</tr>
<tr>
<td>21</td>
<td>7 (7)</td>
<td>7 (7)</td>
<td>0</td>
</tr>
<tr>
<td>22</td>
<td>0 (0)</td>
<td>0 (0)</td>
<td>32768</td>
</tr>
<tr>
<td>23</td>
<td>0 (0)</td>
<td>0 (0)</td>
<td>32768</td>
</tr>
<tr>
<td>24</td>
<td>1621 (1621)</td>
<td>316 (316)</td>
<td>0</td>
</tr>
<tr>
<td>25</td>
<td>1 (1)</td>
<td>1 (1)</td>
<td>0</td>
</tr>
<tr>
<td>26</td>
<td>1 (1)</td>
<td>1 (1)</td>
<td>0</td>
</tr>
<tr>
<td>27</td>
<td>1 (1)</td>
<td>1 (1)</td>
<td>0</td>
</tr>
<tr>
<td>28</td>
<td>1 (1)</td>
<td>1 (1)</td>
<td>0</td>
</tr>
<tr>
<td>29</td>
<td>1 (1)</td>
<td>1 (1)</td>
<td>0</td>
</tr>
<tr>
<td>30</td>
<td>1 (1)</td>
<td>1 (1)</td>
<td>0</td>
</tr>
<tr>
<td>31</td>
<td>1 (1)</td>
<td>1 (1)</td>
<td>0</td>
</tr>
<tr>
<td>32</td>
<td>1 (1)</td>
<td>1 (1)</td>
<td>0</td>
</tr>
<tr>
<td>33</td>
<td>1 (1)</td>
<td>1 (1)</td>
<td>0</td>
</tr>
<tr>
<td>34</td>
<td>1 (1)</td>
<td>1 (1)</td>
<td>0</td>
</tr>
<tr>
<td>35</td>
<td>1 (1)</td>
<td>1 (1)</td>
<td>0</td>
</tr>
<tr>
<td>36</td>
<td>1 (1)</td>
<td>1 (1)</td>
<td>0</td>
</tr>
<tr>
<td>37</td>
<td>1 (1)</td>
<td>1 (1)</td>
<td>0</td>
</tr>
<tr>
<td>38</td>
<td>1 (1)</td>
<td>1 (1)</td>
<td>0</td>
</tr>
<tr>
<td>39</td>
<td>396 (396)</td>
<td>612 (612)</td>
<td>0</td>
</tr>
<tr>
<td>40</td>
<td>396 (396)</td>
<td>612 (612)</td>
<td>0</td>
</tr>
</tbody>
</table>

Design Flow

This section describes the design flow to implement the MPFE reference design in your system.

**Getting Started**

You can obtain the MPFE reference design from the MPFE_Reference_Design.zip file provided with this document. The MPFE_Reference_Design.zip file contains all the design files and a README.pdf file that describes the contents of the mpfe and design_example folders.

Unzip the **MPFE_Reference_Design.zip** file in the working directory you designate for this project. Figure 5 shows the directory structure and contents.

**Figure 5. Directory Structure**

- `<path>` Installation directory.
- `design_example` Contains the design example system files for the MPFE reference design and simulation environment files.
- `mpfe` Contains the component RTL and GUI Tcl files.

The `design_example` folder contains an example system that uses the MPFE reference design to share access to a UniPHY-based DDR3 SDRAM controller with multiple instances of the Traffic Generator and Built-in Self Test (BIST) Engine module. This folder also contains the simulation environment you can use to understand and verify functionality of the MPFE reference design.

**Instantiating the MPFE Reference Design**

You can implement the MPFE reference design using the Qsys or SOPC Builder system integration tool, or as a standalone component in your RTL design.

To instantiate the MPFE reference design, you need to add the MPFE RTL files to your project's Qsys or SOPC Builder library. To add the MPFE reference design, create a folder named `ip` in your project directory and copy the `mpfe` folder to this new `<project_dir>/ip` directory. The MPFE reference design will be visible in the Component Library under the Project>Memories and Memory Controllers category when you restart Qsys or SOPC Builder.

Alternatively, you can add the MPFE reference design to your Quartus II installation directory for use in multiple projects, or add the MPFE reference design to the Qsys IP Search Path under the Tools>Options menu. With these methods, the MPFE reference design will be visible in the Component Library under the Library>Memories and Memory Controllers category as Multi-Port Front-End when you restart Qsys or SOPC Builder.

**MPFE Reference Design Parameter Settings**

The MPFE parameter editor has the following tabs:

- General Settings
- Bandwidth Settings
- Critical Ports

Figure 6 shows the General Settings tab in the MPFE parameter editor.

**Figure 6. General Settings Tab**

![General Settings Tab](image)

**General Settings**

The General Settings tab allows you to configure the number of ports available, the width of the ports, and the number of address bits for the ports. You can use the MPFE parameter editor to enable the debug slave port.

Table 1 lists the parameters for the General Settings tab.

**Table 1. General Settings Parameters**

<table>
<thead>
<tr>
<th>Parameter</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>Number of slave ports required</td>
<td>Choose the number of slave ports to enable. If you require any width-adapting ports, you must choose at least 7 because only ports 6 and 7 are width-adapting ports. The width of all the other ports is configurable to match the memory controller data width.</td>
</tr>
<tr>
<td>Maximum supported burst count</td>
<td>Choose the maximum burst count that the MPFE reference design supports. For best results, set the maximum burst count of the arbiter and the SDRAM memory controller to the same value as your masters.</td>
</tr>
<tr>
<td>Enable debug slave port</td>
<td>Add a debug slave port that allows you to read data about the performance of the MPFE. Refer to “Debug Slave Registers” for the address map of the debug slave.</td>
</tr>
</tbody>
</table>

**Bandwidth Settings**

Figure 7 shows the Bandwidth Settings tab in the MPFE parameter editor.

The Bandwidth Settings tab allows you to control the ratio of the bandwidths given to each port. By setting weights for each port, you can restrict the frequency of access allowed for each port to the external memory interface. If you assign larger numbers to a port, that port gets a larger proportion of the external memory bandwidth, while smaller numbers allow a port less bandwidth. The supported bandwidth settings are powers of two, from 1 through 512.

For example, in Figure 7, if slave 0 has a bandwidth setting of 8 and slave 1 has a setting of 4, the arbiter allocates slave 0 approximately twice the bandwidth of slave 1.
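
The constraint on the settings and the resulting ratio can be checked with a short Python sketch. The helper names `check_weights` and `bandwidth_ratio`, and the port labels, are hypothetical; only the rule that settings are powers of two from 1 through 512 comes from the text.

```python
# Supported bandwidth settings: powers of two from 1 through 512.
SUPPORTED_WEIGHTS = [2 ** k for k in range(10)]


def check_weights(weights):
    """Raise ValueError if any port uses an unsupported bandwidth setting."""
    bad = {port: w for port, w in weights.items() if w not in SUPPORTED_WEIGHTS}
    if bad:
        raise ValueError(f"unsupported bandwidth settings: {bad}")
    return weights


def bandwidth_ratio(weights, port_a, port_b):
    """Approximate bandwidth of port_a relative to port_b."""
    return weights[port_a] / weights[port_b]


settings = check_weights({"slave0": 8, "slave1": 4})
print(bandwidth_ratio(settings, "slave0", "slave1"))  # 2.0
```

The ratio is only approximate in practice, because the bandwidth a port actually receives also depends on how often it requests access.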

**Critical Ports**

Figure 8 shows the **Critical Ports** tab in the MPFE parameter editor.

**Figure 8. Critical Ports Tab**

The **Critical Ports** tab allows you to specify the ports in the system as time-critical or otherwise. The arbiter always grants access to the critical ports over those which are not critical. Each time-critical port gets the share of the external memory bandwidth specified on the **Bandwidth Settings** tab. When the arbiter does not receive any more requests from the time-critical ports, the non-critical ports then get access to the external memory. The non-critical ports continue getting access until a critical port starts requesting again.

**Configuring and Generating the Qsys System**

To configure and generate the Qsys system, follow these steps:

1. Connect the MPFE reference design ports to your masters and memory controller. To connect the ports, follow these steps:
   a. Connect the master port of the MPFE reference design to the slave port of the memory controller.
   b. Connect the slave ports of the MPFE reference design to the master ports in your design.
   c. Connect the clock and reset output from the memory controller to the MPFE reference design and to your masters.

2. Export the appropriate conduit interfaces. The conduit interfaces include `memory_phy` and `other` on the UniPHY-based memory controller to enable connections to the external memory device.

3. Click the **System Inspector** tab to verify component properties and connections.
4. Check if there are any warning messages regarding address ranges. Select System>Auto-Assign Base Addresses if appropriate.

5. In the Clock Settings tab, specify the PLL reference clock frequency.

6. In the Project Settings tab, set Limit interconnect pipeline stages to 0 if you need to match the SOPC Builder functionality. Otherwise, use the default setting of 1 to achieve higher fabric fMAX.

7. In the Generation tab, turn on the Create Verilog simulation model option and verify the output directory path.

Currently Qsys does not support the creation of a testbench system or generic memory model in the Quartus II software version 10.1, and you must manually create a testbench for functional simulation. The design example provided with the MPFE reference design includes an example testbench and memory model for your reference.

8. Select Generate.

Using Standalone RTL Flow

To instantiate the MPFE reference design in a standalone design, instantiate the mpfe_top module in your design and add all the component RTL files to your project. However, there are a few key RTL modifications required for this design flow. To modify the RTL flow, follow these steps:

1. Use the RTL parameter settings of the mpfe_top module to specify the data width, address width, bandwidth and critical port settings.

2. Connect the mst_* signals on the MPFE reference design to the avl_* signals of the DDR3 memory controller. You must connect the mst_burstcount signal to the avl_size signal.

3. Invert the avl_ready output from the memory controller and connect it to the mst_waitrequest signal.

You must invert the avl_ready output because the polarity of the signals is reversed.

4. Edit the mpfe_top.v module to convert the mst_burst_begin signal from an internal wire to an output port, and connect the signal to the avl_burstbegin port on the memory controller. When implemented in Qsys or SOPC Builder, the interconnect fabric generates the burstbegin signal. However, in the RTL flow, the MPFE reference design master port needs to generate the burstbegin signal for the memory controller.

5. Connect the MPFE reference design ports to your masters and memory controller.

Simulating the System

To simulate your system, you must create a testbench for functional simulation.

1. Instantiate the Qsys system, or top level RTL module.
2. Instantiate the memory model. You can obtain the memory model file from the memory vendor.

   Qsys currently does not generate a generic memory model. Use the vendor model, or obtain a generic model by using the MegaWizard™ Plug-In Manager flow for UniPHY-based DDR2 and DDR3 controllers.

3. Create clock and reset signals for the system.
4. Edit the Qsys-generated simulation setup script to include the testbench and memory model files, and load the testbench system.

   The simulation supports only Verilog HDL, and the generated script targets ModelSim®.

For more information, refer to the testbench file, my_qsys_mpfe_system_tb.v, and simulation script, mti_setup.tcl, provided with the design example. These files are located in the design_example\my_qsys_mpfe_system\sim_verilog folder.

Using the Design Example

The design example provided with this document uses the MPFE reference design to share the external memory access between four data masters. The design example implements the data masters using four separate instances of Altera’s Traffic Generator and BIST Engine megafunction. These data masters model access patterns from different functional blocks of a typical system.

Figure 9 shows the example system contents and connections in Qsys.

In Figure 9, masters 0 and 1 are critical data masters in the system, and masters 2 and 3 are non-critical masters.

The bandwidth settings for masters 0 and 1 are 8 and 4, which allocate 67% of the memory bandwidth to master 0, and the remaining 33% of bandwidth to master 1. With these settings, the MPFE arbiter typically processes two requests from master 0 for each request from master 1. Because masters 2 and 3 are non-critical ports, the MPFE arbiter only processes requests from these masters when there are no pending requests from masters 0 and 1. You can observe the operation of the MPFE arbiter during functional simulation (Transcript window messages), and in the hardware (mpfe/arbiter/arb_grant signals).

Refer to the README.pdf file for more information.

You can use the design example to understand the implementation and functionality of the MPFE reference design.

To examine the implementation of the MPFE reference design, follow these steps:

1. Open the Quartus II project file in the design_example folder.
2. Open the my_qsys_mpfe_system.qsys file from the File > Open or Tools > Qsys menu selection.
3. View or edit the MPFE reference design settings by selecting mpfe from the System Contents tab.
4. Modify the bandwidth settings and critical port settings, regenerate the system, and use simulation to examine the MPFE functionality.

Figure 9. Example System Contents and Connections in Qsys

To simulate the design example in ModelSim:
1. Open ModelSim and change the working directory to \design_example\my_qsys_mpfe_system\sim_verilog.
2. Execute the mti_setup.tcl script from the Transcript window.
3. Load the design, open the waveform viewer, and run the simulation by executing the following commands in the Transcript window:
   ```
   ld
   do wave.do
   run -all
   ```
4. Examine results in the Waveform and Transcript windows.

### Debug Slave Registers

Table 2 lists the registers tracking the MPFE master ports.

**Table 2. Master Registers**

<table>
<thead>
<tr>
<th>Register</th>
<th>Address</th>
<th>Size</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>Clear counters</td>
<td>0x0</td>
<td>32 bits</td>
<td>Write to this address to clear all the counters.</td>
</tr>
<tr>
<td>Master wait count</td>
<td>0x1</td>
<td>32 bits</td>
<td>The number of cycles that the slaves have wait-stated for the master.</td>
</tr>
<tr>
<td>Master write beats count</td>
<td>0x2</td>
<td>32 bits</td>
<td>The number of words of write data that the master has sent.</td>
</tr>
<tr>
<td>Master read beats count</td>
<td>0x3</td>
<td>32 bits</td>
<td>The number of words of read data that the master has received.</td>
</tr>
</tbody>
</table>

Table 3 lists registers tracking the MPFE slave ports.

**Table 3. Slave Registers**

<table>
<thead>
<tr>
<th>Register</th>
<th>Address</th>
<th>Size</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>Per-slave grant count</td>
<td>Slave offset + 0x0</td>
<td>32 bits</td>
<td>The number of times that this slave has been granted access.</td>
</tr>
<tr>
<td>Per-slave write count</td>
<td>Slave offset + 0x1</td>
<td>32 bits</td>
<td>The number of words of write data that this slave has sent.</td>
</tr>
<tr>
<td>Per-slave read count</td>
<td>Slave offset + 0x2</td>
<td>32 bits</td>
<td>The number of words of read data that this slave has received.</td>
</tr>
<tr>
<td>Per-slave worst wait</td>
<td>Slave offset + 0x3</td>
<td>10 bits</td>
<td>The worst-case number of cycles that this slave has had to wait between requesting and being granted.</td>
</tr>
<tr>
<td>Per-slave total wait count</td>
<td>Slave offset + 0x4</td>
<td>32 bits</td>
<td>The total number of cycles that this slave has had to wait between requesting and being granted.</td>
</tr>
</tbody>
</table>

This set of registers is the same for each slave in your system. To access the registers for any given slave, use the following formula to calculate the per-slave offset:

\[
\text{Per-slave offset} = \texttt{0x10} + (\texttt{0x8} \times \text{slave number})
\]

Use the following formula to calculate the required address for a particular register:

\[
\text{Address} = \texttt{0x10} + (\texttt{0x8} \times \text{slave number}) + \text{register address}
\]

For example, to access the worst-case wait count (0x3) for slave 4:

\[
\text{Address} = \texttt{0x10} + (\texttt{0x8} \times 4) + \texttt{0x3} = \texttt{0x33}
\]

Note: The addresses listed are word addresses. If you read them using the Nios II console or System Console, convert them to byte addresses (multiply by 4).
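
The address calculation can be captured in a small Python helper, for example when scripting debug reads. This is a sketch under stated assumptions: the register names in `SLAVE_REGISTERS` and the function name `slave_register_address` are hypothetical labels for the Table 3 entries, and the slave number is assumed to range from 0 through 15.

```python
SLAVE_BASE = 0x10    # word address of the first per-slave register block
SLAVE_STRIDE = 0x8   # word-address spacing between per-slave blocks

# Register offsets from Table 3; the key names are illustrative only.
SLAVE_REGISTERS = {
    "grant_count": 0x0,
    "write_count": 0x1,
    "read_count": 0x2,
    "worst_wait": 0x3,
    "total_wait": 0x4,
}


def slave_register_address(slave, register, byte_address=False):
    """Return the debug-slave address of a per-slave register."""
    if not 0 <= slave < 16:
        raise ValueError("slave number must be in the range 0 to 15")
    word = SLAVE_BASE + SLAVE_STRIDE * slave + SLAVE_REGISTERS[register]
    # Byte addresses are word addresses multiplied by 4.
    return word * 4 if byte_address else word


print(hex(slave_register_address(4, "worst_wait")))                     # 0x33
print(hex(slave_register_address(4, "worst_wait", byte_address=True)))  # 0xcc
```

The first call reproduces the worked example for slave 4, and the second applies the word-to-byte conversion from the note.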

### Document Revision History

Table 4 shows the revision history for this document.

<table>
<thead>
<tr>
<th>Date</th>
<th>Version</th>
<th>Changes</th>
</tr>
</thead>
<tbody>
<tr>
<td>January 2011</td>
<td>1.0</td>
<td>Initial release.</td>
</tr>
</tbody>
</table>