Arria V Device Handbook: Volume 1: Device Interfaces and Integration
1. Logic Array Blocks and Adaptive Logic Modules in Arria V Devices
This chapter describes the features of the logic array block (LAB) in the Arria® V core fabric.
The LAB is composed of basic building blocks known as adaptive logic modules (ALMs) that you can configure to implement logic functions, arithmetic functions, and register functions.
You can use a quarter of the available LABs in the Arria® V devices as a memory LAB (MLAB).
The Intel® Quartus® Prime software and other supported third-party synthesis tools, in conjunction with parameterized functions such as the library of parameterized modules (LPM), automatically choose the appropriate mode for common functions such as counters, adders, subtractors, and arithmetic functions.
This chapter contains the following sections:
- LAB
- ALM Operating Modes
1.1. LAB
The LABs are configurable logic blocks that consist of a group of logic resources. Each LAB contains dedicated logic for driving control signals to its ALMs.
MLAB is a superset of the LAB and includes all the LAB features.
1.1.1. MLAB
Each MLAB supports a maximum of 640 bits of simple dual-port SRAM.
You can configure each ALM in an MLAB in the following configurations:
- A 32 x 2 memory block, resulting in a configuration of 32 x 20 simple dual-port SRAM block for Arria® V GX, GT, SX, and ST devices
- Either a 64 × 1 or a 32 × 2 block, resulting in a configuration of either a 64 × 10 or a 32 × 20 simple dual-port SRAM block for Arria® V GZ devices
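The arithmetic behind these configurations can be checked with a short sketch. The function name is illustrative only, and the figure of ten ALMs per MLAB is taken from the description of the LAB elsewhere in this handbook.

```python
ALMS_PER_MLAB = 10  # each LAB, and therefore each MLAB, holds ten ALMs


def mlab_config(alm_depth, alm_width):
    """Return (depth, width, total_bits) of the MLAB built from ten ALMs.

    Every ALM contributes alm_width bits to each word, so the MLAB keeps
    the ALM depth and multiplies the width by ten.
    """
    width = alm_width * ALMS_PER_MLAB
    return alm_depth, width, alm_depth * width


print(mlab_config(32, 2))  # 32 x 2 per ALM
print(mlab_config(64, 1))  # 64 x 1 per ALM (Arria V GZ devices)
```

Both configurations come to the same 640-bit maximum per MLAB.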
1.1.2. Local and Direct Link Interconnects
Each LAB can drive 30 ALMs through fast-local and direct-link interconnects. Ten ALMs are in any given LAB and ten ALMs are in each of the adjacent LABs.
The local interconnect drives the ALMs in its own LAB. It is in turn driven by column and row interconnects and by the outputs of ALMs in the same LAB.
Neighboring LABs, MLABs, M20K and M10K blocks, or digital signal processing (DSP) blocks from the left or right can also drive the LAB’s local interconnect using the direct link connection.
The direct link connection feature minimizes the use of row and column interconnects, providing higher performance and flexibility.
1.1.3. LAB Control Signals
Each LAB contains dedicated logic for driving the control signals to its ALMs, and has two unique clock sources and three clock enable signals.
The LAB control block generates up to three clocks using the two clock sources and three clock enable signals. An inverted clock source is considered as an individual clock source. Each clock and the clock enable signals are linked.
De-asserting the clock enable signal turns off the corresponding LAB-wide clock.
1.1.4. ALM Resources
One ALM contains four programmable registers. Each register has the following ports:
- Data
- Clock
- Synchronous and asynchronous clear
- Synchronous load
Global signals, general-purpose I/O (GPIO) pins, or any internal logic can drive the clock and clear control signals of an ALM register.
GPIO pins or internal logic drive the clock enable signal.
For combinational functions, the registers are bypassed and the output of the look-up table (LUT) drives directly to the outputs of an ALM.
1.1.5. ALM Output
The general routing outputs in each ALM drive the local, row, and column routing resources. Two ALM outputs can drive column, row, or direct link routing connections, and one of these ALM outputs can also drive local interconnect resources.
The LUT, adder, or register output can drive the ALM outputs. The LUT or adder can drive one output while the register drives another output.
Register packing improves device utilization by allowing unrelated register and combinational logic to be packed into a single ALM. Another mechanism to improve fitting is to allow the register output to feed back into the look-up table (LUT) of the same ALM so that the register is packed with its own fan-out LUT. The ALM can also drive out registered and unregistered versions of the LUT or adder output.
1.2. ALM Operating Modes
The Arria® V ALM operates in any of the following modes:
- Normal mode
- Extended LUT mode
- Arithmetic mode
- Shared arithmetic mode
1.2.1. Normal Mode
Normal mode allows two functions to be implemented in one Arria® V ALM, or a single function of up to six inputs.
Up to eight data inputs from the LAB local interconnect are inputs to the combinational logic.
The ALM can support certain combinations of completely independent functions and various combinations of functions that have common inputs.
1.2.2. Extended LUT Mode
In this mode, if the 7-input function is unregistered, the unused eighth input is available for register packing.
Functions that fit into the template, as shown in the following figure, often appear in designs as “if-else” statements in Verilog HDL or VHDL code.
1.2.3. Arithmetic Mode
The ALM in arithmetic mode uses two sets of two 4-input LUTs along with two dedicated full adders.
The dedicated adders allow the LUTs to perform pre-adder logic; therefore, each adder can add the output of two 4-input functions.
The ALM supports simultaneous use of the adder’s carry output along with combinational logic outputs. The adder output is ignored in this operation.
Using the adder with the combinational logic output provides resource savings of up to 50% for functions that can use this mode.
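As a rough behavioral illustration of one adder bit in arithmetic mode, the sketch below feeds a full adder with the outputs of two 4-input LUTs. It is a simplified model, not a description of the actual ALM wiring: the function names are hypothetical, and both LUTs are assumed to see the same four inputs.

```python
def lut4(truth_table, a, b, c, d):
    """Evaluate a 4-input LUT; truth_table is a 16-bit integer mask."""
    index = (d << 3) | (c << 2) | (b << 1) | a
    return (truth_table >> index) & 1


def adder_bit(tt_x, tt_y, inputs, carry_in):
    """One dedicated full adder fed by two 4-input LUT outputs.

    inputs is a tuple (a, b, c, d) of bits. Returns (sum, carry_out).
    """
    x = lut4(tt_x, *inputs)
    y = lut4(tt_y, *inputs)
    total = x + y + carry_in
    return total & 1, total >> 1


AND_AB = 0x8888  # LUT mask for a AND b (c and d ignored)
XOR_AB = 0x6666  # LUT mask for a XOR b (c and d ignored)

print(adder_bit(AND_AB, XOR_AB, (1, 1, 0, 0), 0))
print(adder_bit(AND_AB, XOR_AB, (1, 0, 0, 0), 1))
```

The LUTs act as pre-adder logic: the adder sums whatever two functions the masks encode, plus the incoming carry.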
Carry Chain
The carry chain provides a fast carry function between the dedicated adders in arithmetic or shared arithmetic mode.
The two-bit carry select feature in Arria® V devices halves the propagation delay of carry chains within the ALM. Carry chains can begin in either the first ALM or the sixth ALM in a LAB. The final carry-out signal is routed to an ALM, where it is fed to local, row, or column interconnects.
To avoid routing congestion in one small area of the device when a high fan-in arithmetic function is implemented, the LAB can support carry chains that only use either the top half or bottom half of the LAB before connecting to the next LAB. This leaves the other half of the ALMs in the LAB available for implementing narrower fan-in functions in normal mode. Carry chains that use the top five ALMs in the first LAB carry into the top half of the ALMs in the next LAB in the column. Carry chains that use the bottom five ALMs in the first LAB carry into the bottom half of the ALMs in the next LAB within the column. You can bypass the top-half of the LAB columns and bottom-half of the MLAB columns.
The Intel® Quartus® Prime Compiler creates carry chains longer than 20 ALMs (10 ALMs in arithmetic or shared arithmetic mode) by linking LABs together automatically. For enhanced fitting, a long carry chain runs vertically, allowing fast horizontal connections to the TriMatrix memory and DSP blocks. A carry chain can continue as far as a full column.
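To get a feel for how far a carry chain extends down a column, the following sketch estimates the number of LABs it occupies. It assumes two adder bits per ALM (the two dedicated full adders), ten ALMs per LAB, and a chain that starts at the first ALM of the LAB or of the half it uses. The function name is hypothetical, and the Compiler's actual placement may differ.

```python
import math

ALMS_PER_LAB = 10
BITS_PER_ALM = 2  # two dedicated full adders per ALM


def labs_spanned(chain_bits, half_lab=False):
    """Estimate how many LABs in a column a carry chain occupies.

    half_lab=True models a chain restricted to the top or bottom five
    ALMs of every LAB it passes through.
    """
    alms_per_lab = ALMS_PER_LAB // 2 if half_lab else ALMS_PER_LAB
    alms_needed = math.ceil(chain_bits / BITS_PER_ALM)
    return math.ceil(alms_needed / alms_per_lab)


print(labs_spanned(20))
print(labs_spanned(64))
print(labs_spanned(64, half_lab=True))
```

Restricting the chain to half of each LAB lengthens it vertically but leaves the other five ALMs of every LAB free for normal-mode logic.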
1.2.4. Shared Arithmetic Mode
The ALM in shared arithmetic mode can implement a 3-input add in the ALM.
This mode configures the ALM with four 4-input LUTs. Each LUT either computes the sum of three inputs or the carry of three inputs. The output of the carry computation is fed to the next adder using a dedicated connection called the shared arithmetic chain.
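The sum-and-carry split described above is the familiar carry-save technique. The sketch below models it on whole integers rather than on individual LUTs. The function name is illustrative, and the final `+` stands in for the dedicated adders.

```python
def three_input_add(a, b, c, width):
    """Add three unsigned width-bit integers the carry-save way.

    Per bit, one LUT function produces the sum (XOR of the three inputs)
    and another the carry (majority of the three inputs). The carry is
    shifted one position left, which is the role played by the shared
    arithmetic chain, and then added to the sum bits.
    """
    mask = (1 << width) - 1
    a, b, c = a & mask, b & mask, c & mask
    sum_bits = a ^ b ^ c
    carry_bits = (a & b) | (a & c) | (b & c)
    return sum_bits + (carry_bits << 1)


print(three_input_add(5, 6, 7, width=4))
```

Because the three operands are reduced to two before the adder, a tree that sums many operands needs fewer adder stages, which is where the resource saving for adder trees comes from.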
Shared Arithmetic Chain
The shared arithmetic chain available in enhanced arithmetic mode allows the ALM to implement a 3-input adder. This significantly reduces the resources necessary to implement large adder trees or correlator functions.
The shared arithmetic chain can begin in either the first or sixth ALM in a LAB.
Similar to carry chains, the top and bottom half of the shared arithmetic chains in alternate LAB columns can be bypassed. This capability allows the shared arithmetic chain to cascade through half of the ALMs in a LAB while leaving the other half available for narrower fan-in functionality. LAB columns are top-half bypassable, while MLAB columns are bottom-half bypassable.
The Intel® Quartus® Prime Compiler creates shared arithmetic chains longer than 20 ALMs (10 ALMs in arithmetic or shared arithmetic mode) by linking LABs together automatically. To enhance fitting, a long shared arithmetic chain runs vertically, allowing fast horizontal connections to the TriMatrix memory and DSP blocks. A shared arithmetic chain can continue as far as a full column.
1.3. Logic Array Blocks and Adaptive Logic Modules in Arria V Devices Revision History
Date | Version | Changes |
---|---|---|
December 2016 | 2016.12.09 | Added description on clock source in the LAB Control Signals section. |
December 2015 | 2015.12.21 | Changed instances of Quartus II to Quartus Prime. |
January 2014 | 2014.01.10 | Added multiplexers for the bypass paths and register outputs in the following diagrams: |
May 2013 | 2013.05.06 | |
November 2012 | 2012.11.19 | |
June 2012 | 2.0 | Updated for the Quartus II software v12.0 release: |
November 2011 | 1.1 | Restructured chapter. |
May 2011 | 1.0 | Initial release. |
2. Embedded Memory Blocks in Arria V Devices
The embedded memory blocks in the devices are flexible and designed to provide an optimal amount of small- and large-sized memory arrays to fit your design requirements.
2.1. Types of Embedded Memory
The Arria® V devices contain two types of memory blocks:
- 20 Kb M20K or 10 Kb M10K blocks—blocks of dedicated memory resources. The M20K and M10K blocks are ideal for larger memory arrays while still providing a large number of independent ports.
- 640 bit memory logic array blocks (MLABs)—enhanced memory blocks that are configured from dual-purpose logic array blocks (LABs). The MLABs are ideal for wide and shallow memory arrays. The MLABs are optimized for implementation of shift registers for digital signal processing (DSP) applications, wide shallow FIFO buffers, and filter delay lines. Each MLAB is made up of ten adaptive logic modules (ALMs). In the Arria® V devices, you can configure these ALMs as ten 32 x 2 blocks, giving you one 32 x 20 simple dual-port SRAM block per MLAB. You can also configure these ALMs, in Arria® V GZ devices, as ten 64 x 1 blocks, giving you one 64 x 10 simple dual-port SRAM block per MLAB.
2.1.1. Embedded Memory Capacity in Arria V Devices
Variant | Member Code | M20K Block | M20K RAM Bit (Kb) | M10K Block | M10K RAM Bit (Kb) | MLAB Block | MLAB RAM Bit (Kb) | Total RAM Bit (Kb) |
---|---|---|---|---|---|---|---|---|
Arria V GX | A1 | — | — | 800 | 8,000 | 741 | 463 | 8,463 |
Arria V GX | A3 | — | — | 1,051 | 10,510 | 1,538 | 961 | 11,471 |
Arria V GX | A5 | — | — | 1,180 | 11,800 | 1,877 | 1,173 | 12,973 |
Arria V GX | A7 | — | — | 1,366 | 13,660 | 2,317 | 1,448 | 15,108 |
Arria V GX | B1 | — | — | 1,510 | 15,100 | 2,964 | 1,852 | 16,952 |
Arria V GX | B3 | — | — | 1,726 | 17,260 | 3,357 | 2,098 | 19,358 |
Arria V GX | B5 | — | — | 2,054 | 20,540 | 4,052 | 2,532 | 23,072 |
Arria V GX | B7 | — | — | 2,414 | 24,140 | 4,650 | 2,906 | 27,046 |
Arria V GT | C3 | — | — | 1,051 | 10,510 | 1,538 | 961 | 11,471 |
Arria V GT | C7 | — | — | 1,366 | 13,660 | 2,317 | 1,448 | 15,108 |
Arria V GT | D3 | — | — | 1,726 | 17,260 | 3,357 | 2,098 | 19,358 |
Arria V GT | D7 | — | — | 2,414 | 24,140 | 4,650 | 2,906 | 27,046 |
Arria V GZ | E1 | 585 | 11,700 | — | — | 4,151 | 2,594 | 14,294 |
Arria V GZ | E3 | 957 | 19,140 | — | — | 6,792 | 4,245 | 23,385 |
Arria V GZ | E5 | 1,440 | 28,800 | — | — | 7,548 | 4,718 | 33,518 |
Arria V GZ | E7 | 1,700 | 34,000 | — | — | 8,490 | 5,306 | 39,306 |
Arria V SX | B3 | — | — | 1,729 | 17,290 | 3,223 | 2,014 | 19,304 |
Arria V SX | B5 | — | — | 2,282 | 22,820 | 4,253 | 2,658 | 25,478 |
Arria V ST | D3 | — | — | 1,729 | 17,290 | 3,223 | 2,014 | 19,304 |
Arria V ST | D5 | — | — | 2,282 | 22,820 | 4,253 | 2,658 | 25,478 |
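The capacity figures in this table follow from the block counts. The sketch below reproduces the arithmetic for two rows, under two assumptions: that 1 Kb is 1,024 bits, and that the MLAB figure is rounded to the nearest whole Kb. The function name is illustrative.

```python
MLAB_BITS = 640  # maximum simple dual-port SRAM per MLAB


def ram_kb(block_count, block_kb, mlab_count):
    """Return (dedicated block Kb, MLAB Kb, total Kb) for one device.

    block_kb is 10 for M10K devices and 20 for M20K devices. The MLAB
    capacity is rounded to a whole number of Kb (assumed rounding).
    """
    block_total = block_count * block_kb
    mlab_total = round(mlab_count * MLAB_BITS / 1024)
    return block_total, mlab_total, block_total + mlab_total


print(ram_kb(800, 10, 741))    # Arria V GX A1
print(ram_kb(585, 20, 4151))   # Arria V GZ E1
```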
2.2. Embedded Memory Design Guidelines for Arria V Devices
There are several considerations that require your attention to ensure the success of your designs. Unless noted otherwise, these design guidelines apply to all variants of this device family.
2.2.1. Guideline: Consider the Memory Block Selection
The Intel® Quartus® Prime software automatically partitions the user-defined memory into the memory blocks based on your design's speed and size constraints. For example, the Intel® Quartus® Prime software may spread out the memory across multiple available memory blocks to increase the performance of the design.
To assign the memory to a specific block size manually, use the RAM IP core in the IP Catalog.
For the memory logic array blocks (MLAB), you can implement single-port SRAM through emulation using the Intel® Quartus® Prime software. Emulation results in minimal additional use of logic resources.
Because of the dual-purpose architecture of the MLAB, only data input and output registers are available in the block. The MLABs gain read address registers from the ALMs. However, the write address and read data registers are internal to the MLABs.
2.2.2. Guideline: Implement External Conflict Resolution
In the true dual-port RAM mode, you can perform two write operations to the same memory location. However, the memory blocks do not have internal conflict resolution circuitry. To avoid unknown data being written to the address, implement external conflict resolution logic to the memory block.
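One possible shape for such conflict resolution logic is a fixed-priority arbiter, sketched behaviorally below. The policy (port A wins) and the function name are assumptions made for illustration; the handbook does not prescribe a particular scheme.

```python
def resolve_writes(write_a, write_b):
    """Filter two same-cycle write requests so they never hit one address.

    Each request is None or an (address, data) tuple. Fixed priority
    (port A wins) is an arbitrary policy chosen for this sketch.
    Returns the requests that may be forwarded to the memory block.
    """
    if write_a is not None and write_b is not None and write_a[0] == write_b[0]:
        return write_a, None  # suppress port B on a collision
    return write_a, write_b


print(resolve_writes((3, 0xAA), (3, 0x55)))  # collision: only port A writes
print(resolve_writes((3, 0xAA), (4, 0x55)))  # different addresses: both write
```

In hardware, the same decision would typically gate the write enable of the lower-priority port.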
2.2.3. Guideline: Customize Read-During-Write Behavior
Customize the read-during-write behavior of the memory blocks to suit your design requirements.
2.2.3.1. Same-Port Read-During-Write Mode
The same-port read-during-write mode applies to a single-port RAM or the same port of a true dual-port RAM.
Output Mode | Memory Type | Description |
---|---|---|
"new data" (flow-through) | M20K, M10K | The new data is available on the rising edge of the same clock cycle on which the new data is written. |
"don't care" | M10K, MLAB | The RAM outputs "don't care" values for a read-during-write operation. |
2.2.3.2. Mixed-Port Read-During-Write Mode
The mixed-port read-during-write mode applies to simple and true dual-port RAM modes where two ports perform read and write operations on the same memory address using the same clock—one port reading from the address, and the other port writing to it.
Output Mode | Memory Type | Description |
---|---|---|
"new data" | MLAB | A read-during-write operation to different ports causes the MLAB registered output to reflect the “new data” on the next rising edge after the data is written to the MLAB memory. This mode is available only if the output is registered. |
"old data" | M20K, M10K, MLAB | A read-during-write operation to different ports causes the RAM output to reflect the “old data” value at the particular address. For MLAB, this mode is available only if the output is registered. |
"don't care" | M20K, M10K, MLAB | The RAM outputs “don’t care” or “unknown” value. |
"constrained don't care" | MLAB | The RAM outputs “don’t care” or “unknown” value. The Intel® Quartus® Prime software analyzes the timing between write and read operations in the MLAB. |
In the dual-port RAM mode, the mixed-port read-during-write operation is supported if the input registers have the same clock. The output value during the operation is “unknown.”
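The difference between the output modes can be pictured with a small behavioral model. It is a sketch only: it ignores output-register timing, uses `None` to stand for an unknown value, and the function name is hypothetical.

```python
def mixed_port_read(mem, write_addr, write_data, read_addr, mode):
    """Perform one write and one read in the same clock cycle.

    mode is "old data", "new data", "don't care", or
    "constrained don't care". mem is a list that is updated in place.
    Returns the value seen on the read port.
    """
    old_value = mem[read_addr]
    mem[write_addr] = write_data
    if read_addr != write_addr:
        return old_value      # no collision: ordinary read
    if mode == "old data":
        return old_value      # contents before the write
    if mode == "new data":
        return write_data     # data being written
    return None               # both "don't care" modes: unknown output


mem = [0] * 32
mem[5] = 11
print(mixed_port_read(mem, 5, 99, 5, "old data"))
print(mixed_port_read(mem, 5, 42, 5, "new data"))
print(mixed_port_read(mem, 5, 7, 5, "don't care"))
```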
2.2.4. Guideline: Consider Power-Up State and Memory Initialization
Consider the power-up state of the different types of memory blocks if you are designing logic that evaluates the initial power-up values, as listed in the following table.
Memory Type | Output Registers | Power Up Value |
---|---|---|
MLAB | Used | Zero (cleared) |
MLAB | Bypassed | Read memory contents |
M20K, M10K | Used | Zero (cleared) |
M20K, M10K | Bypassed | Zero (cleared) |
By default, the Intel® Quartus® Prime software initializes the RAM cells in Arria® V devices to zero unless you specify a .mif.
All memory blocks support initialization with a .mif. You can create .mif files in the Intel® Quartus® Prime software and specify their use with the RAM IP core when you instantiate a memory in your design. Even if a memory is pre-initialized (for example, using a .mif), it still powers up with its output cleared.
2.2.5. Guideline: Control Clocking to Reduce Power Consumption
Reduce AC power consumption in your design by controlling the clocking of each memory block:
- Use the read-enable signal to ensure that read operations occur only when necessary. If your design does not require read-during-write, you can reduce your power consumption by de-asserting the read-enable signal during write operations, or during the period when no memory operations occur.
- Use the Intel® Quartus® Prime software to automatically place any unused memory blocks in low-power mode to reduce static power.
2.3. Embedded Memory Features
Features | M20K, M10K | MLAB |
---|---|---|
Maximum operating frequency | | |
Capacity per block (including parity bits) | | 640 |
Parity bits | Supported | Supported |
Byte enable | Supported | Supported |
Packed mode | Supported | — |
Address clock enable | Supported | Supported |
Simple dual-port mixed width | Supported | — |
True dual-port mixed width | Supported | — |
FIFO buffer mixed width | Supported | — |
Memory Initialization File (.mif) | Supported | Supported |
Mixed-clock mode | Supported | Supported |
Fully synchronous memory | Supported | Supported |
Asynchronous memory | — | Only for flow-through read memory operations. |
Power-up state | Output ports are cleared. | |
Asynchronous clears | Output registers and output latches | Output registers and output latches |
Write/read operation triggering | Rising clock edges | Rising clock edges |
Same-port read-during-write | (The "don't care" mode applies only for the single-port RAM mode). | Output ports set to "don't care". |
Mixed-port read-during-write | Output ports set to "old data" or "don't care". | Output ports set to "old data", "new data", "don't care", or "constrained don't care". |
ECC support | Soft IP support using the Intel® Quartus® Prime software. Built-in support in x32-wide simple dual-port mode (M20K only). | Soft IP support using the Intel® Quartus® Prime software. |
2.3.1. Embedded Memory Configurations
Memory Block | Depth (bits) | Programmable Width |
---|---|---|
MLAB | 32 | x16, x18, or x20 |
MLAB | 64 (Arria V GZ devices only) | x10 |
M20K | 512 | x40 |
M20K | 1K | x20 |
M20K | 2K | x10 |
M20K | 4K | x5 |
M20K | 8K | x2 |
M20K | 16K | x1 |
M10K | 256 | x40 or x32 |
M10K | 512 | x20 or x16 |
M10K | 1K | x10 or x8 |
M10K | 2K | x5 or x4 |
M10K | 4K | x2 |
M10K | 8K | x1 |
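As a quick cross-check of the table, the sketch below multiplies depth by width for each dedicated-block configuration. The dictionary repeats the table values (taking the wider option where two are listed), and 1 Kb is assumed to be 1,024 bits.

```python
KBIT = 1024

# (depth, width) pairs copied from the configuration table.
CONFIGS = {
    "M20K": [(512, 40), (1024, 20), (2048, 10), (4096, 5), (8192, 2), (16384, 1)],
    "M10K": [(256, 40), (512, 20), (1024, 10), (2048, 5), (4096, 2), (8192, 1)],
}


def used_kbits(depth, width):
    """Bits addressed by one configuration, expressed in Kb."""
    return depth * width / KBIT


for block, configs in CONFIGS.items():
    for depth, width in configs:
        print(block, f"{depth} x {width}:", used_kbits(depth, width), "Kb")
```

The x40, x20, x10, and x5 configurations address the full block capacity, while the x2 and x1 configurations address four fifths of it.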
2.3.2. Mixed-Width Port Configurations
The mixed-width port configuration is supported in the simple dual-port RAM and true dual-port RAM memory modes.
2.3.2.1. M20K Blocks Mixed-Width Configurations
The following table lists the mixed-width configurations of the M20K blocks in the simple dual-port RAM mode.
Read Port (rows) / Write Port (columns) | 16K x 1 | 8K x 2 | 4K x 4 | 4K x 5 | 2K x 8 | 2K x 10 | 1K x 16 | 1K x 20 | 512 x 32 | 512 x 40 |
---|---|---|---|---|---|---|---|---|---|---|
16K x 1 | Yes | Yes | Yes | — | Yes | — | Yes | — | Yes | — |
8K x 2 | Yes | Yes | Yes | — | Yes | — | Yes | — | Yes | — |
4K x 4 | Yes | Yes | Yes | — | Yes | — | Yes | — | Yes | — |
4K x 5 | — | — | — | Yes | — | Yes | — | Yes | — | Yes |
2K x 8 | Yes | Yes | Yes | — | Yes | — | Yes | — | Yes | — |
2K x 10 | — | — | — | Yes | — | Yes | — | Yes | — | Yes |
1K x 16 | Yes | Yes | Yes | — | Yes | — | Yes | — | Yes | — |
1K x 20 | — | — | — | Yes | — | Yes | — | Yes | — | Yes |
512 x 32 | Yes | Yes | Yes | — | Yes | — | Yes | — | Yes | — |
512 x 40 | — | — | — | Yes | — | Yes | — | Yes | — | Yes |
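The pattern in this table can be summarized as a simple rule: the widths x1, x2, x4, x8, x16, and x32 combine only with each other, and the widths x5, x10, x20, and x40 combine only with each other. The sketch below encodes that observation. The rule is read off the table rather than taken from a stated specification, and the names are illustrative.

```python
WIDTHS_POWER_OF_TWO = {1, 2, 4, 8, 16, 32}
WIDTHS_MULTIPLE_OF_FIVE = {5, 10, 20, 40}


def widths_compatible(read_width, write_width):
    """Return True if the table marks this read/write width pair "Yes"."""
    for group in (WIDTHS_POWER_OF_TWO, WIDTHS_MULTIPLE_OF_FIVE):
        if read_width in group and write_width in group:
            return True
    return False


print(widths_compatible(1, 32))
print(widths_compatible(5, 40))
print(widths_compatible(4, 5))
```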
The following table lists the mixed-width configurations of the M20K blocks in true dual-port mode.
Port A (rows) / Port B (columns) | 16K x 1 | 8K x 2 | 4K x 4 | 4K x 5 | 2K x 8 | 2K x 10 | 1K x 16 | 1K x 20 |
---|---|---|---|---|---|---|---|---|
16K x 1 | Yes | Yes | Yes | — | Yes | — | Yes | — |
8K x 2 | Yes | Yes | Yes | — | Yes | — | Yes | — |
4K x 4 | Yes | Yes | Yes | — | Yes | — | Yes | — |
4K x 5 | — | — | — | Yes | — | Yes | — | Yes |
2K x 8 | Yes | Yes | Yes | — | Yes | — | Yes | — |
2K x 10 | — | — | — | Yes | — | Yes | — | Yes |
1K x 16 | Yes | Yes | Yes | — | Yes | — | Yes | — |
1K x 20 | — | — | — | Yes | — | Yes | — | Yes |
2.3.2.2. M10K Blocks Mixed-Width Configurations
Read Port (rows) / Write Port (columns) | 8K x 1 | 4K x 2 | 2K x 4 | 2K x 5 | 1K x 8 | 1K x 10 | 512 x 16 | 512 x 20 | 256 x 32 | 256 x 40 |
---|---|---|---|---|---|---|---|---|---|---|
8K x 1 | Yes | Yes | Yes | — | Yes | — | Yes | — | Yes | — |
4K x 2 | Yes | Yes | Yes | — | Yes | — | Yes | — | Yes | — |
2K x 4 | Yes | Yes | Yes | — | Yes | — | Yes | — | Yes | — |
2K x 5 | — | — | — | Yes | — | Yes | — | Yes | — | Yes |
1K x 8 | Yes | Yes | Yes | — | Yes | — | Yes | — | Yes | — |
1K x 10 | — | — | — | Yes | — | Yes | — | Yes | — | Yes |
512 x 16 | Yes | Yes | Yes | — | Yes | — | Yes | — | Yes | — |
512 x 20 | — | — | — | Yes | — | Yes | — | Yes | — | Yes |
256 x 32 | Yes | Yes | Yes | — | Yes | — | Yes | — | Yes | — |
256 x 40 | — | — | — | Yes | — | Yes | — | Yes | — | Yes |
Port B (rows) / Port A (columns) | 8K x 1 | 4K x 2 | 2K x 4 | 2K x 5 | 1K x 8 | 1K x 10 | 512 x 16 | 512 x 20 |
---|---|---|---|---|---|---|---|---|
8K x 1 | Yes | Yes | Yes | — | Yes | — | Yes | — |
4K x 2 | Yes | Yes | Yes | — | Yes | — | Yes | — |
2K x 4 | Yes | Yes | Yes | — | Yes | — | Yes | — |
2K x 5 | — | — | — | Yes | — | Yes | — | Yes |
1K x 8 | Yes | Yes | Yes | — | Yes | — | Yes | — |
1K x 10 | — | — | — | Yes | — | Yes | — | Yes |
512 x 16 | Yes | Yes | Yes | — | Yes | — | Yes | — |
512 x 20 | — | — | — | Yes | — | Yes | — | Yes |
2.4. Embedded Memory Modes
Memory Mode | M20K and M10K Support | MLAB Support | Description |
---|---|---|---|
Single-port RAM | Yes | Yes | You can perform only one read or one write operation at a time. Use the read enable port to control the RAM output ports behavior during a write operation: |
Simple dual-port RAM | Yes | Yes | You can simultaneously perform one read and one write operation to different locations, where the write operation happens on port A and the read operation happens on port B. |
True dual-port RAM | Yes | — | You can perform any combination of two port operations: two reads, two writes, or one read and one write at two different clock frequencies. |
Shift-register | Yes | Yes | You can use the memory blocks as a shift-register block to save logic cells and routing resources. This is useful in DSP applications that require local data storage such as finite impulse response (FIR) filters, pseudo-random number generators, multi-channel filtering, and auto- and cross-correlation functions. Traditionally, the local data storage is implemented with standard flip-flops that exhaust many logic cells for large shift registers. The input data width (w), the length of the taps (m), and the number of taps (n) determine the size of a shift register (w × m × n). You can cascade memory blocks to implement larger shift registers. |
ROM | Yes | Yes | You can use the memory blocks as ROM. |
FIFO | Yes | Yes | You can use the memory blocks as FIFO buffers. Use the SCFIFO and DCFIFO IP cores to implement single- and dual-clock asynchronous FIFO buffers in your design. For designs with many small and shallow FIFO buffers, the MLABs are ideal for the FIFO mode. However, the MLABs do not support mixed-width FIFO mode. |
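The shift-register size formula in the table (w × m × n) can be turned into a rough sizing estimate. In the sketch below the block count is only a lower bound based on raw capacity, since it ignores the depth and width configurations a block actually supports. The function names are hypothetical.

```python
import math


def shift_register_bits(w, m, n):
    """Size of a shift register: data width x tap length x number of taps."""
    return w * m * n


def min_blocks(w, m, n, block_bits):
    """Lower bound on the memory blocks needed, by capacity alone."""
    return math.ceil(shift_register_bits(w, m, n) / block_bits)


# Example: 8-bit data, taps 64 stages apart, 4 taps.
print(shift_register_bits(8, 64, 4))
print(min_blocks(8, 64, 4, block_bits=640))          # MLABs
print(min_blocks(8, 64, 4, block_bits=10 * 1024))    # M10K blocks
```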
2.5. Embedded Memory Clocking Modes
This section describes the clocking modes for the Arria® V memory blocks.
2.5.1. Clocking Modes for Each Memory Mode
Clocking Mode (rows) / Memory Mode (columns) | Single-Port | Simple Dual-Port | True Dual-Port | ROM | FIFO |
---|---|---|---|---|---|
Single clock mode | Yes | Yes | Yes | Yes | Yes |
Read/write clock mode | — | Yes | — | — | Yes |
Input/output clock mode | Yes | Yes | Yes | Yes | — |
Independent clock mode | — | — | Yes | Yes | — |
2.5.1.1. Single Clock Mode
In the single clock mode, a single clock, together with a clock enable, controls all registers of the memory block.
2.5.1.2. Read/Write Clock Mode
In the read/write clock mode, a separate clock is available for each read and write port. A read clock controls the data-output, read-address, and read-enable registers. A write clock controls the data-input, write-address, write-enable, and byte enable registers.
2.5.1.3. Input/Output Clock Mode
In input/output clock mode, a separate clock is available for each input and output port. An input clock controls all registers related to the data input to the memory block including data, address, byte enables, read enables, and write enables. An output clock controls the data output registers.
2.5.1.4. Independent Clock Mode
In the independent clock mode, a separate clock is available for each port (A and B). Clock A controls all registers on the port A side; clock B controls all registers on the port B side.
2.5.2. Asynchronous Clears in Clocking Modes
In all clocking modes, asynchronous clears are available only for output latches and output registers. For the independent clock mode, this is applicable on both ports.
2.5.3. Output Read Data in Simultaneous Read/Write
If you perform a simultaneous read/write to the same address location using the read/write clock mode, the output read data is unknown. If you require the output read data to be a known value, use single-clock or input/output clock mode and select the appropriate read-during-write behavior in the IP Catalog.
2.5.4. Independent Clock Enables in Clocking Modes
Independent clock enables are supported in the following clocking modes:
- Read/write clock mode—supported for both the read and write clocks.
- Independent clock mode—supported for the registers of both ports.
To save power, you can control the shut down of a particular register using the clock enables.
2.6. Parity Bit in Memory Blocks
M20K , M10K | MLAB |
---|---|
|
|
2.7. Byte Enable in Embedded Memory Blocks
The embedded memory blocks support byte enable controls:
- The byte enable controls mask the input data so that only specific bytes of data are written. The unwritten bytes retain the values written previously.
- The write enable (wren) signal, together with the byte enable (byteena) signal, control the write operations on the RAM blocks. By default, the byteena signal is high (enabled) and only the wren signal controls the writing.
- The byte enable registers do not have a clear port.
- If you are using parity bits, on the M20K and M10K blocks, the byte enable function controls 8 data bits and 2 parity bits; on the MLABs, the byte enable function controls all 10 bits in the widest mode.
- The MSB and LSB of the byteena signal correspond to the MSB and LSB of the data bus, respectively.
- The byte enables are active high.
2.7.1. Byte Enable Controls in Memory Blocks
byteena[1:0] | Data Bits Written | |
---|---|---|
11 (default) | [19:10] | [9:0] |
10 | [19:10] | — |
01 | — | [9:0] |
byteena[3:0] | Data Bits Written | |||
---|---|---|---|---|
1111 (default) | [39:30] | [29:20] | [19:10] | [9:0] |
1000 | [39:30] | — | — | — |
0100 | — | [29:20] | — | — |
0010 | — | — | [19:10] | — |
0001 | — | — | — | [9:0] |
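The effect of the byte enable bits in these tables can be modeled in a few lines. The sketch assumes 10-bit byte lanes, as in the tables, and a hypothetical function name. Lanes whose enable bit is 0 keep their previously stored value.

```python
def masked_write(stored, new, byteena, num_bytes, byte_bits=10):
    """Return the word held in memory after a write gated by byteena.

    Bit i of byteena enables byte lane i, so the LSB of byteena
    controls the least significant lane of the data bus.
    """
    lane_mask = (1 << byte_bits) - 1
    result = stored
    for lane in range(num_bytes):
        if (byteena >> lane) & 1:
            shift = lane * byte_bits
            result &= ~(lane_mask << shift)        # clear the enabled lane
            result |= new & (lane_mask << shift)   # copy in the new data
    return result


old_word = 0x00000
new_word = (0x3FF << 10) | 0x155
print(hex(masked_write(old_word, new_word, 0b10, num_bytes=2)))  # bits [19:10] only
print(hex(masked_write(old_word, new_word, 0b01, num_bytes=2)))  # bits [9:0] only
```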
2.7.2. Data Byte Output
In M10K blocks, the corresponding masked data byte output appears as a “don’t care” value.
In M20K blocks or MLABs, when you de-assert a byte-enable bit during a write cycle, the corresponding data byte output appears as either a “don't care” value or the current data at that location. You can control the output value for the masked byte in the M20K blocks or MLABs by using the Intel® Quartus® Prime software.
2.7.3. RAM Blocks Operations
2.8. Memory Blocks Packed Mode Support
The M20K and M10K memory blocks support packed mode.
The packed mode feature packs two independent single-port RAM blocks into one memory block. The Intel® Quartus® Prime software automatically implements packed mode where appropriate by placing the physical RAM block in true dual-port mode and using the MSB of the address to distinguish between the two logical RAM blocks. The size of each independent single-port RAM must not exceed half of the target block size.
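The address mapping used by packed mode can be pictured as follows. The software performs this mapping automatically; the sketch, with its hypothetical function name and parameters, only illustrates how the MSB of the physical address selects one of the two logical RAMs.

```python
def packed_address(ram_select, address, logical_addr_bits):
    """Map an address in logical RAM 0 or 1 to a physical block address.

    ram_select (0 or 1) becomes the MSB of the physical address, and
    the logical address fills the remaining lower bits.
    """
    if ram_select not in (0, 1):
        raise ValueError("ram_select must be 0 or 1")
    if not 0 <= address < (1 << logical_addr_bits):
        raise ValueError("address out of range for the logical RAM")
    return (ram_select << logical_addr_bits) | address


# Two 128-word logical RAMs sharing one 256-word physical block.
print(packed_address(0, 5, logical_addr_bits=7))
print(packed_address(1, 5, logical_addr_bits=7))
```

This also shows why each logical RAM can be at most half of the block: one address bit is spent on selecting between them.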
2.9. Memory Blocks Address Clock Enable Support
The embedded memory blocks support address clock enable, which holds the previous address value for as long as the signal is enabled (addressstall = 1). When the memory blocks are configured in dual-port mode, each port has its own independent address clock enable. The default value for the address clock enable signal is low (disabled).
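A minimal behavioral model of the address register makes the hold behavior concrete. The class name and the initial value of 0 are assumptions made for this sketch.

```python
class AddressRegister:
    """Address register with an address clock enable (addressstall)."""

    def __init__(self):
        self.value = 0  # assumed initial value for the sketch

    def clock(self, address, addressstall=0):
        """Model one rising clock edge and return the registered address.

        With addressstall = 1 the previous address is held. With the
        default of 0 the new address is captured.
        """
        if not addressstall:
            self.value = address
        return self.value


reg = AddressRegister()
print(reg.clock(4))                   # captures 4
print(reg.clock(9, addressstall=1))   # holds 4
print(reg.clock(9))                   # captures 9
```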
2.10. Memory Blocks Error Correction Code Support
ECC allows you to detect and correct data errors at the output of the memory. ECC can perform single-error correction, double-adjacent-error correction, and triple-adjacent-error detection in a 32-bit word. However, ECC cannot detect four or more errors.
The M20K blocks have built-in support for ECC when in x32-wide simple dual-port mode:
- The M20K runs slower than non-ECC simple-dual port mode when ECC is engaged. However, you can enable optional ECC pipeline registers before the output decoder to achieve the same performance as non-ECC simple-dual port mode at the expense of one cycle of latency.
- The M20K ECC status is communicated with two ECC status flag signals—e (error) and ue (uncorrectable error). The status flags are part of the regular output from the memory block. When ECC is engaged, you cannot access two of the parity bits because the ECC status flag replaces them.
2.10.1. Error Correction Code Truth Table
e (error) eccstatus[1] | ue (uncorrectable error) eccstatus[0] | Status |
---|---|---|
0 | 0 | No error. |
0 | 1 | Illegal. |
1 | 0 | A correctable error occurred and the error has been corrected at the outputs; however, the memory array has not been updated. |
1 | 1 | An uncorrectable error occurred and uncorrectable data appears at the outputs. |
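Downstream logic or a testbench might decode the two status flags as in the sketch below. The function name and the returned strings are illustrative. The bit positions follow the truth table.

```python
def decode_eccstatus(eccstatus):
    """Decode the 2-bit ECC status: bit 1 is e, bit 0 is ue."""
    e = (eccstatus >> 1) & 1
    ue = eccstatus & 1
    status = {
        (0, 0): "no error",
        (0, 1): "illegal",
        (1, 0): "correctable error, corrected at outputs",
        (1, 1): "uncorrectable error",
    }
    return status[(e, ue)]


for value in range(4):
    print(format(value, "02b"), decode_eccstatus(value))
```

Because a corrected error leaves the memory array unchanged, a design that wants to clear the error would need to write the corrected word back itself.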
If you engage ECC:
- You cannot use the byte enable feature.
- Read-during-write old data mode is not supported.
2.11. Embedded Memory Blocks in Arria V Devices Revision History
Date | Version | Changes |
---|---|---|
December 2015 | 2015.12.21 | Changed instances of Quartus II to Quartus Prime. |
January 2015 | 2015.01.23 | |
June 2014 | 2014.06.30 | Added information about MLAB memory blocks support for simultaneous read/write operations. MLAB memory blocks only support simultaneous read/write operations when operating in single clock mode. |
May 2013 | 2013.05.06 | |
November 2012 | 2012.11.19 | |
June 2012 | 2.0 | |
November 2011 | 1.1 | |
May 2011 | 1.0 | Initial release. |
3. Variable Precision DSP Blocks in Arria V Devices
This chapter describes how the variable-precision digital signal processing (DSP) blocks in Arria® V devices are optimized to support higher bit precision in high-performance DSP applications.
3.1. Features
The Arria® V variable precision DSP blocks offer the following features:
- High-performance, power-optimized, and fully registered multiplication operations
- 9-bit, 18-bit, 27-bit, and 36-bit word lengths
- 18 x 19 and 18 x 25 complex multiplications
- Built-in addition, subtraction, and 64-bit accumulation unit to combine multiplication results
- Cascading 19-bit or 27-bit to form the tap-delay line for filtering applications
- Cascading 64-bit output bus to propagate output results from one block to the next block without external logic support
- Hard pre-adder supported in 18-bit, 19-bit, and 27-bit mode for symmetric filters
- Internal coefficient register bank for filter implementation
- 18-bit and 27-bit systolic finite impulse response (FIR) filters with distributed output adder
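The reason a pre-adder helps with symmetric filters is that taps sharing a coefficient can be added before the multiplication. The sketch below shows the arithmetic for one output sample of an even-length symmetric FIR filter. It is a numerical illustration with hypothetical names, not a model of the DSP block itself.

```python
def symmetric_fir_output(window, half_coeffs):
    """One output sample of an even-length symmetric FIR filter.

    window holds the most recent 2 * len(half_coeffs) samples. Samples
    that share a coefficient are added first (the pre-adder step), so
    each pair costs one multiplication instead of two.
    """
    taps = len(window)
    if taps != 2 * len(half_coeffs):
        raise ValueError("window must hold two samples per coefficient")
    return sum(
        coeff * (window[i] + window[taps - 1 - i])
        for i, coeff in enumerate(half_coeffs)
    )


# Coefficients [10, 20, 20, 10] applied to samples [1, 2, 3, 4].
print(symmetric_fir_output([1, 2, 3, 4], [10, 20]))
```

The four-tap example uses two multiplications, where a direct implementation would use four.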
3.2. Supported Operational Modes in Arria V Devices
Variable-Precision DSP Block Resource | Operation Mode | Supported Instance | Pre-Adder Support | Coefficient Support | Input Cascade Support | Chainout Support |
---|---|---|---|---|---|---|
1 variable precision DSP block | Independent 9 x 9 multiplication | 3 | No | No | No | No |
1 variable precision DSP block | Independent 18 x 18 multiplication | 2 | Yes | Yes | Yes | No |
1 variable precision DSP block | Independent 18 x 19 multiplication | 2 | Yes | Yes | Yes | No |
1 variable precision DSP block | Independent 18 x 25 multiplication | 1 | Yes | Yes | Yes | Yes |
1 variable precision DSP block | Independent 20 x 24 multiplication | 1 | Yes | Yes | Yes | Yes |
1 variable precision DSP block | Independent 27 x 27 multiplication | 1 | Yes | Yes | Yes | Yes |
1 variable precision DSP block | Two 18 x 19 multiplier adder mode | 1 | Yes | Yes | Yes | Yes |
1 variable precision DSP block | 18 x 18 multiplier adder summed with 36-bit input | 1 | Yes | No | No | Yes |
2 variable precision DSP blocks | Complex 18 x 19 multiplication | 1 | No | No | Yes | No |
Variable Precision DSP Block Resources | Operational Mode | Supported Instance | Pre-adder Support | Coefficient Support | Input Cascade Support | Chainout Support |
---|---|---|---|---|---|---|
1 variable precision DSP block | Independent 9 x 9 multiplication | 3 | No | No | No | No |
Independent 16 x 16 multiplication | 2 | Yes | Yes | Yes | No | |
Independent 18 x 18 partial multiplication (32-bit) | 2 | Yes | Yes | Yes | No | |
Independent 18 x 18 multiplication | 1 | Yes | Yes | Yes | No | |
Independent 27 x 27 multiplication | 1 | Yes | Yes | Yes | Yes | |
Independent 36 x 18 multiplication | 1 | No | Yes | No | Yes | |
Two 18 x 18 multiplier adder | 1 | Yes | Yes | Yes | Yes | |
Two 16 x 16 multiplier adder | 1 | Yes | Yes | Yes | Yes | |
Sum of 2 square | 1 | Yes 4 | No | No | Yes | |
18 x 18 multiplication summed with 36-bit input | 1 | No | No | No | Yes | |
2 variable precision DSP blocks | Independent 18 x 18 multiplication | 3 | No | No | No | No |
Independent 36 x 36 multiplication | 1 | No | No | No | No | |
Complex 18 x 18 multiplication | 1 | Yes | Yes | Yes | Yes | |
Four 18 x 18 multiplier adder | 1 | Yes | Yes | Yes | No | |
Two 27 x 27 multiplier adder | 1 | Yes | Yes | Yes | No | |
Two 18 x 36 multiplier adder | 1 | No | Yes | No | No | |
3 variable precision DSP blocks | Complex 18 x 25 multiplication | 1 | Yes4 | No | No | No |
4 variable precision DSP blocks | Complex 27 x 27 multiplication | 1 | Yes | Yes | Yes | No |
3.3. Resources
The 9 x 9, 18 x 18, 27 x 27, and 36 x 36 multiplier columns list the independent input and output multiplication operators.
Variant | Member Code | Variable-precision DSP Block | 9 x 9 Multiplier | 18 x 18 Multiplier | 27 x 27 Multiplier | 36 x 36 Multiplier | 18 x 18 Multiplier Adder Mode | 18 x 18 Multiplier Adder Summed with 36 bit Input |
---|---|---|---|---|---|---|---|---|
Arria V GX | A1 | 240 | 720 | 480 | 240 | — | 240 | 240 |
Arria V GX | A3 | 396 | 1,188 | 792 | 396 | — | 396 | 396 |
Arria V GX | A5 | 600 | 1,800 | 1,200 | 600 | — | 600 | 600 |
Arria V GX | A7 | 800 | 2,400 | 1,600 | 800 | — | 800 | 800 |
Arria V GX | B1 | 920 | 2,760 | 1,840 | 920 | — | 920 | 920 |
Arria V GX | B3 | 1,045 | 3,135 | 2,090 | 1,045 | — | 1,045 | 1,045 |
Arria V GX | B5 | 1,092 | 3,276 | 2,184 | 1,092 | — | 1,092 | 1,092 |
Arria V GX | B7 | 1,156 | 3,468 | 2,312 | 1,156 | — | 1,156 | 1,156 |
Arria V GT | C3 | 396 | 1,188 | 792 | 396 | — | 396 | 396 |
Arria V GT | C7 | 800 | 2,400 | 1,600 | 800 | — | 800 | 800 |
Arria V GT | D3 | 1,045 | 3,135 | 2,090 | 1,045 | — | 1,045 | 1,045 |
Arria V GT | D7 | 1,156 | 3,468 | 2,312 | 1,156 | — | 1,156 | 1,156 |
Arria V GZ | E1 | 800 | 2,400 | 1,600 | 800 | 400 | 800 | 800 |
Arria V GZ | E3 | 1,044 | 3,132 | 2,088 | 1,044 | 522 | 1,044 | 1,044 |
Arria V GZ | E5 | 1,092 | 3,276 | 2,184 | 1,092 | 546 | 1,092 | 1,092 |
Arria V GZ | E7 | 1,139 | 3,417 | 2,278 | 1,139 | 569 | 1,139 | 1,139 |
Arria V SX | B3 | 809 | 2,427 | 1,618 | 809 | — | 809 | 809 |
Arria V SX | B5 | 1,090 | 3,270 | 2,180 | 1,090 | — | 1,090 | 1,090 |
Arria V ST | D3 | 809 | 2,427 | 1,618 | 809 | — | 809 | 809 |
Arria V ST | D5 | 1,090 | 3,270 | 2,180 | 1,090 | — | 1,090 | 1,090 |
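The multiplier counts in the resources table appear to follow fixed ratios to the number of variable-precision DSP blocks: three 9 x 9, two 18 x 18, and one 27 x 27 multiplier per block, plus one 36 x 36 multiplier per pair of blocks on Arria V GZ devices. The following Python sketch reproduces table rows under that assumption. The function name and dictionary keys are illustrative only and are not part of any Intel tool.

```python
def multiplier_resources(dsp_blocks, is_gz=False):
    """Derive per-mode multiplier counts from a DSP block count.

    Assumes the per-block ratios visible in the resources table:
    3 x (9 x 9), 2 x (18 x 18), and 1 x (27 x 27) per block, and one
    36 x 36 multiplier per two blocks on Arria V GZ devices only.
    """
    return {
        "9x9": 3 * dsp_blocks,
        "18x18": 2 * dsp_blocks,
        "27x27": dsp_blocks,
        # Two adjacent blocks form one 36 x 36 multiplier (GZ only).
        "36x36": dsp_blocks // 2 if is_gz else None,
        "18x18_multiplier_adder": dsp_blocks,
        "18x18_adder_summed_with_36bit": dsp_blocks,
    }


print(multiplier_resources(240))               # Arria V GX A1 row
print(multiplier_resources(1139, is_gz=True))  # Arria V GZ E7 row
```

Treat the table as the authoritative source; the sketch is only a quick consistency check.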
3.4. Design Considerations
You should consider the following elements in your design:
- Operational modes
- Internal coefficient and pre-adder
- Accumulator
- Chainout adder
3.4.1. Operational Modes
The Intel® Quartus® Prime software includes IP cores that you can use to control the operation mode of the multipliers. After entering the parameter settings with the IP Catalog, the Intel® Quartus® Prime software automatically configures the variable precision DSP block.
Altera provides two methods for implementing various modes of the Arria® V variable precision DSP block in a design—using the Intel® Quartus® Prime DSP IP cores and HDL inferring.
The following Intel® Quartus® Prime IP cores are supported for the Arria® V variable precision DSP blocks implementation:
- LPM_MULT
- ALTERA_MULT_ADD
- ALTMULT_COMPLEX
- ALTMEMMULT
3.4.2. Internal Coefficient and Pre-Adder
To use the pre-adder feature, all input data and multipliers must have the same clock setting.
The input cascade support is not available when you enable the pre-adder feature.
Mode | Arria® V GX, GT, SX, and ST | Arria® V GZ |
---|---|---|
18-bit | The coefficient feature and pre-adder feature can be used independently. | The coefficient feature must be enabled when the pre-adder feature is enabled. |
27-bit | The coefficient feature and pre-adder feature can be used independently. | The coefficient feature and pre-adder feature can be used independently. With pre-adder enabled: |
3.4.3. Accumulator
The accumulator in the Arria® V GX, GT, SX, and ST devices supports double accumulation by enabling the 64-bit double accumulation registers located between the output register bank and the accumulator.
The double accumulation registers are set statically in the programming file.
The accumulator in the Arria® V GZ devices does not support double accumulation. The accumulator feature is not available in multi-block modes.
3.4.4. Chainout Adder
You can use the output chaining path to add results from other DSP blocks.
3.5. Block Architecture
The Arria® V variable precision DSP block consists of the following elements:
- Input register bank
- Pre-adder
- Internal coefficient
- Multipliers
- Adder
- Accumulator and chainout adder
- Systolic registers
- Double accumulation register
- Output register bank
If the variable precision DSP block is not configured in systolic FIR mode, both systolic registers are bypassed.
3.5.1. Input Register Bank
The input register bank consists of data, dynamic control signals, and two sets of delay registers.
All the registers in the DSP blocks are positive-edge triggered and cleared on power up. Each multiplier operand can feed an input register or a multiplier directly, bypassing the input registers.
The following variable precision DSP block signals control the input registers within the variable precision DSP block:
- CLK[2..0]
- ENA[2..0]
- ACLR[0]
In 18 x 18 and 18 x 19 mode, you can use the delay registers to balance the latency requirements when you use both the input cascade and chainout features.
The tap-delay line feature allows you to drive the top leg of the multiplier inputs from general routing or from the cascade chain. The following inputs can be driven from either the general routing or from the cascade chain:
- For Arria® V GX, GT, SX, and ST devices:
  - dataa_y0 and datab_y1 in 18 x 19 mode
  - dataa_y0 in 27 x 27 mode
- For Arria® V GZ devices:
  - dataa_y0[17..0] and datab_y1[17..0] in 18 x 18 mode
  - dataa_y0 in 27 x 27 mode
The Arria® V GZ variable precision DSP block supports 18-bit and 27-bit input cascading.
These figures show the input registers for Arria® V devices.
3.5.2. Pre-Adder
Arria V GX, GT, SX, and ST Devices
Each variable precision DSP block has two 19-bit pre-adders. You can configure these pre-adders in the following configurations:
- Two independent 19-bit pre-adders
- One 27-bit pre-adder
The pre-adder supports both addition and subtraction in the following input configurations:
- 18-bit (signed) addition or subtraction for 18 x 19 mode
- 17-bit (unsigned) addition or subtraction for 18 x 19 mode
- 26-bit addition or subtraction for 27 x 27 mode
Arria V GZ Devices
Each variable precision DSP block has two 18-bit pre-adders. You can configure these pre-adders in the following configurations:
- Two independent 18-bit adders
- One 26-bit adder
The pre-adder supports both addition and subtraction in the following input configurations:
- 17-bit addition or subtraction for 18-bit applications
- 25-bit addition or subtraction for 27-bit applications
3.5.3. Internal Coefficient
The Arria® V variable precision DSP block has the flexibility of selecting the multiplicand from either the dynamic input or the internal coefficient.
The internal coefficient can support up to eight constant coefficients for the multiplicands in 18-bit and 27-bit modes. When you enable the internal coefficient feature, COEFSELA/COEFSELB are used to control the selection of the coefficient multiplexer.
3.5.4. Multipliers
A single variable precision DSP block can perform many multiplications in parallel, depending on the data width of the multiplier.
There are two multipliers per variable precision DSP block. You can configure these two multipliers in several operational modes.
For Arria® V GX, GT, SX, and ST devices:
- One 27 x 27 multiplier
- Two 18 (signed)/(unsigned) x 19 (signed) multipliers
- Three 9 x 9 multipliers
For Arria® V GZ devices:
- One 27 x 27 multiplier
- Two individual 16 x 16 multipliers
- Two individual 18 x 18 partial multipliers, with only 32-bit LSB multiplication result for each multiplication
- One individual 18 x 18 multiplier, with full 36-bit multiplication result
- One individual 27 x 27 multiplier
- One individual 36 x 18 multiplier
- Three individual 9 x 9 multipliers
For Arria® V GZ devices, you can use two adjacent DSP blocks to construct an individual 36-bit multiplier.
3.5.5. Adder
You can use the adder in various sizes, depending on the operational mode:
- One 64-bit adder with the 64-bit accumulator
- Two 18 x 19 modes—the adder is divided into two 37-bit adders to produce the full 37-bit result of each independent 18 x 19 multiplication
- Three 9 x 9 modes—you can use the adder as three 18-bit adders to produce three 9 x 9 multiplication results independently
3.5.6. Accumulator and Chainout Adder
The Arria® V variable precision DSP block supports a 64-bit accumulator and a 64-bit adder.
For Arria® V GX, GT, SX, and ST devices, the accumulator and chainout adder features are not supported in two independent 18 x 19 modes and three independent 9 x 9 modes.
For Arria® V GZ devices, you can use the 64-bit adder as a full adder.
The following signals can dynamically control the function of the accumulator:
- NEGATE
- LOADCONST
- ACCUMULATE
Function | Description | NEGATE | LOADCONST | ACCUMULATE |
---|---|---|---|---|
Zeroing | Disables the accumulator. | 0 | 0 | 0 |
Preload | Loads an initial value to the accumulator. Only one bit of the 64-bit preload value can be “1”. You can use it to round the DSP result at any position of the 64-bit result. | 0 | 1 | 0 |
Accumulation | Adds the current result to the previous accumulate result. | 0 | X | 1 |
Decimation | This function takes the current result, converts it into two’s complement, and adds it to the previous result. | 1 | X | 1 |
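The control table above can be read as a small decoder plus an update rule. The following Python sketch is a behavioral illustration only, not a description of the silicon. The names `decode_function`, `next_accumulator`, and `preload_bit` are hypothetical. Modeling Zeroing as a cleared accumulator and Preload as loading a one-hot constant are assumptions drawn from the table wording.

```python
MASK64 = (1 << 64) - 1  # the accumulator is 64 bits wide


def decode_function(negate, loadconst, accumulate):
    """Map the three control bits to the function named in the table."""
    if accumulate:
        # LOADCONST is a don't-care (X) when ACCUMULATE is 1.
        return "Decimation" if negate else "Accumulation"
    if negate:
        raise ValueError("NEGATE=1 with ACCUMULATE=0 is not listed in the table")
    return "Preload" if loadconst else "Zeroing"


def next_accumulator(prev, result, negate, loadconst, accumulate, preload_bit=0):
    """Return the next 64-bit accumulator value for one clock step."""
    func = decode_function(negate, loadconst, accumulate)
    if func == "Accumulation":
        return (prev + result) & MASK64
    if func == "Decimation":
        # Adding the two's complement of result is subtraction modulo 2**64.
        return (prev + ((-result) & MASK64)) & MASK64
    if func == "Preload":
        if not 0 <= preload_bit < 64:
            raise ValueError("preload_bit must be in the range 0..63")
        return 1 << preload_bit  # only one bit of the preload value may be 1
    return 0  # Zeroing: modeled as a cleared accumulator (assumption)


print(next_accumulator(10, 3, negate=0, loadconst=0, accumulate=1))
print(next_accumulator(10, 3, negate=1, loadconst=0, accumulate=1))
```

Preloading a single 1 at bit position k - 1 before accumulating is one common way to round the result at bit k, which matches the rounding use mentioned in the Preload row.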
3.5.7. Systolic Registers
There are two systolic registers per variable precision DSP block. If the variable precision DSP block is not configured in systolic FIR mode, both systolic registers are bypassed.
The first set of systolic registers consists of the following registers:
- 18-bit and 19-bit registers that are used to register the 18-bit and 19-bit inputs of the upper multiplier respectively for Arria® V GX, GT, SX, and ST devices
- 18-bit registers that are used to register the 18-bit inputs of the upper multiplier for Arria® V GZ devices
The second set of systolic registers is used to delay the chainout output to the next variable precision DSP block.
You must clock all the systolic registers with the same clock source as the output register bank.
3.5.8. Double Accumulation Register
The double accumulation register is an extra register in the feedback path of the accumulator. Enabling the double accumulation register will cause an extra clock cycle delay in the feedback path of the accumulator.
This register has the same CLK, ENA, and ACLR settings as the output register bank.
By enabling this register, you can have two accumulator channels using the same number of variable precision DSP blocks.
The double accumulation register is not available in Arria® V GZ devices.
3.5.9. Output Register Bank
The 64-bit bypassable output register bank is triggered on the positive edge of the clock signal and is cleared after power up.
The following variable precision DSP block signals control the output register per variable precision DSP block:
- CLK[2..0]
- ENA[2..0]
- ACLR[1]
3.6. Operational Mode Descriptions
This section describes how you can configure an Arria® V variable precision DSP block to efficiently support the following operational modes:
- Independent Multiplier Mode
- Independent Complex Multiplier Mode
- Multiplier Adder Sum Mode
- Sum of Square Mode (Arria V GZ only)
- 18 x 18 Multiplication Summed with 36-Bit Input Mode
- Systolic FIR Mode
3.6.1. Independent Multiplier Mode
In independent input and output multiplier mode, the variable precision DSP blocks perform individual multiplication operations for general purpose multipliers.
Configuration | Multipliers per block | Device Variant Support |
---|---|---|
9 x 9 | 3 | All |
16 x 16 | 1 | Arria® V GZ |
18 x 18 (partial) | 1 | Arria® V GZ |
18 x 18 | 1 | Arria® V GZ |
18 (signed) x 18 (unsigned) | 2 | Arria® V GX, GT, SX, ST |
18 (unsigned) x 18 (unsigned) | 2 | Arria® V GX, GT, SX, ST |
18 (signed) x 19 (signed) | 2 | Arria® V GX, GT, SX, ST |
18 (unsigned) x 19 (signed) | 2 | Arria® V GX, GT, SX, ST |
18 x 25 | 1 | Arria® V GX, GT, SX, ST |
20 x 24 | 1 | Arria® V GX, GT, SX, ST |
27 x 27 | 1 | All |
36 x 18 | 1 | Arria® V GZ |
Configuration | Number of DSP Blocks Required | Device Variant Support |
---|---|---|
3 independent 18 x 18 multipliers | 2 | Arria® V GZ |
36 x 36 multiplier | 2 | Arria® V GZ |
3.6.1.1. 9 x 9 Independent Multiplier
3.6.1.2. 18 x 18 Independent Multiplier
3.6.1.3. 18 x 18 or 18 x 19 Independent Multiplier
In this figure, the variables are defined as follows:
- n = 19 and m = 37 for 18 x 19 mode
- n = 18 and m = 36 for 18 x 18 mode
3.6.1.4. 16 x 16 Independent Multiplier or 18 x 18 Independent Partial Multiplier
In this figure, the inputs for 16-bit independent multiplier mode are data[15..0]. The unused input bits require padding with zero.
For two independent 18 x 18 partial multiplier mode, only 32-bit LSB result for each multiplication operation is routed to the output. The output has full precision if the total width of the multiplicand input is less than or equal to 32 bits for each multiplier.
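The effect of keeping only the 32 least significant bits can be illustrated with plain integer arithmetic. This Python sketch assumes unsigned operands and reads the full-precision condition as "the two operand widths add up to 32 bits or fewer". The function name is illustrative.

```python
def partial_multiply_32(a, b):
    """Return only the 32 least significant bits of an unsigned product."""
    return (a * b) & 0xFFFFFFFF


# 16-bit x 16-bit operands: the widths sum to 32, so nothing is lost.
a, b = 0xFFFF, 0xFFFF
assert partial_multiply_32(a, b) == a * b

# 18-bit x 18-bit operands: the full product needs 36 bits,
# so the upper 4 bits are dropped and the result differs.
a, b = 0x3FFFF, 0x3FFFF
assert partial_multiply_32(a, b) != a * b
print(hex(partial_multiply_32(a, b)), hex(a * b))
```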
3.6.1.5. 18 x 25 Independent Multiplier
3.6.1.6. 20 x 24 Independent Multiplier
3.6.1.7. 27 x 27 Independent Multiplier
3.6.1.8. 36 x 18 Independent Multiplier
3.6.1.9. 36-Bit Independent Multiplier
You can efficiently construct an individual 36-bit multiplier with two adjacent variable precision DSP blocks. The 36 x 36 multiplication consists of four 18 x 18 multipliers.
The 36-bit multiplier is useful for applications requiring more than 18-bit precision; for example, for the mantissa multiplication portion of very high precision fixed-point arithmetic applications.
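The decomposition of one 36 x 36 multiplication into four 18 x 18 multiplications can be sketched as follows. This is a minimal arithmetic illustration for unsigned operands. It does not model how the two DSP blocks route or sum the partial products internally, and the function name is illustrative.

```python
MASK18 = (1 << 18) - 1


def multiply_36x36(a, b):
    """Build an unsigned 36 x 36 product from four 18 x 18 partial products."""
    a_hi, a_lo = a >> 18, a & MASK18
    b_hi, b_lo = b >> 18, b & MASK18
    # a * b = a_hi*b_hi * 2**36 + (a_hi*b_lo + a_lo*b_hi) * 2**18 + a_lo*b_lo
    return ((a_hi * b_hi) << 36) + ((a_hi * b_lo + a_lo * b_hi) << 18) + a_lo * b_lo


a = (1 << 36) - 1      # largest 36-bit value
b = 0xABCDE1234        # arbitrary 36-bit value
assert multiply_36x36(a, b) == a * b
```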
3.6.2. Independent Complex Multiplier Mode
The Arria® V variable precision DSP block provides the means for a complex multiplication.
Configuration | Number of DSP Blocks Required | Device Variant Support |
---|---|---|
18 x 18 | 2 | Arria® V GZ |
18 x 19 | 2 | Arria® V GX, GT, SX, ST |
18 x 25 | 3 | Arria® V GZ |
27 x 27 | 4 | Arria® V GZ |
3.6.2.1. 18 x 18 Complex Multiplier
For 18 x 18 complex multiplication mode, you require two variable precision DSP blocks to perform this multiplication.
You can implement the imaginary part [(a × d) + (b × c)] in the first variable precision DSP block, and you can implement the real part [(a × c) – (b × d)] in the second variable precision DSP block.
3.6.2.2. 18 x 19 Complex Multiplier
For 18 x 19 complex multiplication mode, you require two variable precision DSP blocks to perform this multiplication.
The imaginary part [(a × d) + (b × c)] is implemented in the first variable precision DSP block, while the real part [(a × c) - (b × d)] is implemented in the second variable precision DSP block.
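The two bracketed expressions are the imaginary and real parts of the product (a + jb) × (c + jd). A minimal Python sketch of that arithmetic, using four real multiplications, is shown below. It illustrates the math only, not the block-level mapping, and the function name is illustrative.

```python
def complex_multiply(a, b, c, d):
    """Return (real, imag) of (a + jb) * (c + jd) using four multiplications."""
    real = a * c - b * d  # real part in the text
    imag = a * d + b * c  # imaginary part in the text
    return real, imag


real, imag = complex_multiply(3, 4, 5, -6)
# Cross-check against Python's built-in complex type.
assert complex(real, imag) == complex(3, 4) * complex(5, -6)
```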
3.6.2.3. 18 x 25 Complex Multiplier
Arria® V GZ devices support an individual 18 x 25 complex multiplication mode.
A 27 x 27 multiplier allows you to implement an individual 18 x 25 complex multiplication mode with three variable precision DSP blocks only. The pre-adder feature is automatically enabled for you to implement an individual 18 x 25 complex multiplication mode efficiently.
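The handbook text does not spell out the arithmetic behind the three-block implementation. One well-known way to compute a complex product with three multiplications, at the cost of extra additions before the multipliers, is sketched below in Python. Whether the device uses exactly this factorization is an assumption, and the function name is illustrative.

```python
def complex_multiply_3(a, b, c, d):
    """Return (real, imag) of (a + jb) * (c + jd) using three multiplications.

    The sums (a + b), (d - c), and (c + d) play the role of pre-adder
    outputs. This particular factorization is an assumption, not taken
    from the handbook.
    """
    k1 = c * (a + b)
    k2 = a * (d - c)
    k3 = b * (c + d)
    real = k1 - k3  # equals a*c - b*d
    imag = k1 + k2  # equals a*d + b*c
    return real, imag


real, imag = complex_multiply_3(3, 4, 5, -6)
assert complex(real, imag) == complex(3, 4) * complex(5, -6)
```

Because each multiplication takes a sum as one operand, that operand is one bit wider than its inputs. This is consistent with a wider 27 x 27 multiplier being used for 18 x 25 operands.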
3.6.2.4. 27 x 27 Complex Multiplier
Arria® V GZ devices support an individual 27 x 27 complex multiplication mode. You require four variable precision DSP blocks to implement an individual 27 x 27 complex multiplication mode.
You can implement the imaginary part [(a x d) + (b x c)] in the first and second variable precision DSP blocks, and you can implement the real part [(a x c) - (b x d)] in the third and fourth variable precision DSP blocks.
You can achieve the difference of two 27 x 27 multiplications by enabling the NEGATE control signal in the fourth variable precision DSP block.
3.6.3. Multiplier Adder Sum Mode
Mode | Configuration | Number of DSP Blocks Required | Device Variant Support |
---|---|---|---|
Two-multiplier Adder Sum | 16 x 16 | 1 | Arria® V GZ |
Two-multiplier Adder Sum | 18 x 18 | 1 | Arria® V GZ |
Two-multiplier Adder Sum | 18 x 19 | 1 | Arria® V GX, GT, SX, ST |
Two-multiplier Adder Sum | 27 x 27 | 2 | Arria® V GZ |
Two-multiplier Adder Sum | 36 x 18 | 2 | Arria® V GZ |
Four-multiplier Adder Sum | 18 x 18 | 2 | Arria® V GZ |
3.6.3.1. One Sum of Two 18 x 18 Multipliers or Two 16 x 16 Multipliers
In this figure, for 18-bit multiplier adder sum mode, the input data width is 18 bits and the output data width is 37 bits.
For 16-bit multiplier adder sum mode, the input data width is 16 bits and the unused input bits require padding with zeros. The output data width is 33 bits.
3.6.3.2. One Sum of Two 18 x 19 Multipliers
3.6.3.3. One Sum of Two 27 x 27 Multipliers
3.6.3.4. One Sum of Two 36 x 18 Multipliers
3.6.3.5. One Sum of Four 18 x 18 Multipliers
3.6.4. Sum of Square Mode
The Arria® V variable precision DSP block can implement one sum of square mode.
You can feed the four 18-bit inputs into the pre-adder block to convert b and d input as two’s complement numbers to perform subtraction, if required.
You can feed each 18-bit pre-adder block output into both multiplicand and multiplier inputs of an 18 x 18 multiplier to generate a square result.
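The data flow described above amounts to (a ± b)² + (c ± d)². The following Python sketch shows that arithmetic. It assumes, for simplicity, that both pre-adders use the same add or subtract setting, and the function and parameter names are illustrative.

```python
def sum_of_squares(a, b, c, d, subtract=False):
    """Compute (a +/- b)**2 + (c +/- d)**2, mirroring the pre-adder data flow."""
    sign = -1 if subtract else 1
    x = a + sign * b  # first pre-adder output
    y = c + sign * d  # second pre-adder output
    # Each pre-adder output feeds both inputs of one multiplier.
    return x * x + y * y


print(sum_of_squares(3, 1, 2, 2))                 # (3 + 1)**2 + (2 + 2)**2
print(sum_of_squares(3, 1, 2, 2, subtract=True))  # (3 - 1)**2 + (2 - 2)**2
```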
3.6.5. 18 x 18 Multiplication Summed with 36-Bit Input Mode
Arria® V variable precision DSP blocks support one 18 x 18 multiplication summed to a 36-bit input.
Use the upper multiplier to provide the input for an 18 x 18 multiplication, while the bottom multiplier is bypassed.
The following signals are concatenated to produce a 36-bit input:
- Arria® V GX, GT, SX, and ST devices: datab_y1[17..0] and datab_y1[35..18]
- Arria® V GZ devices: data1[17..0] and data1[35..18]
3.6.6. Systolic FIR Mode
The basic structure of a FIR filter consists of a series of multiplications followed by an addition.
Depending on the number of taps and the input sizes, the delay through chaining a high number of adders can become quite large. To overcome the delay performance issue, the systolic form is used with additional delay elements placed per tap to increase the performance at the cost of increased latency.
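As a reference point, the basic (direct-form) FIR structure can be sketched in a few lines of Python: each output is the sum of coefficient-times-sample products along a tap-delay line. This sketch shows only the direct form. The systolic form inserts extra registers per tap so that, under the usual interpretation, the same outputs appear with additional latency. The function name is illustrative.

```python
def fir_direct(samples, coeffs):
    """Direct-form FIR: y[n] = sum(coeffs[k] * x[n - k] for each tap k)."""
    taps = [0] * len(coeffs)  # tap-delay line, newest sample first
    outputs = []
    for x in samples:
        taps = [x] + taps[:-1]  # shift the new sample into the delay line
        # A series of multiplications followed by an addition.
        outputs.append(sum(c * t for c, t in zip(coeffs, taps)))
    return outputs


print(fir_direct([1, 0, 0, 0], [1, 2, 3]))  # impulse response
```

Feeding a unit impulse returns the coefficients themselves, which is a quick sanity check for any FIR implementation.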
Arria® V variable precision DSP blocks support the following systolic FIR structures:
- 18-bit
- 27-bit
In systolic FIR mode, the input of the multiplier can come from four different sets of sources:
- Two dynamic inputs
- One dynamic input and one coefficient input
- One coefficient input and one pre-adder output
- One dynamic input and one pre-adder output (for Arria® V GX, GT, SX, and ST devices only)
3.6.6.1. 18-Bit Systolic FIR Mode
In 18-bit systolic FIR mode, the adders are configured as dual 44-bit adders, thereby giving 8 bits of overhead when using an 18-bit operation (36-bit products). This allows a total of 256 multiplier products.
3.6.6.2. 27-Bit Systolic FIR Mode
In 27-bit systolic FIR mode, the chainout adder or accumulator is configured for a 64-bit operation, providing 10 bits of overhead when using a 27-bit data (54-bit products). This allows a total of 1,024 multiplier products.
The 27-bit systolic FIR mode allows the implementation of one stage systolic filter per DSP block.
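The overhead figures in the two systolic FIR modes follow from the difference between the adder width and the product width: n bits of overhead allow 2^n full-scale products to be summed without overflow. A small Python check of the numbers quoted above (the function name is illustrative):

```python
def adder_headroom(adder_bits, product_bits):
    """Return (overhead bits, number of products that can be summed)."""
    overhead = adder_bits - product_bits
    return overhead, 1 << overhead


print(adder_headroom(44, 36))  # 18-bit systolic FIR mode
print(adder_headroom(64, 54))  # 27-bit systolic FIR mode
```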
3.7. Variable Precision DSP Blocks in Arria V Devices Revision History
Date | Version | Changes |
---|---|---|
December 2015 | 2015.12.21 | Changed instances of Quartus II to Quartus Prime. |
June 2015 | 2015.06.12 | |
May 2015 | 2015.05.08 | Added footnote in Features section to clarify certain features are only applicable to certain Arria V device variants. |
June 2014 | 2014.06.30 | Updated the supported megafunctions from ALTMULT_ADD and ALTMULT_ACCUM to ALTERA_MULT_ADD. |
May 2013 | 2013.05.06 | |
November 2012 | 2012.11.29 | |
June 2012 | 2.0 | Updated for the Quartus II software v12.0 release: |
May 2011 | 1.0 | Initial release. |
4. Clock Networks and PLLs in Arria V Devices
This chapter describes the advanced features of hierarchical clock networks and phase-locked loops (PLLs) in Arria® V devices. The Intel® Quartus® Prime software enables the PLLs and their features without external devices.
4.1. Clock Networks
The Arria® V devices contain the following clock networks that are organized into a hierarchical structure:
- Global clock (GCLK) networks
- Regional clock (RCLK) networks
- Periphery clock (PCLK) networks
4.1.1. Clock Resources in Arria V Devices
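Clock Resource | Device | Number of Resources Available | Source of Clock Resource |
---|---|---|---|
Clock input pins | | 40 single-ended or 20 differential | CLK[0..7][p,n] and CLK[12..23][p,n] pins |
Clock input pins | | 40 single-ended or 20 differential | CLK[0..11][p,n] and CLK[16..23][p,n] pins |
Clock input pins | | 48 single-ended or 24 differential | CLK[0..23][p,n] pins |
GCLK and RCLK networks | | 76 | CLK[0..7][p,n] and CLK[12..23][p,n] pins, PLL clock outputs, and logic array |
GCLK and RCLK networks | | 82 | CLK[0..11][p,n] and CLK[16..23][p,n] pins, PLL clock outputs, and logic array |
GCLK and RCLK networks | | 88 | CLK[0..23][p,n] pins, PLL clock outputs, and logic array |
GCLK and RCLK networks | Arria® V GZ E1, E3, E5, and E7 | 92 | CLK[0..23][p,n] pins, PLL clock outputs, and logic array |
PCLK networks | | 120 | DPA clock outputs, PLD-transceiver interface clocks, I/O pins, and logic array |
PCLK networks | | 184 | DPA clock outputs, PLD-transceiver interface clocks, I/O pins, and logic array |
PCLK networks | | 208 | DPA clock outputs, PLD-transceiver interface clocks, I/O pins, and logic array |
PCLK networks | Arria® V GZ E1 and E3 | 210 | DPA clock outputs, PLD-transceiver interface clocks, I/O pins, and logic array |
PCLK networks | | 224 | DPA clock outputs, PLD-transceiver interface clocks, I/O pins, and logic array |
PCLK networks | | 248 | DPA clock outputs, PLD-transceiver interface clocks, I/O pins, and logic array |
PCLK networks | Arria® V GZ E5 and E7 | 282 | DPA clock outputs, PLD-transceiver interface clocks, I/O pins, and logic array |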
For more information about the clock input pins connections, refer to the pin connection guidelines.
4.1.2. Types of Clock Networks
4.1.2.1. Global Clock Networks
Arria® V devices provide GCLKs that can drive throughout the device. The GCLKs serve as low-skew clock sources for functional blocks, such as adaptive logic modules (ALMs), digital signal processing (DSP), embedded memory, and PLLs. Arria® V I/O elements (IOEs) and internal logic can also drive GCLKs to create internally-generated global clocks and other high fan-out control signals, such as synchronous or asynchronous clear and clock enable signals.
4.1.2.2. Regional Clock Networks
RCLK networks are only applicable to the quadrant they drive into. RCLK networks provide the lowest clock insertion delay and skew for logic contained within a single device quadrant. The Arria® V IOEs and internal logic within a given quadrant can also drive RCLKs to create internally generated regional clocks and other high fan-out control signals.
4.1.2.3. Periphery Clock Networks
Depending on the routing direction, Arria® V devices provide vertical PCLKs from the top and bottom periphery, and horizontal PCLKs from the left and right periphery.
Clock outputs from the dynamic phase aligner (DPA) block, programmable logic device (PLD)-transceiver interface clocks, I/O pins, and internal logic can drive the PCLK networks.
PCLKs have higher skew when compared with GCLK and RCLK networks. You can use PCLKs for general purpose routing to drive signals into and out of the Arria® V device.
4.1.3. Clock Sources Per Quadrant
The Arria® V devices provide 30 section clock (SCLK) networks in each spine clock per quadrant. The SCLK networks can drive six row clocks in each logic array block (LAB) row, nine column I/O clocks, and two core reference clocks. The SCLKs are the clock resources to the core functional blocks, PLLs, and I/O interfaces of the device.
A spine clock is another layer of routing between the GCLK, RCLK, and PCLK networks before each clock is connected to the clock routing for each LAB row. The settings for spine clocks are transparent. The Intel® Quartus® Prime software automatically routes the spine clock based on the GCLK, RCLK, and PCLK networks.
The following figure shows SCLKs driven by the GCLK, RCLK, PCLK, or the PLL feedback clock networks in each spine clock per quadrant. The GCLK, RCLK, PCLK, and PLL feedback clocks share the same routing to the SCLKs. To ensure successful design fitting in the Intel® Quartus® Prime software, the total number of clock resources must not exceed the SCLK limits in each region.
4.1.4. Types of Clock Regions
This section describes the types of clock regions in Arria® V devices.
4.1.4.1. Entire Device Clock Region
To form the entire device clock region, a source drives a signal in a GCLK network that can be routed through the entire device. The source is not necessarily a clock signal. This clock region has the maximum insertion delay when compared with other clock regions, but allows the signal to reach every destination in the device. It is a good option for routing global reset and clear signals or routing clocks throughout the device.
4.1.4.2. Regional Clock Region
To form a regional clock region, a source drives a signal in a RCLK network that you can route throughout one quadrant of the device. This clock region provides the lowest skew in a quadrant. It is a good option if all the destinations are in a single quadrant.
4.1.4.3. Dual-Regional Clock Region
To form a dual-regional clock region, a single source (a clock pin or PLL output) generates a dual-regional clock by driving two RCLK networks (one from each quadrant). This technique allows destinations across two adjacent device quadrants to use the same low-skew clock. The routing of this signal on an entire side has approximately the same delay as a RCLK region. Internal logic can also drive a dual-regional clock network.
4.1.5. Clock Network Sources
In Arria® V devices, clock input pins, PLL outputs, high-speed serial interface (HSSI) outputs, DPA outputs, and internal logic can drive the GCLK, RCLK, and PCLK networks.
4.1.5.1. Dedicated Clock Input Pins
You can use the dedicated clock input pins (CLK[0..23][p,n]) for high fan-out control signals, such as asynchronous clears, presets, and clock enables, for protocol signals through the GCLK or RCLK networks.
CLK pins can be either differential clocks or single-ended clocks. When you use the CLK pins as single-ended clock inputs, only the CLK<#>p pins have dedicated connections to the PLL. The CLK<#>n pins drive the PLLs over global or regional clock networks and do not have dedicated routing paths to the PLLs.
Driving a PLL over a global or regional clock can lead to higher jitter at the PLL input, and the PLL will not be able to fully compensate for the global or regional clock. Altera recommends using the CLK<#>p pins for optimal performance when you use single-ended clock inputs to drive the PLLs.
4.1.5.2. Internal Logic
You can drive each GCLK, RCLK, and horizontal PCLK network using LAB-routing and row clock to enable internal logic to drive a high fan-out, low-skew signal.
4.1.5.3. DPA Outputs
Every DPA generates one PCLK to the core.
4.1.5.4. HSSI Outputs
Every three HSSI outputs generate a group of six PCLKs to the core.
4.1.5.5. PLL Clock Outputs
The Arria® V PLL clock outputs can drive both GCLK and RCLK networks.
4.1.5.6. Clock Input Pin Connections to GCLK and RCLK Networks
Clock Resources | CLK (p/n Pins) |
---|---|
GCLK[0,1,2,3] | CLK[0,1,2,3,20,21,22,23] |
GCLK[4,5,6,7] | CLK[4,5,6,7] |
GCLK[8,9,10,11] | CLK[8,9,10,11] 5 and CLK[12,13,14,15] 6 |
GCLK[12,13,14,15] | CLK[16,17,18,19] |
Clock Resources | CLK (p/n Pins) |
---|---|
RCLK[58,59,60,61,62,63,64,68,82,86] | CLK[0] |
RCLK[58,59,60,61,62,63,65,69,83,87] | CLK[1] |
RCLK[58,59,60,61,62,63,66,84] | CLK[2] |
RCLK[58,59,60,61,62,63,67,85] | CLK[3] |
RCLK[20,24,28,30,34,38] | CLK[4] |
RCLK[21,25,29,31,35,39] | CLK[5] |
RCLK[22,26,32,36] | CLK[6] |
RCLK[23,27,33,37] | CLK[7] |
RCLK[52,53,54,55,56,57,70,74,76,80] | CLK[8] 5 |
RCLK[52,53,54,55,56,57,71,75,77,81] | CLK[9] 5 |
RCLK[52,53,54,55,56,57,72,78] | CLK[10] 5 |
RCLK[52,53,54,55,56,57,73,79] | CLK[11] 5 |
RCLK[46,47,48,49,50,51,70,74,76,80] 7 | CLK[12] 6 |
RCLK[46,47,48,49,50,51,71,75,77,81] 7 | CLK[13] 6 |
RCLK[46,47,48,49,50,51,72,78] 7 | CLK[14] 6 |
RCLK[46,47,48,49,50,51,73,79] 7 | CLK[15] 6 |
RCLK[0,4,8,10,14,18] | CLK[16] |
RCLK[1,5,9,11,15,19] | CLK[17] |
RCLK[2,6,12,16] | CLK[18] |
RCLK[3,7,13,17] | CLK[19] |
RCLK[40,41,42,43,44,45,64,68,82,86] | CLK[20] |
RCLK[40,41,42,43,44,45,65,69,83,87] | CLK[21] |
RCLK[40,41,42,43,44,45,66,84] | CLK[22] |
RCLK[40,41,42,43,44,45,67,85] | CLK[23] |
Clock Resources | CLK (p/n Pins) |
---|---|
RCLK[58,59,60,61,62,63,64,68,85,89] | CLK[0] |
RCLK[58,59,60,61,62,63,65,69,86,90] | CLK[1] |
RCLK[58,59,60,61,62,63,66,70,87,91] | CLK[2] |
RCLK[58,59,60,61,62,63,67,88] | CLK[3] |
RCLK[20,24,28,30,34,38] | CLK[4] |
RCLK[21,25,29,31,35,39] | CLK[5] |
RCLK[22,26,32,36] | CLK[6] |
RCLK[23,27,33,37] | CLK[7] |
RCLK[52,53,54,55,56,57,71,75,78,82] | CLK[8] |
RCLK[52,53,54,55,56,57,72,76,79,83] | CLK[9] |
RCLK[52,53,54,55,56,57,73,77,80,84] | CLK[10] |
RCLK[52,53,54,55,56,57,74,81] | CLK[11] |
RCLK[46,47,48,49,50,51,71,75,78,82] | CLK[12] |
RCLK[46,47,48,49,50,51,72,76,79,83] | CLK[13] |
RCLK[46,47,48,49,50,51,73,77,80,84] | CLK[14] |
RCLK[46,47,48,49,50,51,74,81] | CLK[15] |
RCLK[0,4,8,10,14,18] | CLK[16] |
RCLK[1,5,9,11,15,19] | CLK[17] |
RCLK[2,6,12,16] | CLK[18] |
RCLK[3,7,13,17] | CLK[19] |
RCLK[40,41,42,43,44,45,64,68,85,89] | CLK[20] |
RCLK[40,41,42,43,44,45,65,69,86,90] | CLK[21] |
RCLK[40,41,42,43,44,45,66,70,87,91] | CLK[22] |
RCLK[40,41,42,43,44,45,67,88] | CLK[23] |
4.1.6. Clock Output Connections
For Arria® V PLL connectivity to GCLK and RCLK networks, refer to the PLL connectivity to GCLK and RCLK networks spreadsheet.
4.1.7. Clock Control Block
Every GCLK, RCLK, and PCLK network has its own clock control block. The control block provides the following features:
- Clock source selection (dynamic selection available only for GCLKs)
- Global clock multiplexing
- Clock power down (static or dynamic clock enable or disable available only for GCLKs and RCLKs)
4.1.7.1. Pin Mapping in Arria V Devices
Clock | Fed by |
---|---|
inclk[0] and inclk[1] | Any of the four dedicated clock pins on the same side of the Arria® V device. |
inclk[2] | PLL counters C0 and C2 from the two center PLLs on the same side of the Arria® V devices. |
inclk[3] | PLL counters C1 and C3 from the two center PLLs on the same side of the Arria® V devices. |
4.1.7.2. GCLK Control Block
You can select the clock source for the GCLK select block either statically or dynamically using internal logic to drive the multiplexer-select inputs.
When selecting the clock source dynamically, you can select either PLL outputs (such as C0 or C1), or a combination of clock pins or PLL outputs.
4.1.7.3. RCLK Control Block
You can only control the clock source selection for the RCLK select block statically using configuration bit settings in the configuration file (.sof or .pof) generated by the Intel® Quartus® Prime software.
You can set the input clock sources and the clkena signals for the GCLK and RCLK network multiplexers through the Intel® Quartus® Prime software using the ALTCLKCTRL IP core.
4.1.7.4. PCLK Control Block
To drive the HSSI horizontal PCLK control block, select the HSSI output or internal logic.
To drive the DPA vertical PCLK, select the DPA clock output or internal logic. You can only use the DPA clock output to generate the vertical PCLK to the core.
4.1.7.5. External PLL Clock Output Control Block
You can enable or disable the dedicated external clock output pins using the ALTCLKCTRL IP core.
4.1.8. Clock Power Down
You can power down the GCLK and RCLK clock networks using both static and dynamic approaches.
When a clock network is powered down, all the logic fed by the clock network is in off-state, reducing the overall power consumption of the device. The unused GCLK, RCLK, and PCLK networks are automatically powered down through configuration bit settings in the configuration file (.sof or .pof) generated by the Intel® Quartus® Prime software.
The dynamic clock enable or disable feature allows the internal logic to control power-up or power-down synchronously on the GCLK and RCLK networks, including dual-regional clock regions. This feature is independent of the PLL and is applied directly on the clock network.
4.1.9. Clock Enable Signals
You cannot use the clock enable and disable circuit of the clock control block if the GCLK or RCLK output drives the input of a PLL.
The clkena signals are supported at the clock network level instead of at the PLL output counter level. This allows you to gate off the clock even when you are not using a PLL. You can also use the clkena signals to control the dedicated external clocks from the PLLs.
Arria® V devices have an additional metastability register that aids in asynchronous enable and disable of the GCLK and RCLK networks. You can optionally bypass this register in the Intel® Quartus® Prime software.
The PLL can remain locked, independent of the clkena signals, because the loop-related counters are not affected. This feature is useful for applications that require a low-power or sleep mode. The clkena signal can also disable clock outputs if the system is not tolerant of frequency overshoot during resynchronization.
4.2. Arria V PLLs
PLLs provide robust clock management and synthesis for device clock management, external system clock management, and high-speed I/O interfaces.
The Arria® V device family contains fractional PLLs that can function as fractional PLLs or integer PLLs. The output counters in Arria® V devices are dedicated to each fractional PLL that supports integer or fractional frequency synthesis.
Two adjacent PLLs share 18 C output counters. Any number of C counters can be assigned to each PLL, as long as the total number used by the two PLLs is 18 or less.
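The sharing rule above can be checked with a few lines of Python. This is a minimal sketch, not part of any Intel tool; the function name `c_counter_split_is_valid` is hypothetical.

```python
SHARED_C_COUNTERS = 18  # from the text: two adjacent PLLs share 18 C counters


def c_counter_split_is_valid(pll_a_counters: int, pll_b_counters: int) -> bool:
    """Return True if the two adjacent PLLs together use 18 C counters or fewer."""
    if pll_a_counters < 0 or pll_b_counters < 0:
        raise ValueError("counter counts must be non-negative")
    return pll_a_counters + pll_b_counters <= SHARED_C_COUNTERS


print(c_counter_split_is_valid(12, 6))  # True
print(c_counter_split_is_valid(10, 9))  # False
```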
The Arria® V devices offer up to 16 fractional PLLs in the larger densities. All Arria® V fractional PLLs have the same core analog structure and features support.
Feature | Support |
---|---|
Integer PLL | Yes |
Fractional PLL | Yes |
C output counters | 18 |
M, N, C counter sizes | 1 to 512 |
Dedicated external clock outputs | 4 single-ended or 2 single-ended and 1 differential |
Dedicated clock input pins | 4 single-ended or 4 differential |
External feedback input pin | Single-ended or differential |
Spread-spectrum input clock tracking | Yes |
Source synchronous compensation | Yes |
Direct compensation | Yes |
Normal compensation | Yes |
Zero-delay buffer compensation | Yes |
External feedback compensation | Yes |
LVDS compensation | Yes |
Voltage-controlled oscillator (VCO) output drives the DPA clock | Yes |
Phase shift resolution | 78.125 ps |
Programmable duty cycle | Yes |
Power down mode | Yes |
4.2.1. PLL Physical Counters in Arria V Devices
The physical counters for the fractional PLLs are arranged in the following sequences:
- Up-to-down
- Down-to-up
4.2.2. PLL Locations in Arria V Devices
Arria® V devices provide PLLs for the transceiver channels. These PLLs are located in a strip, where the strip refers to an area in the FPGA.
The total number of PLLs in the Arria® V devices includes the PLLs in the PLL strip. However, the transceivers can only use the PLLs located in the strip.
The following figures show the physical locations of the fractional PLLs. Every index represents one fractional PLL in the device. The physical locations of the fractional PLLs correspond to the coordinates in the Intel® Quartus® Prime Chip Planner.
4.2.3. PLL Migration Guidelines
If you plan to migrate your design between Arria® V GX A5, A7, B1, B3, B5, and B7 devices, and Arria® V GT C7, D3, and D7 devices, and your design requires a PLL to drive the HSSI and clock network (GCLK or RCLK), use the PLLs on the left and right side of the device.
Variant | Member Code | PLL Location (Left Side) | PLL Location (Right Side) |
---|---|---|---|
Arria® V GX | A5, A7 | FRACTIONALPLL_X0_Y14, FRACTIONALPLL_X0_Y23 | FRACTIONALPLL_X132_Y14, FRACTIONALPLL_X132_Y23 |
Arria® V GX | B1, B3 | FRACTIONALPLL_X0_Y18, FRACTIONALPLL_X0_Y27 | FRACTIONALPLL_X169_Y18, FRACTIONALPLL_X169_Y27 |
Arria® V GX | B5, B7 | FRACTIONALPLL_X0_Y10, FRACTIONALPLL_X0_Y19 | FRACTIONALPLL_X183_Y10, FRACTIONALPLL_X183_Y19 |
Arria® V GT | C7 | FRACTIONALPLL_X0_Y14, FRACTIONALPLL_X0_Y23 | FRACTIONALPLL_X132_Y14, FRACTIONALPLL_X132_Y23 |
Arria® V GT | D3 | FRACTIONALPLL_X0_Y18, FRACTIONALPLL_X0_Y27 | FRACTIONALPLL_X169_Y18, FRACTIONALPLL_X169_Y27 |
Arria® V GT | D7 | FRACTIONALPLL_X0_Y10, FRACTIONALPLL_X0_Y19 | FRACTIONALPLL_X183_Y10, FRACTIONALPLL_X183_Y19 |
4.2.4. Fractional PLL Architecture
4.2.4.1. Fractional PLL Usage
You can configure the fractional PLL to function either in the integer or in the enhanced fractional mode. One fractional PLL can use up to 18 output counters and all external clock outputs. Two adjacent fractional PLLs share the 18 output counters.
Fractional PLLs can be used as follows:
- Reduce the number of required oscillators on the board
- Reduce the clock pins used in the FPGA by synthesizing multiple clock frequencies from a single reference clock source
- Compensate clock network delay
- Zero delay buffering
- Transmit clocking for transceivers
4.2.5. PLL Cascading
Arria® V devices support two types of PLL cascading.
PLL-to-PLL Cascading
This cascading mode synthesizes a more precise output frequency than a single PLL in integer mode. Cascading two PLLs in integer mode expands the effective range of the pre-scale counter, N and the multiply counter, M.
Arria® V devices use two types of input clock sources.
- The adjpllin input clock source is used for inter-cascading between fracturable fractional PLLs.
- The cclk input clock source is used for intra-cascading within fracturable fractional PLLs.
Altera recommends using a low bandwidth setting for the source (upstream) PLL and a high bandwidth setting for destination (downstream) PLL.
Counter-Output-to-Counter-Output Cascading
This cascading mode synthesizes a lower frequency output than a single post-scale counter, C. Cascading two C counters expands the effective range of C counters.
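As a rough illustration of how counter cascading extends the divide range, the sketch below assumes that the second C counter divides the output of the first, so the two divide values multiply. That relationship, the function names, and the 600 MHz VCO value are assumptions for illustration; the 1 to 512 counter range comes from the feature table.

```python
C_MIN, C_MAX = 1, 512  # C counter range from the feature table


def cascaded_divide(c_first: int, c_second: int) -> int:
    """Effective divide value of two cascaded C counters (assumed to multiply)."""
    for c in (c_first, c_second):
        if not C_MIN <= c <= C_MAX:
            raise ValueError(f"C counter value {c} is outside {C_MIN} to {C_MAX}")
    return c_first * c_second


def cascaded_output_mhz(vco_mhz: float, c_first: int, c_second: int) -> float:
    """Output frequency after both counters, ignoring the VCO post divider."""
    return vco_mhz / cascaded_divide(c_first, c_second)


# Hypothetical 600 MHz VCO divided by 512 and then by 100.
print(cascaded_divide(512, 100))           # 51200
print(cascaded_output_mhz(600, 512, 100))  # 0.01171875 (MHz)
```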
4.2.6. PLL External Clock I/O Pins
Two adjacent fractional PLLs share four dual-purpose clock I/O pins, organized as one of the following combinations:
- Four single-ended clock outputs
- Two single-ended outputs and one differential clock output
- Four single-ended clock outputs and two single-ended feedback inputs in the I/O driver feedback for zero delay buffer (ZDB) mode support
- Two single-ended clock outputs and two single-ended feedback inputs for single-ended external feedback (EFB) mode support
- One differential clock output and one differential feedback input for differential EFB support (only one of the two adjacent fractional PLLs can support differential EFB at one time while the other fractional PLL can be used for general-purpose clocking)
The following figure shows that any of the output counters (C[0..17]) or the M counter on the PLLs can feed the dedicated external clock outputs. Therefore, one counter or frequency can drive all output pins available from a given PLL.
Each pin of a single-ended output pair can be either in-phase or 180° out-of-phase. To implement the 180° out-of-phase pin in a pin pair, the Intel® Quartus® Prime software places a NOT gate in the design into the IOE.
The clock output pin pairs support the following I/O standards:
- Same I/O standard for the pin pairs
- LVDS
- Differential high-speed transceiver logic (HSTL)
- Differential SSTL
Arria® V PLLs can drive out to any regular I/O pin through the GCLK or RCLK network. You can also use the external clock output pins as user I/O pins if you do not require external PLL clocking.
4.2.7. PLL Control Signals
You can use the areset signal to control PLL operation and resynchronization, and use the locked signal to observe the status of the PLL.
4.2.7.1. areset
The areset signal is the reset or resynchronization input for each PLL. The device input pins or internal logic can drive these input signals.
When areset is driven high, the PLL counters reset, clearing the PLL output and placing the PLL out-of-lock. The VCO is then set back to its nominal setting. When areset is driven low again, the PLL resynchronizes to its input as it re-locks.
You must assert the areset signal every time the PLL loses lock to guarantee the correct phase relationship between the PLL input and output clocks. You can set up the PLL to automatically reset (self-reset) after a loss-of-lock condition using the Intel® Quartus® Prime IP Catalog.
You must include the areset signal if either of the following conditions is true:
- PLL reconfiguration or clock switchover is enabled in the design
- Phase relationships between the PLL input and output clocks must be maintained after a loss-of-lock condition
4.2.7.2. locked
The locked signal output of the PLL indicates the following conditions:
- The PLL has locked onto the reference clock.
- The PLL clock outputs are operating at the desired phase and frequency set in the IP Catalog.
The lock detection circuit provides a signal to the core logic. The signal indicates when the feedback clock has locked onto the reference clock both in phase and frequency.
4.2.8. Clock Feedback Modes
This section describes the following clock feedback modes:
- Source synchronous
- LVDS compensation
- Direct
- Normal compensation
- ZDB
- EFB
Each mode allows clock multiplication and division, phase shifting, and programmable duty cycle.
The input and output delays are fully compensated by a PLL only when using the dedicated clock input pins associated with a given PLL as the clock source.
The input and output delays may not be fully compensated in the Intel® Quartus® Prime software for the following conditions:
- When a GCLK or RCLK network drives the PLL
- When the PLL is driven by a dedicated clock pin that is not associated with the PLL
For example, when you configure a PLL in ZDB mode, the PLL input is driven by an associated dedicated clock input pin. In this configuration, a fully compensated clock path results in zero delay between the clock input and one of the clock outputs from the PLL. However, if the PLL input is fed by a non-dedicated input (using the GCLK network), the output clock may not be perfectly aligned with the input clock.
4.2.8.1. Source Synchronous Mode
If the data and clock arrive at the same time on the input pins, the same phase relationship is maintained at the clock and data ports of any IOE input register. Data and clock signals at the IOE experience similar buffer delays as long as you use the same I/O standard.
Altera recommends source synchronous mode for source synchronous data transfers.
The source synchronous mode compensates for the delay of the clock network used and any difference in the delay between the following two paths:
- Data pin to the IOE register input
- Clock input pin to the PLL phase frequency detector (PFD) input
The Arria® V PLL can compensate multiple pad-to-input-register paths, such as a data bus, when it is set to use source synchronous compensation mode.
4.2.8.2. LVDS Compensation Mode
The purpose of LVDS compensation mode is to maintain the same data and clock timing relationship seen at the pins of the internal serializer/deserializer (SERDES) capture register, except that the clock is inverted (180° phase shift). Thus, LVDS compensation mode ideally compensates for the delay of the LVDS clock network, including the difference in delay between the following two paths:
- Data pin-to-SERDES capture register
- Clock input pin-to-SERDES capture register
The output counter must provide the 180° phase shift.
4.2.8.3. Direct Mode
In direct mode, the PLL does not compensate for any clock networks. This mode provides better jitter performance because the clock feedback into the PFD passes through less circuitry. Both the PLL internal- and external-clock outputs are phase-shifted with respect to the PLL clock input.
4.2.8.4. Normal Compensation Mode
An internal clock in normal compensation mode is phase-aligned to the input clock pin. The external clock output pin has a phase delay relative to the clock input pin if connected in this mode. The Intel® Quartus® Prime Timing Analyzer reports any phase difference between the two. In normal compensation mode, the delay introduced by the GCLK or RCLK network is fully compensated.
4.2.8.5. Zero-Delay Buffer Mode
In ZDB mode, the external clock output pin is phase-aligned with the clock input pin for zero delay through the device. This mode is supported on all Arria® V PLLs.
When using this mode, you must use the same I/O standard on the input clocks and clock outputs to guarantee clock alignment at the input and output pins. You cannot use differential I/O standards on the PLL clock input or output pins.
To ensure phase alignment between the clk pin and the external clock output (CLKOUT) pin in ZDB mode, instantiate a bidirectional I/O pin in the design. The bidirectional I/O pin serves as the feedback path connecting the fbout and fbin ports of the PLL. The bidirectional I/O pin must always be assigned a single-ended I/O standard. The PLL uses this bidirectional I/O pin to mimic and compensate for the output delay from the clock output port of the PLL to the external clock output pin.
4.2.8.6. External Feedback Mode
In EFB mode, the output of the M counter (fbout) feeds back to the PLL fbin input (using a trace on the board) and becomes part of the feedback loop.
One of the dual-purpose external clock outputs becomes the fbin input pin in this mode. The external feedback input pin, fbin, is phase-aligned with the clock input pin. Aligning these clocks allows you to remove clock delay and skew between devices.
When using EFB mode, you must use the same I/O standard on the input clock, feedback input, and clock outputs.
4.2.9. Clock Multiplication and Division
Each Arria® V PLL provides clock synthesis for PLL output ports using the M/(N × C) scaling factors. The input clock is divided by a pre-scale factor, N, and is then multiplied by the M feedback factor. The control loop drives the VCO to match fin × (M/N).
The Intel® Quartus® Prime software automatically chooses the appropriate scaling factors according to the input frequency, multiplication, and division values entered into the ALTERA_PLL IP core.
VCO Post Divider
A VCO post divider is inserted after the VCO. When you enable the VCO post divider, the VCO post divider divides the VCO frequency by two. When the VCO post divider is bypassed, the VCO frequency goes to the output port without being divided by two.
Post-Scale Counter, C
Each output port has a unique post-scale counter, C, that divides down the output from the VCO post divider. For multiple PLL outputs with different frequencies, the VCO is set to the lowest common multiple of the output frequencies that meets its frequency specifications. For example, if the output frequencies required from one PLL are 33 and 66 MHz, the Intel® Quartus® Prime software sets the VCO to 660 MHz (the lowest common multiple of 33 and 66 MHz that falls within the VCO range). Then the post-scale counters, C, scale down the VCO frequency for each output port.
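The M/(N × C) scaling and the 33 and 66 MHz example can be reproduced with a short calculation. This is a sketch under stated assumptions: the 600 to 1600 MHz VCO range and the 50 MHz reference are illustrative values (check the device datasheet for the real limits), the function names are hypothetical, and the VCO post divider is treated as bypassed. It needs Python 3.9 or later for `math.lcm`.

```python
from fractions import Fraction
from math import lcm

COUNTER_MIN, COUNTER_MAX = 1, 512  # M, N, C counter range from the feature table


def output_mhz(fin_mhz: int, m: int, n: int, c: int) -> Fraction:
    """Return fin * M / (N * C), with the VCO post divider bypassed."""
    for name, value in (("M", m), ("N", n), ("C", c)):
        if not COUNTER_MIN <= value <= COUNTER_MAX:
            raise ValueError(f"{name}={value} is outside 1 to 512")
    return Fraction(fin_mhz) * m / (n * c)


def lowest_common_vco_mhz(outputs_mhz, vco_min_mhz: int, vco_max_mhz: int) -> int:
    """Return the lowest common multiple of the outputs inside the VCO range."""
    base = lcm(*outputs_mhz)
    multiple = -(-vco_min_mhz // base) * base  # round vco_min up to a multiple
    if multiple > vco_max_mhz:
        raise ValueError("no common multiple fits in the VCO range")
    return multiple


# Assumed VCO range of 600 to 1600 MHz (illustrative only).
print(lowest_common_vco_mhz([33, 66], 600, 1600))  # 660
# Assumed 50 MHz reference: M = 66 and N = 5 give a 660 MHz VCO.
print(output_mhz(50, 66, 5, 20))  # 33
print(output_mhz(50, 66, 5, 10))  # 66
```

The software may weigh other constraints when it picks the VCO frequency, so treat the result as a plausibility check and not as a prediction of the fitter's choice.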
Pre-Scale Counter, N and Multiply Counter, M
Each PLL has one pre-scale counter, N, and one multiply counter, M, with a range of 1 to 512 for both M and N. The N counter does not use duty-cycle control because the only purpose of this counter is to calculate frequency division. The post-scale counters have a 50% duty cycle setting. The high- and low-count values for each counter range from 1 to 256. The sum of the high- and low-count values chosen for a design selects the divide value for a given counter.
Delta-Sigma Modulator
The delta-sigma modulator (DSM) is used together with the M multiply counter to enable the PLL to operate in fractional mode. The DSM dynamically changes the M counter divide value on a cycle to cycle basis. The different M counter values allow the "average" M counter value to be a non-integer.
Fractional Mode
In fractional mode, the M counter divide value equals the sum of the "clock high" count, the "clock low" count, and the fractional value. The fractional value is equal to K/2^X, where K is an integer between 0 and (2^X – 1), and X = 8, 16, 24, or 32.
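The fractional M value described above can be worked through numerically. The following is a minimal sketch; the function name `fractional_m` and the example counts are hypothetical, while the K and X constraints come from the text.

```python
from fractions import Fraction

ALLOWED_X = (8, 16, 24, 32)  # fractional resolutions listed in the text


def fractional_m(high_count: int, low_count: int, k: int, x: int) -> Fraction:
    """Return M = high + low + K / 2^X as an exact fraction."""
    if x not in ALLOWED_X:
        raise ValueError("X must be 8, 16, 24, or 32")
    if not 0 <= k <= 2**x - 1:
        raise ValueError("K must be between 0 and 2^X - 1")
    return high_count + low_count + Fraction(k, 2**x)


# Hypothetical counts: high = 5, low = 5, K = 2^23 with X = 24 adds exactly 0.5.
m = fractional_m(5, 5, 2**23, 24)
print(m)         # 21/2
print(float(m))  # 10.5
```

With K = 0 the fractional term vanishes and M reduces to the integer sum of the high and low counts.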
Integer Mode
For a PLL operating in integer mode, M is an integer value and the DSM is disabled.
4.2.10. Programmable Phase Shift
The programmable phase shift feature allows the PLLs to generate output clocks with a fixed phase offset.
The VCO frequency of the PLL determines the precision of the phase shift. The minimum phase shift increment is 1/8 of the VCO period. For example, if a PLL operates with a VCO frequency of 1000 MHz, phase shift steps of 125 ps are possible.
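The 1/8-period rule can be checked with a short calculation. This is a sketch only; the function name `phase_step_ps` is hypothetical.

```python
def phase_step_ps(vco_mhz: float) -> float:
    """Return the minimum phase shift increment: 1/8 of the VCO period, in ps."""
    if vco_mhz <= 0:
        raise ValueError("VCO frequency must be positive")
    vco_period_ps = 1e6 / vco_mhz  # 1 / MHz = microseconds, times 1e6 = ps
    return vco_period_ps / 8


print(phase_step_ps(1000))  # 125.0
print(phase_step_ps(1600))  # 78.125
```

Under this formula, the 78.125 ps resolution listed in the feature table corresponds to a 1600 MHz VCO.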
The Intel® Quartus® Prime software automatically adjusts the VCO frequency according to the user-specified phase shift values entered into the IP core.
4.2.11. Programmable Duty Cycle
The programmable duty cycle allows PLLs to generate clock outputs with a variable duty cycle. This feature is supported on the PLL post-scale counters.
The duty-cycle setting is achieved by a low and high time-count setting for the post-scale counters. To determine the duty cycle choices, the Intel® Quartus® Prime software uses the frequency input and the required multiply or divide rate.
The post-scale counter value determines the precision of the duty cycle. The precision is defined as 50% divided by the post-scale counter value. For example, if the C0 counter is 10, steps of 5% are possible for duty-cycle choices from 5% to 90%. If the PLL is in external feedback mode, set the duty cycle for the counter driving the fbin pin to 50%.
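The precision rule stated above (50% divided by the post-scale counter value) is simple to tabulate. This is a minimal sketch; the function name `duty_cycle_step_percent` is hypothetical.

```python
def duty_cycle_step_percent(c_value: int) -> float:
    """Return the duty-cycle step size: 50% divided by the C counter value."""
    if not 1 <= c_value <= 512:
        raise ValueError("C counter value must be between 1 and 512")
    return 50.0 / c_value


for c in (5, 10, 25):
    print(c, duty_cycle_step_percent(c))
# 5 10.0
# 10 5.0
# 25 2.0
```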
Combining the programmable duty cycle with programmable phase shift allows the generation of precise non-overlapping clocks.
4.2.12. Clock Switchover
The clock switchover feature allows the PLL to switch between two reference input clocks. Use this feature for clock redundancy or for a dual-clock domain application where a system turns on the redundant clock if the previous clock stops running. The design can perform clock switchover automatically when the clock is no longer toggling or based on a user control signal, extswitch.
The following clock switchover modes are supported in Arria® V PLLs:
- Automatic switchover—The clock sense circuit monitors the current reference clock. If the current reference clock stops toggling, the reference clock automatically switches to inclk0 or inclk1 clock.
- Manual clock switchover—Clock switchover is controlled using the extswitch signal. When the extswitch signal goes from logic low to logic high, and stays high for at least three clock cycles, the reference clock to the PLL is switched from inclk0 to inclk1, or vice-versa.
- Automatic switchover with manual override—This mode combines automatic switchover and manual clock switchover. When the extswitch signal goes high, it overrides the automatic clock switchover function.
4.2.12.1. Automatic Switchover
Arria® V PLLs support a fully configurable clock switchover capability.
When the current reference clock is not present, the clock sense block automatically switches to the backup clock for PLL reference. You can select a clock source as the backup clock by connecting it to the inclk1 port of the PLL in your design.
The clock switchover circuit sends out three status signals—clkbad[0], clkbad[1], and activeclock—from the PLL to implement a custom switchover circuit in the logic array.
In automatic switchover mode, the clkbad[0] and clkbad[1] signals indicate the status of the two clock inputs. When they are asserted, the clock sense block detects that the corresponding clock input has stopped toggling. These two signals are not valid if the frequency difference between inclk0 and inclk1 is greater than 20%.
The activeclock signal indicates which of the two clock inputs (inclk0 or inclk1) is being selected as the reference clock to the PLL. When the frequency difference between the two clock inputs is more than 20%, the activeclock signal is the only valid status signal.
Use the switchover circuitry to automatically switch between inclk0 and inclk1 when the current reference clock to the PLL stops toggling. You can switch back and forth between inclk0 and inclk1 any number of times when one of the two clocks fails and the other clock is available.
For example, in applications that require a redundant clock with the same frequency as the reference clock, the switchover state machine generates a signal (clksw) that controls the multiplexer select input. In this case, inclk1 becomes the reference clock for the PLL.
When using automatic clock switchover mode, the following requirements must be satisfied:
- Both clock inputs must be running when the FPGA is configured.
- The period of the two clock inputs can differ by no more than 20%.
If the current clock input stops toggling while the other clock is also not toggling, switchover is not initiated and the clkbad[0..1] signals are not valid. If both clock inputs are not the same frequency, but their period difference is within 20%, the clock sense block detects when a clock stops toggling. However, the PLL may lose lock after the switchover is completed and needs time to relock.
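The 20% period requirement can be screened before committing to automatic switchover. The sketch below assumes the difference is measured relative to the shorter of the two periods, which is the stricter reading; the text does not say which period is the reference. The function name is hypothetical.

```python
def periods_within_20_percent(inclk0_mhz: float, inclk1_mhz: float) -> bool:
    """Return True if the two clock periods differ by no more than 20%.

    Assumption: the difference is taken relative to the shorter period.
    """
    if inclk0_mhz <= 0 or inclk1_mhz <= 0:
        raise ValueError("clock frequencies must be positive")
    period0_ns = 1000.0 / inclk0_mhz
    period1_ns = 1000.0 / inclk1_mhz
    difference = abs(period0_ns - period1_ns)
    return difference / min(period0_ns, period1_ns) <= 0.20


print(periods_within_20_percent(100, 110))  # True
print(periods_within_20_percent(100, 125))  # False
print(periods_within_20_percent(66, 200))   # False
```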
4.2.12.2. Automatic Switchover with Manual Override
In automatic switchover with manual override mode, you can use the extswitch signal for user- or system-controlled switch conditions. You can use this mode for same-frequency switchover, or to switch between inputs of different frequencies.
For example, if inclk0 is 66 MHz and inclk1 is 200 MHz, you must control switchover using the extswitch signal. The automatic clock-sense circuitry cannot monitor clock input (inclk0 and inclk1) frequencies with a frequency difference of more than 100% (2×).
This feature is useful when the clock sources originate from multiple cards on the backplane, requiring a system-controlled switchover between the frequencies of operation.
You must choose the backup clock frequency and set the M, N, C, and K counters so that the VCO operates within the recommended operating frequency range. The ALTERA_PLL IP Catalog notifies you if a given combination of inclk0 and inclk1 frequencies cannot meet this requirement.
In automatic switchover with manual override mode, the activeclock signal mirrors the extswitch signal. Since both clocks are still functional during the manual switch, neither clkbad signal goes high. Because the switchover circuit is positive-edge sensitive, the falling edge of the extswitch signal does not cause the circuit to switch back from inclk1 to inclk0. When the extswitch signal goes high again, the process repeats.
The extswitch signal and automatic switch work only if the clock being switched to is available. If the clock is not available, the state machine waits until the clock is available.
4.2.12.3. Manual Clock Switchover
In manual clock switchover mode, the extswitch signal controls whether inclk0 or inclk1 is selected as the input clock to the PLL. By default, inclk0 is selected.
A clock switchover event is initiated when the extswitch signal transitions from logic low to logic high and is held high for at least three inclk cycles.
You must bring the extswitch signal back low again for the PLL to regain lock. If you do not require another switchover event, you can leave the extswitch signal in a logic low state.
Pulsing the extswitch signal high for at least three inclk cycles performs another switchover event.
If inclk0 and inclk1 are different frequencies and are always running, the extswitch signal minimum high time must be greater than or equal to three cycles of the slower of the two clocks.
You can delay the clock switchover action by specifying the switchover delay in the ALTERA_PLL IP core. When you specify the switchover delay, the extswitch signal must be held high for at least three inclk cycles plus the specified number of delay cycles to initiate a clock switchover.
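The minimum extswitch high time can be estimated from the two input frequencies. This sketch assumes the delay cycles are counted in periods of the slower input clock, which the text does not state explicitly; the function name and the example frequencies are hypothetical.

```python
def min_extswitch_high_ns(inclk0_mhz: float, inclk1_mhz: float,
                          delay_cycles: int = 0) -> float:
    """Return the minimum time extswitch must stay high, in ns.

    Three cycles of the slower clock plus any configured switchover delay
    (assumed here to be counted in the same slower-clock cycles).
    """
    if inclk0_mhz <= 0 or inclk1_mhz <= 0:
        raise ValueError("clock frequencies must be positive")
    if delay_cycles < 0:
        raise ValueError("delay_cycles must be non-negative")
    slower_period_ns = 1000.0 / min(inclk0_mhz, inclk1_mhz)
    return (3 + delay_cycles) * slower_period_ns


print(min_extswitch_high_ns(50, 200))                  # 60.0
print(min_extswitch_high_ns(50, 200, delay_cycles=2))  # 100.0
```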
4.2.12.4. Guidelines
When implementing clock switchover in Arria® V PLLs, use the following guidelines:
- Automatic clock switchover requires that the inclk0 and inclk1 frequencies be within 20% of each other. Failing to meet this requirement causes the clkbad[0] and clkbad[1] signals to not function properly.
- When using manual clock switchover, the difference between inclk0 and inclk1 can be more than 100% (2×). However, differences in frequency, phase, or both, of the two clock sources will likely cause the PLL to lose lock. Resetting the PLL ensures that you maintain the correct phase relationships between the input and output clocks.
- Both inclk0 and inclk1 must be running when the extswitch signal goes high to initiate the manual clock switchover event. Failing to meet this requirement causes the clock switchover to not function properly.
- Applications that require a clock switchover feature and a small frequency drift must use a low-bandwidth PLL. When the reference input clock changes, the low-bandwidth PLL reacts more slowly than a high-bandwidth PLL. When switchover happens, a low-bandwidth PLL propagates the stopping of the clock to the output more slowly than a high-bandwidth PLL. However, be aware that the low-bandwidth PLL also increases lock time.
- After a switchover occurs, there may be a finite resynchronization period for the PLL to lock onto a new clock. The time it takes for the PLL to relock depends on the PLL configuration.
- If the phase relationship between the input clock to the PLL and the output clock from the PLL is important in your design, assert areset for at least 10 ns after performing a clock switchover. Wait for the locked signal to go high and be stable before re-enabling the output clocks from the PLL.
- The VCO frequency gradually decreases when the current clock is lost and then increases as the VCO locks on to the backup clock, as shown in the following figure.
4.2.13. PLL Reconfiguration and Dynamic Phase Shift
For more information about PLL reconfiguration and dynamic phase shifting, refer to AN661.
4.3. Clock Networks and PLLs in Arria V Devices Revision History
Document Version | Changes |
---|---|
2019.04.26 | |
Date | Version | Changes |
---|---|---|
December 2016 | 2016.12.09 | Added a note to dedicated refclk pin in Fractional PLL High-Level Block Diagram. |
December 2015 | 2015.12.21 | Changed instances of Quartus II to Quartus Prime. |
January 2015 | 2015.01.23 | |
January 2014 | 2014.01.10 | |
May 2013 | 2013.05.06 | |
November 2012 | 2012.11.19 | |
June 2012 | 2.0 | |
November 2011 | 1.1 | Restructured chapter. |
May 2011 | 1.0 | Initial release. |
5. I/O Features in Arria V Devices
This chapter provides details about the features of the Arria® V I/O elements (IOEs) and how the IOEs work in compliance with current and emerging I/O standards and requirements.
The Arria® V I/Os support the following features:
- Single-ended, non-voltage-referenced, and voltage-referenced I/O standards
- Low-voltage differential signaling (LVDS), RSDS, mini-LVDS, HSTL, HSUL, and SSTL I/O standards
- Serializer/deserializer (SERDES)
- Programmable output current strength
- Programmable slew rate
- Programmable bus-hold
- Programmable pull-up resistor
- Programmable pre-emphasis
- Programmable I/O delay
- Programmable voltage output differential (VOD)
- Open-drain output
- On-chip series termination (RS OCT) with and without calibration
- On-chip parallel termination (RT OCT)
- On-chip differential termination (RD OCT)
5.1. I/O Resources Per Package for Arria V Devices
The following package plan tables for the different Arria® V variants list the maximum I/O resources available for each package.
Member Code | F672 GPIO | F672 XCVR | F896 GPIO | F896 XCVR | F1152 GPIO | F1152 XCVR | F1517 GPIO | F1517 XCVR |
---|---|---|---|---|---|---|---|---|
A1 | 336 | 9 | 416 | 9 | — | — | — | — |
A3 | 336 | 9 | 416 | 9 | — | — | — | — |
A5 | 336 | 9 | 384 | 18 | 544 | 24 | — | — |
A7 | 336 | 9 | 384 | 18 | 544 | 24 | — | — |
B1 | — | — | 384 | 18 | 544 | 24 | 704 | 24 |
B3 | — | — | 384 | 18 | 544 | 24 | 704 | 24 |
B5 | — | — | — | — | 544 | 24 | 704 | 36 |
B7 | — | — | — | — | 544 | 24 | 704 | 36 |
Member Code | F672 GPIO | F672 XCVR 6-Gbps | F672 XCVR 10-Gbps | F896 GPIO | F896 XCVR 6-Gbps | F896 XCVR 10-Gbps | F1152 GPIO | F1152 XCVR 6-Gbps | F1152 XCVR 10-Gbps | F1517 GPIO | F1517 XCVR 6-Gbps | F1517 XCVR 10-Gbps |
---|---|---|---|---|---|---|---|---|---|---|---|---|
C3 | 336 | 3 (9) | 4 | 416 | 3 (9) | 4 | — | — | — | — | — | — |
C7 | — | — | — | 384 | 6 (18) | 8 | 544 | 6 (24) | 12 | — | — | — |
D3 | — | — | — | 384 | 6 (18) | 8 | 544 | 6 (24) | 12 | 704 | 6 (24) | 12 |
D7 | — | — | — | — | — | — | 544 | 6 (24) | 12 | 704 | 6 (36) | 20 |
Member Code | H780 GPIO | H780 XCVR | F1152 GPIO | F1152 XCVR | F1517 GPIO | F1517 XCVR |
---|---|---|---|---|---|---|
E1 | 342 | 12 | 414 | 24 | — | — |
E3 | 342 | 12 | 414 | 24 | — | — |
E5 | — | — | 534 | 24 | 674 | 36 |
E7 | — | — | 534 | 24 | 674 | 36 |
Member Code | F896 FPGA GPIO | F896 HPS I/O | F896 XCVR | F1152 FPGA GPIO | F1152 HPS I/O | F1152 XCVR | F1517 FPGA GPIO | F1517 HPS I/O | F1517 XCVR |
---|---|---|---|---|---|---|---|---|---|
B3 | 250 | 208 | 12 | 385 | 208 | 18 | 540 | 208 | 30 |
B5 | 250 | 208 | 12 | 385 | 208 | 18 | 540 | 208 | 30 |
Member Code | F896 FPGA GPIO | F896 HPS I/O | F896 XCVR 6 Gbps | F896 XCVR 10 Gbps | F1152 FPGA GPIO | F1152 HPS I/O | F1152 XCVR 6 Gbps | F1152 XCVR 10 Gbps | F1517 FPGA GPIO | F1517 HPS I/O | F1517 XCVR 6 Gbps | F1517 XCVR 10 Gbps |
---|---|---|---|---|---|---|---|---|---|---|---|---|
D3 | 250 | 208 | 12 | 6 | 385 | 208 | 18 | 8 | 540 | 208 | 30 | 16 |
D5 | 250 | 208 | 12 | 6 | 385 | 208 | 18 | 8 | 540 | 208 | 30 | 16 |
For more information about each device variant, refer to the device overview.
5.2. I/O Vertical Migration for Arria V Devices
You can achieve the vertical migration shaded in red if you use only up to 320 GPIOs, up to nine 6 Gbps transceiver channels, and up to four 10 Gbps transceiver channels (for Arria® V GT devices). This migration path is not shown in the Pin Migration View of the Intel® Quartus® Prime software.
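Because this migration path does not appear in the Pin Migration View, a quick resource check against the three limits above can be useful. This is a minimal sketch; the function name `fits_restricted_migration_path` is hypothetical and the check covers resource counts only, not pin compatibility.

```python
MAX_GPIO = 320     # limits quoted in the text for this migration path
MAX_XCVR_6G = 9
MAX_XCVR_10G = 4   # applies to Arria V GT devices


def fits_restricted_migration_path(gpio: int, xcvr_6g: int, xcvr_10g: int = 0) -> bool:
    """Return True if the design stays within all three resource limits."""
    if min(gpio, xcvr_6g, xcvr_10g) < 0:
        raise ValueError("resource counts must be non-negative")
    return gpio <= MAX_GPIO and xcvr_6g <= MAX_XCVR_6G and xcvr_10g <= MAX_XCVR_10G


print(fits_restricted_migration_path(300, 9))     # True
print(fits_restricted_migration_path(336, 9))     # False
print(fits_restricted_migration_path(300, 6, 5))  # False
```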
5.2.1. Verifying Pin Migration Compatibility
You can use the Pin Migration View window in the Intel® Quartus® Prime software Pin Planner to assist you in verifying whether your pin assignments migrate to a different device successfully. You can vertically migrate to a device with a different density while using the same device package, or migrate between packages with different densities and ball counts.
- Open Assignments > Pin Planner and create pin assignments.
- If necessary, perform one of the following options to populate the Pin Planner with the node names in the design:
  - Analysis & Elaboration
  - Analysis & Synthesis
  - Fully compile the design
- Then, on the menu, click View > Pin Migration View.
- To select or change migration devices:
  - Click Device to open the Device dialog box.
  - Under Migration compatibility click Migration Devices.
- To show more information about the pins:
  - Right-click anywhere in the Pin Migration View window and select Show Columns.
  - Then, click the pin feature you want to display.
- If you want to view only the pins, in at least one migration device, that have a different feature than the corresponding pin in the migration result, turn on Show migration differences.
- Click Pin Finder to open the Pin Finder dialog box to find and highlight pins with specific functionality.
  If you want to view only the pins highlighted by the most recent query in the Pin Finder dialog box, turn on Show only highlighted pins.
- To export the pin migration information to a Comma-Separated Value file (.csv), click Export.
5.3. I/O Standards Support in Arria V Devices
This section lists the I/O standards supported in the FPGA I/Os and HPS I/Os of Arria® V devices, the typical power supply values for each I/O standard, and the MultiVolt I/O interface feature.
5.3.1. I/O Standards Support for FPGA I/O in Arria V Devices
I/O Standard | Device Variant Support | Standard Support |
---|---|---|
3.3 V LVTTL/3.3 V LVCMOS | All | JESD8-B |
3.0 V LVTTL/3.0 V LVCMOS | GX, GT, SX, and ST | JESD8-B |
3.0 V PCI | GX, GT, SX, and ST | PCI Rev. 2.2 |
3.0 V PCI-X | GX, GT, SX, and ST | PCI-X Rev. 1.0 |
2.5 V LVCMOS | All | JESD8-5 |
1.8 V LVCMOS | All | JESD8-7 |
1.5 V LVCMOS | All | JESD8-11 |
1.2 V LVCMOS | All | JESD8-12 |
SSTL-2 Class I | All | JESD8-9B |
SSTL-2 Class II | All | JESD8-9B |
SSTL-18 Class I | All | JESD8-15 |
SSTL-18 Class II | All | JESD8-15 |
SSTL-15 Class I | All | — |
SSTL-15 Class II | All | — |
1.8 V HSTL Class I | All | JESD8-6 |
1.8 V HSTL Class II | All | JESD8-6 |
1.5 V HSTL Class I | All | JESD8-6 |
1.5 V HSTL Class II | All | JESD8-6 |
1.2 V HSTL Class I | All | JESD8-16A |
1.2 V HSTL Class II | All | JESD8-16A |
Differential SSTL-2 Class I | All | JESD8-9B |
Differential SSTL-2 Class II | All | JESD8-9B |
Differential SSTL-18 Class I | All | JESD8-15 |
Differential SSTL-18 Class II | All | JESD8-15 |
Differential SSTL-15 Class I | All | — |
Differential SSTL-15 Class II | All | — |
Differential 1.8 V HSTL Class I | All | JESD8-6 |
Differential 1.8 V HSTL Class II | All | JESD8-6 |
Differential 1.5 V HSTL Class I | All | JESD8-6 |
Differential 1.5 V HSTL Class II | All | JESD8-6 |
Differential 1.2 V HSTL Class I | All | JESD8-16A |
Differential 1.2 V HSTL Class II | All | JESD8-16A |
LVDS | All | ANSI/TIA/EIA-644 |
RSDS | All | — |
Mini-LVDS | All | — |
LVPECL | All | — |
SSTL-15 | All | JESD79-3D |
SSTL-135 | All | — |
SSTL-125 | All | — |
SSTL-12 | GZ only | — |
HSUL-12 | All | — |
Differential SSTL-15 | All | JESD79-3D |
Differential SSTL-135 | All | — |
Differential SSTL-125 | All | — |
Differential SSTL-12 | GZ only | — |
Differential HSUL-12 | All | — |
5.3.2. I/O Standards Support for HPS I/O in Arria V Devices
I/O Standard | Standard Support | HPS Column I/O | HPS Row I/O |
---|---|---|---|
3.3 V LVTTL/3.3 V LVCMOS | JESD8-B | Yes | — |
3.0 V LVTTL/3.0 V LVCMOS | JESD8-B | Yes | — |
2.5 V LVCMOS | JESD8-5 | Yes | — |
1.8 V LVCMOS | JESD8-7 | Yes | Yes |
1.5 V LVCMOS | JESD8-11 | Yes | — |
SSTL-18 Class I | JESD8-15 | — | Yes |
SSTL-18 Class II | JESD8-15 | — | Yes |
SSTL-15 Class I | — | — | Yes |
SSTL-15 Class II | — | — | Yes |
1.5 V HSTL Class I | JESD8-6 | Yes | — |
1.5 V HSTL Class II | JESD8-6 | Yes | — |
SSTL-135 | — | — | Yes |
HSUL-12 | — | — | Yes |
5.3.3. I/O Standards Voltage Levels in Arria V Devices
I/O Standard | Device Variant Support | VCCIO (V) Input | VCCIO (V) Output | VCCPD (V) (Pre-Driver Voltage) | VREF (V) (Input Ref Voltage) | VTT (V) (Board Termination Voltage) |
---|---|---|---|---|---|---|
3.3 V LVTTL/3.3 V LVCMOS | GX, GT, SX, and ST | 3.3/3.0/2.5 | 3.3 | 3.3 | — | — |
3.3 V LVTTL/3.3 V LVCMOS | GZ | 3.0/2.5 | 3.0 | 3.0 | — | — |
3.0 V LVTTL/3.0 V LVCMOS | GX, GT, SX, and ST | 3.3/3.0/2.5 | 3.0 | 3.0 | — | — |
3.0 V PCI | GX, GT, SX, and ST | 3.0 | 3.0 | 3.0 | — | — |
3.0 V PCI-X | GX, GT, SX, and ST | 3.0 | 3.0 | 3.0 | — | — |
2.5 V LVCMOS | All | 3.3/3.0/2.5 | 2.5 | 2.5 | — | — |
1.8 V LVCMOS | All | 1.8/1.5 | 1.8 | 2.5 | — | — |
1.5 V LVCMOS | All | 1.8/1.5 | 1.5 | 2.5 | — | — |
1.2 V LVCMOS | All | 1.2 | 1.2 | 2.5 | — | — |
SSTL-2 Class I | All | VCCPD | 2.5 | 2.5 | 1.25 | 1.25 |
SSTL-2 Class II | All | VCCPD | 2.5 | 2.5 | 1.25 | 1.25 |
SSTL-18 Class I | All | VCCPD | 1.8 | 2.5 | 0.9 | 0.9 |
SSTL-18 Class II | All | VCCPD | 1.8 | 2.5 | 0.9 | 0.9 |
SSTL-15 Class I | All | VCCPD | 1.5 | 2.5 | 0.75 | 0.75 |
SSTL-15 Class II | All | VCCPD | 1.5 | 2.5 | 0.75 | 0.75 |
1.8 V HSTL Class I | All | VCCPD | 1.8 | 2.5 | 0.9 | 0.9 |
1.8 V HSTL Class II | All | VCCPD | 1.8 | 2.5 | 0.9 | 0.9 |
1.5 V HSTL Class I | All | VCCPD | 1.5 | 2.5 | 0.75 | 0.75 |
1.5 V HSTL Class II | All | VCCPD | 1.5 | 2.5 | 0.75 | 0.75 |
1.2 V HSTL Class I | All | VCCPD | 1.2 | 2.5 | 0.6 | 0.6 |
1.2 V HSTL Class II | All | VCCPD | 1.2 | 2.5 | 0.6 | 0.6 |
Differential SSTL-2 Class I | All | VCCPD | 2.5 | 2.5 | — | 1.25 |
Differential SSTL-2 Class II | All | VCCPD | 2.5 | 2.5 | — | 1.25 |
Differential SSTL-18 Class I | All | VCCPD | 1.8 | 2.5 | — | 0.9 |
Differential SSTL-18 Class II | All | VCCPD | 1.8 | 2.5 | — | 0.9 |
Differential SSTL-15 Class I | All | VCCPD | 1.5 | 2.5 | — | 0.75 |
Differential SSTL-15 Class II | All | VCCPD | 1.5 | 2.5 | — | 0.75 |
Differential 1.8 V HSTL Class I | All | VCCPD | 1.8 | 2.5 | — | 0.9 |
Differential 1.8 V HSTL Class II | All | VCCPD | 1.8 | 2.5 | — | 0.9 |
Differential 1.5 V HSTL Class I | All | VCCPD | 1.5 | 2.5 | — | 0.75 |
Differential 1.5 V HSTL Class II | All | VCCPD | 1.5 | 2.5 | — | 0.75 |
Differential 1.2 V HSTL Class I | All | VCCPD | 1.2 | 2.5 | — | 0.6 |
Differential 1.2 V HSTL Class II | All | VCCPD | 1.2 | 2.5 | — | 0.6 |
LVDS | All | VCCPD | 2.5 | 2.5 | — | — |
RSDS | All | VCCPD | 2.5 | 2.5 | — | — |
Mini-LVDS | All | VCCPD | 2.5 | 2.5 | — | — |
LVPECL (Differential clock input only) | All | VCCPD | — | 2.5 | — | — |
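SSTL-15 | All | VCCPD | 1.5 | 2.5 | 0.75 | Typically does not require board termination |
SSTL-135 | All | VCCPD | 1.35 | 2.5 | 0.675 | Typically does not require board termination |
SSTL-125 | All | VCCPD | 1.25 | 2.5 | 0.625 | Typically does not require board termination |
SSTL-12 | GZ only | VCCPD | 1.2 | 2.5 | 0.6 | Typically does not require board termination |
HSUL-12 | All | VCCPD | 1.2 | 2.5 | 0.6 | Typically does not require board termination |
Differential SSTL-15 | All | VCCPD | 1.5 | 2.5 | — | Typically does not require board termination |
Differential SSTL-135 | All | VCCPD | 1.35 | 2.5 | — | Typically does not require board termination |
Differential SSTL-125 | All | VCCPD | 1.25 | 2.5 | — | Typically does not require board termination |
Differential SSTL-12 | GZ only | VCCPD | 1.2 | 2.5 | — | Typically does not require board termination |
Differential HSUL-12 | All | VCCPD | 1.2 | 2.5 | — | Typically does not require board termination |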
5.3.4. MultiVolt I/O Interface in Arria V Devices
The MultiVolt I/O interface feature allows Arria® V devices in all packages to interface with systems of different supply voltages.
VCCIO (V) | Device Variant Support | VCCPD (V) | Input Signal (V) | Output Signal (V) |
---|---|---|---|---|
1.2 | All | 2.5 | 1.2 | 1.2 |
1.25 | All | 2.5 | 1.25 | 1.25 |
1.35 | All | 2.5 | 1.35 | 1.35 |
1.5 | All | 2.5 | 1.5, 1.8 | 1.5 |
1.8 | All | 2.5 | 1.5, 1.8 | 1.8 |
2.5 | All | 2.5 | 2.5, 3.0, 3.3 | 2.5 |
3.0 | GX, GT, SX, and ST | 3.0 | 2.5, 3.0, 3.3 | 3.0 |
3.0 | GZ | 3.0 | 2.5, 3.0, 3.3 | 3.0, 3.3 |
3.3 | GX, GT, SX, and ST | 3.3 | 2.5, 3.0, 3.3 | 3.3 |
The pin current may be slightly higher than the default value. Verify that the VOL maximum and VOH minimum voltages of the driving device do not violate the applicable VIL maximum and VIH minimum voltage specifications of the Arria® V device.
The VCCPD power pins must be connected to a 2.5 V, 3.0 V, or 3.3 V power supply. Using these power pins to supply the pre-driver power to the output buffers increases the performance of the output pins.
5.4. I/O Design Guidelines for Arria V Devices
There are several considerations that require your attention to ensure the success of your designs. Unless noted otherwise, these design guidelines apply to all variants of this device family.
5.4.1. Mixing Voltage-Referenced and Non-Voltage-Referenced I/O Standards
Each I/O bank can simultaneously support multiple I/O standards. The following sections provide guidelines for mixing non-voltage-referenced and voltage-referenced I/O standards in the devices.
5.4.1.1. Non-Voltage-Referenced I/O Standards
Each Arria® V I/O bank has its own VCCIO pins and supports only one VCCIO of 1.2, 1.25, 1.35, 1.5, 1.8, 2.5, 3.0, or 3.3 V. An I/O bank can simultaneously support any number of input signals with different I/O standard assignments if the I/O standards support the VCCIO level of the I/O bank.
For output signals, a single I/O bank supports non-voltage-referenced output signals that drive at the same voltage as VCCIO. Because an I/O bank can have only one VCCIO value, it can drive out only that value for non-voltage-referenced signals.
For example, an I/O bank with a 2.5 V VCCIO setting can support 2.5 V, 3.0 V and 3.3 V inputs but supports only 2.5 V output.
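The input and output rules above can be sketched as a small check. This is only an illustration: the function name `check_non_vref_bank` and the `ACCEPTED_INPUTS` table are hypothetical, the input levels are transcribed from the MultiVolt I/O table in Section 5.3.4, and the Arria V GZ exception at 3.0 V VCCIO is not modeled.

```python
# Input levels accepted at each VCCIO, transcribed from the MultiVolt table
# (GX, GT, SX, and ST rows; the Arria V GZ 3.0 V exception is not modeled).
ACCEPTED_INPUTS = {
    1.2: {1.2},
    1.25: {1.25},
    1.35: {1.35},
    1.5: {1.5, 1.8},
    1.8: {1.5, 1.8},
    2.5: {2.5, 3.0, 3.3},
    3.0: {2.5, 3.0, 3.3},
    3.3: {2.5, 3.0, 3.3},
}


def check_non_vref_bank(vccio, input_levels, output_levels):
    """Return a list of rule violations; an empty list means the bank is valid."""
    if vccio not in ACCEPTED_INPUTS:
        return [f"{vccio} V is not a supported VCCIO level"]
    problems = []
    # Inputs: any level that the bank's VCCIO accepts is allowed.
    for level in input_levels:
        if level not in ACCEPTED_INPUTS[vccio]:
            problems.append(f"{level} V input is not accepted at VCCIO {vccio} V")
    # Outputs: non-voltage-referenced outputs drive only at VCCIO.
    for level in output_levels:
        if level != vccio:
            problems.append(f"{level} V output cannot be driven at VCCIO {vccio} V")
    return problems


# The 2.5 V example from the text: 2.5, 3.0, and 3.3 V inputs, 2.5 V outputs only.
print(check_non_vref_bank(2.5, [2.5, 3.0, 3.3], [2.5]))  # []
print(check_non_vref_bank(2.5, [3.3], [3.3]))
```

The second call reports one violation, because a 3.3 V input is accepted at a 2.5 V VCCIO but a 3.3 V output is not.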
5.4.1.2. Voltage-Referenced I/O Standards
To accommodate voltage-referenced I/O standards:
- Each Arria V GX, GT, SX, or ST I/O bank contains a dedicated VREF pin.
- Each Arria V GZ I/O bank supports multiple dedicated VREF pins feeding a common VREF bus.
- Each bank can have only a single VCCIO voltage level and a single voltage reference (VREF) level.
An I/O bank featuring single-ended or differential standards can support different voltage-referenced standards if the VCCIO and VREF are the same levels.
For performance reasons, voltage-referenced input standards use their own VCCPD level as the power source. This feature allows you to place voltage-referenced input signals in an I/O bank with a VCCIO of 2.5 V or below. For example, you can place HSTL-15 input pins in an I/O bank with 2.5 V VCCIO. However, the voltage-referenced input with RT OCT enabled requires the VCCIO of the I/O bank to match the voltage of the input standard. RT OCT cannot be supported for the HSTL-15 I/O standard when VCCIO is 2.5 V.
Voltage-referenced bidirectional and output signals must use the same voltage as the VCCIO of the I/O bank. For example, you can place only SSTL-2 output pins in an I/O bank with a 2.5 V VCCIO.
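As a rough sketch, the voltage-referenced rules in this section can be written as a bank check. The `VrefPin` type and `check_vref_bank` function are hypothetical names. The sketch assumes single-ended standards, and it assumes that only input and bidirectional pins count toward the single-VREF rule; the text does not state this explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VrefPin:
    standard: str          # label only, for example "HSTL-15"
    voltage: float         # nominal supply voltage of the I/O standard
    vref: float            # reference voltage the standard needs
    direction: str         # "input", "output", or "bidir"
    rt_oct: bool = False   # True if RT OCT is enabled on an input


def check_vref_bank(vccio, pins):
    """Return a list of rule violations; an empty list means the bank is valid."""
    problems = []
    # Assumption: only receiving pins (input, bidir) consume the bank's VREF.
    vref_levels = sorted({p.vref for p in pins if p.direction != "output"})
    if len(vref_levels) > 1:
        problems.append(f"bank needs a single VREF, got {vref_levels}")
    for pin in pins:
        if pin.direction == "input":
            # Inputs are powered from VCCPD, so VCCIO may differ (2.5 V or below).
            if vccio > 2.5:
                problems.append(f"{pin.standard} input needs VCCIO of 2.5 V or below")
            elif pin.rt_oct and pin.voltage != vccio:
                problems.append(f"{pin.standard} input with RT OCT needs VCCIO {pin.voltage} V")
        elif pin.voltage != vccio:
            # Outputs and bidirectional pins must match VCCIO.
            problems.append(f"{pin.standard} {pin.direction} needs VCCIO {pin.voltage} V")
    return problems


# HSTL-15 input in a 2.5 V bank: allowed without RT OCT, rejected with it.
print(check_vref_bank(2.5, [VrefPin("HSTL-15", 1.5, 0.75, "input")]))  # []
print(check_vref_bank(2.5, [VrefPin("HSTL-15", 1.5, 0.75, "input", rt_oct=True)]))
```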
5.4.1.3. Mixing Voltage-Referenced and Non-Voltage-Referenced I/O Standards
An I/O bank can support voltage-referenced and non-voltage-referenced pins by applying each of the rule sets individually.
Examples:
- An I/O bank can support SSTL-18 inputs and outputs, and 1.8 V inputs and outputs with a 1.8 V VCCIO and a 0.9 V VREF.
- An I/O bank can support 1.5 V standards, 1.8 V inputs (but not outputs), and 1.5 V HSTL I/O standards with a 1.5 V VCCIO and 0.75 V VREF.
5.4.2. Guideline: Use the Same VCCPD for All I/O Banks in a Group
One VCCPD is shared in a group of I/O banks. If one I/O bank in a group uses 3.0 V VCCPD, other I/O banks in the same group must also use 3.0 V VCCPD.
The I/O banks with the same bank number form a group. For example, I/O banks 8A, 8B, 8C, and 8D form a group and share the same VCCPD. This sharing is applicable to all I/O banks, with the following exceptions:
- Arria® V GX and GT devices: No VCCPD sharing in banks 4A and 7A. Each of these I/O banks has its own VCCPD.
- Arria® V SX and ST devices: No VCCPD sharing in bank 4A. In these devices, banks 6A, 6B, and 7A through 7E are HPS I/O banks.
- Arria® V GZ devices: No VCCPD sharing across banks 3A, 3B, 3C, and 3D. Banks 3A and 3B form a group with one VCCPD, while banks 3C (if available) and 3D form another group with its own VCCPD.
For the Arria® V GZ devices, if you are using an output or bidirectional pin with the 3.3 V LVTTL or 3.3 V LVCMOS I/O standard, you must adhere to this restriction manually with location assignments.
For more information about the I/O banks available in each device package, refer to the related information.
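The grouping rule and its exceptions can be illustrated with a short sketch. The function names and the group labels (such as "3AB") are hypothetical, and the HPS I/O banks of the SX and ST devices are not modeled, so pass only FPGA I/O banks.

```python
def vccpd_groups(banks, variant):
    """Group FPGA I/O bank names (for example "8A") by shared VCCPD."""
    groups = {}
    for bank in banks:
        number, letter = bank[:-1], bank[-1]
        if variant in ("GX", "GT") and bank in ("4A", "7A"):
            key = bank  # these banks have their own VCCPD
        elif variant in ("SX", "ST") and bank == "4A":
            key = bank  # bank 4A has its own VCCPD
        elif variant == "GZ" and number == "3":
            key = "3AB" if letter in "AB" else "3CD"  # two separate groups
        else:
            key = number  # default: banks with the same number share VCCPD
        groups.setdefault(key, []).append(bank)
    return groups


def mismatched_groups(groups, bank_vccpd):
    """Return the labels of groups whose banks do not all use the same VCCPD."""
    return [key for key, members in groups.items()
            if len({bank_vccpd[bank] for bank in members}) > 1]


groups = vccpd_groups(["4A", "4B", "4C", "4D"], "GX")
print(groups)  # {'4A': ['4A'], '4': ['4B', '4C', '4D']}
print(mismatched_groups(groups, {"4A": 3.0, "4B": 2.5, "4C": 2.5, "4D": 3.0}))  # ['4']
```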
5.4.3. Guideline: Ensure Compatible VCCIO and VCCPD Voltage in the Same Bank
When planning I/O bank usage for Arria® V GX, GT, SX, and ST devices, you must ensure the VCCIO voltage is compatible with the VCCPD voltage of the same bank. Some banks may share the same VCCPD power pin. This limits the possible VCCIO voltages that can be used on banks that share VCCPD power pins.
Examples:
- VCCPD4BCD is connected to 2.5 V: VCCIO pins for banks 4B, 4C, and 4D can be connected to 1.2 V, 1.25 V, 1.35 V, 1.5 V, 1.8 V, or 2.5 V.
- VCCPD4BCD is connected to 3.0 V: VCCIO pins for banks 4B, 4C, and 4D must be connected to 3.0 V.
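The two examples above amount to a lookup from the shared VCCPD level to the permitted VCCIO levels. The following sketch, with hypothetical names, shows one way to flag incompatible banks. The 3.3 V entry is taken from the MultiVolt I/O table rather than from these examples and should be treated as an assumption.

```python
# VCCIO levels usable with each VCCPD level on GX, GT, SX, and ST devices.
# The 2.5 V and 3.0 V rows follow the examples in the text; the 3.3 V row
# is an assumption based on the MultiVolt I/O table.
VCCIO_FOR_VCCPD = {
    2.5: (1.2, 1.25, 1.35, 1.5, 1.8, 2.5),
    3.0: (3.0,),
    3.3: (3.3,),
}


def incompatible_banks(vccpd, bank_vccio):
    """Return the banks whose VCCIO is not allowed with the shared VCCPD."""
    allowed = VCCIO_FOR_VCCPD[vccpd]
    return sorted(bank for bank, vccio in bank_vccio.items() if vccio not in allowed)


# VCCPD4BCD at 2.5 V: any VCCIO up to 2.5 V is acceptable.
print(incompatible_banks(2.5, {"4B": 1.8, "4C": 2.5, "4D": 1.2}))  # []
# VCCPD4BCD at 3.0 V: every bank in the group must use a 3.0 V VCCIO.
print(incompatible_banks(3.0, {"4B": 3.0, "4C": 2.5, "4D": 3.0}))  # ['4C']
```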
5.4.4. Guideline: VREF Pin Restrictions
For the Arria® V GX, GT, SX, and ST devices, consider the following VREF pins guidelines:
- You cannot assign shared VREF pins as LVDS or external memory interface pins.
- SSTL, HSTL, and HSUL I/O standards do not support shared VREF pins. For example, if a particular B1p or B1n pin is a shared VREF pin, the corresponding B1p/B1n pin pair does not have LVDS transmitter support.
- Shared VREF pins will have reduced performance when used as normal I/Os.
- You must perform signal integrity analysis using your board design when using a shared VREF pin to determine the FMAX for your system.
For more information about pin capacitance of the VREF pins, refer to the device datasheet.
5.4.5. Guideline: Observe Device Absolute Maximum Rating for 3.3 V Interfacing
To ensure device reliability and proper operation when you use the device for 3.3 V I/O interfacing, do not violate the absolute maximum ratings of the device. For more information about absolute maximum rating and maximum allowed overshoot during transitions, refer to the device datasheet.
Transmitter Application
If you use the Arria® V device as a transmitter, use slow slew rate and series termination to limit the overshoot and undershoot at the I/O pins. Transmission line effects that cause large voltage deviations at the receiver are associated with an impedance mismatch between the driver and the transmission lines. By matching the impedance of the driver to the characteristic impedance of the transmission line, you can significantly reduce overshoot voltage. You can use a series termination resistor placed physically close to the driver to match the total driver impedance to the transmission line impedance.
Receiver Application
If you use the Arria V device as a receiver, to limit the overshoot and undershoot voltage at the I/O pins:
- Arria V GX, GT, SX, or ST—use the on-chip clamping diode.
- Arria V GZ device—use an off-chip clamping diode.
The 3.3 V I/O standard is supported using the bank supply voltage (VCCIO) at 3.0 V and a VCCPD voltage of 3.0 V. In this method, the clamping diode can sufficiently clamp overshoot voltage to within the DC and AC input voltage specifications. The clamped voltage is expressed as the sum of the VCCIO and the diode forward voltage.
5.4.6. Guideline: Use PLL Integer Mode for LVDS Applications
For LVDS applications, you must use the phase-locked loops (PLLs) in integer PLL mode.
5.4.7. Guideline: Pin Placement for General Purpose High-Speed Signals
- Avoid using hard memory controller (HMC) DQ pins as input pins.
- Avoid using HMC DQ and command pins as output pins.
I/O signals that use the hard memory controller pins are routed through the HMCPHY_RE routing elements. These routing elements have a higher routing delay compared to other I/O pins. To identify the hard memory controller pins for your Arria® V device and package, refer to the relevant pin-out files.
5.5. I/O Banks Locations in Arria V Devices
The number of Arria V I/O banks in a particular device depends on the device density.
5.6. I/O Banks Groups in Arria V Devices
The I/O pins in Arria® V devices are arranged in groups called modular I/O banks:
- Modular I/O banks have independent power supplies that allow each bank to support different I/O standards.
- Each modular I/O bank can support multiple I/O standards that use the same VCCIO and VCCPD voltages.
5.6.1. Modular I/O Banks for Arria V GX Devices
Bank / Member Code (Package) | A1 (F672) | A1 (F896) | A3 (F672) | A3 (F896) | A5 (F672) | A5 (F896) | A5 (F1152) | A7 (F672) | A7 (F896) | A7 (F1152) |
---|---|---|---|---|---|---|---|---|---|---|
3A | 24 | 32 | 24 | 32 | 24 | 32 | 48 | 24 | 32 | 48 |
3B | — | — | — | — | — | — | 32 | — | — | 32 |
3C | — | — | — | — | — | — | 32 | — | — | 32 |
3D | 32 | 32 | 32 | 32 | 20 | 32 | 32 | 20 | 32 | 32 |
4A | 16 | 16 | 16 | 16 | 28 | 32 | 32 | 28 | 32 | 32 |
4B | — | 16 | — | 16 | 32 | 32 | 32 | 32 | 32 | 32 |
4C | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 |
4D | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 |
5A | 32 | 48 | 32 | 48 | — | — | — | — | — | — |
6A | 32 | 48 | 32 | 48 | — | — | — | — | — | — |
7A | 16 | 16 | 16 | 16 | 28 | 32 | 32 | 28 | 32 | 32 |
7B | — | 16 | — | 16 | 32 | 32 | 32 | 32 | 32 | 32 |
7C | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 |
7D | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 |
8A | 24 | 32 | 24 | 32 | 24 | 32 | 48 | 24 | 32 | 48 |
8B | — | — | — | — | — | — | 32 | — | — | 32 |
8C | — | — | — | — | — | — | 32 | — | — | 32 |
8D | 32 | 32 | 32 | 32 | 20 | 32 | 32 | 20 | 32 | 32 |
Total | 336 | 416 | 336 | 416 | 336 | 384 | 544 | 336 | 384 | 544 |
Bank / Member Code (Package) | B1 (F896) | B1 (F1152) | B1 (F1517) | B3 (F896) | B3 (F1152) | B3 (F1517) | B5 (F1152) | B5 (F1517) | B7 (F1152) | B7 (F1517) |
---|---|---|---|---|---|---|---|---|---|---|
3A | 32 | 48 | 48 | 32 | 48 | 48 | 48 | 48 | 48 | 48 |
3B | — | 32 | 32 | — | 32 | 32 | 32 | 32 | 32 | 32 |
3C | — | 32 | 48 | — | 32 | 48 | 32 | 48 | 32 | 48 |
3D | 32 | 32 | 48 | 32 | 32 | 48 | 32 | 48 | 32 | 48 |
4A | 32 | 32 | 48 | 32 | 32 | 48 | 32 | 48 | 32 | 48 |
4B | 32 | 32 | 48 | 32 | 32 | 48 | 32 | 48 | 32 | 48 |
4C | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 |
4D | 32 | 32 | 48 | 32 | 32 | 48 | 32 | 48 | 32 | 48 |
7A | 32 | 32 | 48 | 32 | 32 | 48 | 32 | 48 | 32 | 48 |
7B | 32 | 32 | 48 | 32 | 32 | 48 | 32 | 48 | 32 | 48 |
7C | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 |
7D | 32 | 32 | 48 | 32 | 32 | 48 | 32 | 48 | 32 | 48 |
8A | 32 | 48 | 48 | 32 | 48 | 48 | 48 | 48 | 48 | 48 |
8B | — | 32 | 32 | — | 32 | 32 | 32 | 32 | 32 | 32 |
8C | — | 32 | 48 | — | 32 | 48 | 32 | 48 | 32 | 48 |
8D | 32 | 32 | 48 | 32 | 32 | 48 | 32 | 48 | 32 | 48 |
Total | 384 | 544 | 704 | 384 | 544 | 704 | 544 | 704 | 544 | 704 |
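Each Total row is the sum of the per-bank pin counts in its column. As a quick consistency check, the following sketch sums the column for the A1 device in the F672 package from the first table in this section. The dictionary and function names are hypothetical.

```python
# Pin counts per bank for the Arria V GX A1 device in the F672 package,
# transcribed from the table; None marks a bank that is not available.
A1_F672 = {
    "3A": 24, "3B": None, "3C": None, "3D": 32,
    "4A": 16, "4B": None, "4C": 32, "4D": 32,
    "5A": 32, "6A": 32,
    "7A": 16, "7B": None, "7C": 32, "7D": 32,
    "8A": 24, "8B": None, "8C": None, "8D": 32,
}


def total_ios(column):
    """Sum the pin counts of the banks that exist in this device and package."""
    return sum(count for count in column.values() if count is not None)


print(total_ios(A1_F672))  # 336, matching the Total row
```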
5.6.2. Modular I/O Banks for Arria V GT Devices
Bank / Member Code (Package) | C3 (F672) | C3 (F896) | C7 (F896) | C7 (F1152) | D3 (F896) | D3 (F1152) | D3 (F1517) | D7 (F1152) | D7 (F1517) |
---|---|---|---|---|---|---|---|---|---|
3A | 24 | 32 | 32 | 48 | 32 | 48 | 48 | 48 | 48 |
3B | — | — | — | 32 | — | 32 | 32 | 32 | 32 |
3C | — | — | — | 32 | — | 32 | 48 | 32 | 48 |
3D | 32 | 32 | 32 | 32 | 32 | 32 | 48 | 32 | 48 |
4A | 16 | 16 | 32 | 32 | 32 | 32 | 48 | 32 | 48 |
4B | — | 16 | 32 | 32 | 32 | 32 | 48 | 32 | 48 |
4C | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 |
4D | 32 | 32 | 32 | 32 | 32 | 32 | 48 | 32 | 48 |
5A | 32 | 48 | — | — | — | — | — | — | — |
6A | 32 | 48 | — | — | — | — | — | — | — |
7A | 16 | 16 | 32 | 32 | 32 | 32 | 48 | 32 | 48 |
7B | — | 16 | 32 | 32 | 32 | 32 | 48 | 32 | 48 |
7C | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 | 32 |
7D | 32 | 32 | 32 | 32 | 32 | 32 | 48 | 32 | 48 |
8A | 24 | 32 | 32 | 48 | 32 | 48 | 48 | 48 | 48 |
8B | — | — | — | 32 | — | 32 | 32 | 32 | 32 |
8C | — | — | — | 32 | — | 32 | 48 | 32 | 48 |
8D | 32 | 32 | 32 | 32 | 32 | 32 | 48 | 32 | 48 |
Total | 336 | 416 | 384 | 544 | 384 | 544 | 704 | 544 | 704 |
5.6.3. Modular I/O Banks for Arria V GZ Devices
Bank / Member Code (Package) | E1 (F780) | E1 (F1152) | E3 (F780) | E3 (F1152) | E5 (F1152) | E5 (F1517) | E7 (F1152) | E7 (F1517) |
---|---|---|---|---|---|---|---|---|
3A | 36 | 36 | 36 | 36 | 36 | 36 | 36 | 36 |
3B | 48 | 48 | 48 | 48 | 48 | 48 | 48 | 48 |
3C | — | — | — | — | 48 | 48 | 48 | 48 |
3D | 24 | 24 | 24 | 24 | 24 | 48 | 24 | 48 |
4A | 24 | 24 | 24 | 24 | 24 | 24 | 24 | 24 |
4B | — | 48 | — | 48 | 48 | 48 | 48 | 48 |
4C | — | — | — | — | 48 | 48 | 48 | 48 |
4D | 24 | 24 | 24 | 24 | 24 | 48 | 24 | 48 |
7A | 24 | 24 | 24 | 24 | 24 | 24 | 24 | 24 |
7B | — | 24 | — | 24 | 48 | 48 | 48 | 48 |
7C | 48 | 48 | 48 | 48 | 48 | 48 | 48 | 48 |
7D | 36 | 36 | 36 | 36 | 36 | 48 | 36 | 48 |
8A | 24 | 24 | 24 | 24 | 24 | 36 | 24 | 36 |
8B | — | — | — | — | — | 48 | — | 48 |
8C | 48 | 48 | 48 | 48 | 48 | 48 | 48 | 48 |
8D | 24 | 24 | 24 | 24 | 24 | 48 | 24 | 48 |
Total | 360 | 432 | 360 | 432 | 552 | 696 | 552 | 696 |
5.6.4. Modular I/O Banks for Arria V SX Devices
Bank Type | Bank / Member Code (Package) | B3 (F896) | B3 (F1152) | B3 (F1517) | B5 (F896) | B5 (F1152) | B5 (F1517) |
---|---|---|---|---|---|---|---|
FPGA I/O Bank | 3A | 44 | 44 | 48 | 44 | 44 | 48 |
FPGA I/O Bank | 3B | 28 | 28 | 32 | 28 | 28 | 32 |
FPGA I/O Bank | 3C | — | 38 | 48 | — | 38 | 48 |
FPGA I/O Bank | 3D | 13 | 13 | 48 | 13 | 13 | 48 |
FPGA I/O Bank | 4A | 42 | 42 | 48 | 42 | 42 | 48 |
FPGA I/O Bank | 4B | — | 38 | 48 | — | 38 | 48 |
FPGA I/O Bank | 4C | — | 26 | 32 | — | 26 | 32 |
FPGA I/O Bank | 4D | — | 32 | 48 | — | 32 | 48 |
HPS Row I/O Bank | 6A | 56 | 56 | 56 | 56 | 56 | 56 |
HPS Row I/O Bank | 6B | 44 | 44 | 44 | 44 | 44 | 44 |
HPS Column I/O Bank | 7A | 32 | 32 | 32 | 32 | 32 | 32 |
HPS Column I/O Bank | 7B | 22 | 22 | 22 | 22 | 22 | 22 |
HPS Column I/O Bank | 7C | 12 | 12 | 12 | 12 | 12 | 12 |
HPS Column I/O Bank | 7D | 20 | 20 | 20 | 20 | 20 | 20 |
HPS Column I/O Bank | 7E | 8 | 8 | 8 | 8 | 8 | 8 |
FPGA I/O Bank | 7G | — | — | 12 | — | — | 12 |
FPGA I/O Bank | 8A | 44 | 44 | 48 | 44 | 44 | 48 |
FPGA I/O Bank | 8B | 28 | 28 | 32 | 28 | 28 | 32 |
FPGA I/O Bank | 8C | 38 | 38 | 48 | 38 | 38 | 48 |
FPGA I/O Bank | 8D | 13 | 14 | 48 | 13 | 14 | 48 |
Total | | 444 | 579 | 734 | 444 | 579 | 734 |
5.6.5. Modular I/O Banks for Arria V ST Devices
Member Code | D3 | D5 | |||||
---|---|---|---|---|---|---|---|
Package | F896 | F1152 | F1517 | F896 | F1152 | F1517 | |
FPGA I/O Bank | 3A | 44 | 44 | 48 | 44 | 44 | 48 |
3B | 28 | 28 | 32 | 28 | 28 | 32 | |
3C | — | 38 | 48 | — | 38 | 48 | |
3D | 13 | 13 | 48 | 13 | 13 | 48 | |
4A | 42 | 42 | 48 | 42 | 42 | 48 | |
4B | — | 38 | 48 | — | 38 | 48 | |
4C | — | 26 | 32 | — | 26 | 32 |