Designing for Stratix 10 Devices with Power in Mind

ID 683058
Date 6/14/2016
Public

1.1. Power Optimization Techniques and Recommendations

Device Availability with Power Option

The suffix after the speed grade in the part number denotes the power option of a Stratix® 10 device:
  • V—SmartVID
  • L—Low Power (Fixed voltage)
  • X—Extreme Low Power (Fixed voltage)
L devices have 0.85V fixed voltage and are binned for low static power. These are speed grade 2 devices.

X devices have 0.8V fixed voltage and are binned for the lowest static power. These are speed grade 3 devices.

SmartVID devices have “standard” static power. These are speed grade 1, 2, and 3 devices.
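The power options above can be captured in a small lookup table, for example in a script that checks device selections. The following is a minimal sketch: the `PowerOption` type, the `POWER_OPTIONS` table, and both function names are hypothetical and not part of any Intel tool, and the code takes only the single-letter suffix rather than parsing a full part number.

```python
from typing import List, NamedTuple, Optional, Tuple


class PowerOption(NamedTuple):
    name: str
    fixed_vcc_volts: Optional[float]  # None: voltage is set through SmartVID
    speed_grades: Tuple[int, ...]


# Values taken from the descriptions above; the table layout is illustrative.
POWER_OPTIONS = {
    "V": PowerOption("SmartVID", None, (1, 2, 3)),
    "L": PowerOption("Low Power", 0.85, (2,)),
    "X": PowerOption("Extreme Low Power", 0.80, (3,)),
}


def describe_power_option(suffix: str) -> PowerOption:
    """Return the power option for a single-letter suffix (case-insensitive)."""
    try:
        return POWER_OPTIONS[suffix.upper()]
    except KeyError:
        raise ValueError(f"unknown power option suffix: {suffix!r}") from None


def options_for_speed_grade(grade: int) -> List[str]:
    """List the suffixes whose devices are offered in the given speed grade."""
    return [s for s, opt in POWER_OPTIONS.items() if grade in opt.speed_grades]
```

For example, `options_for_speed_grade(2)` lists the suffixes available when a design needs speed grade 2.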

SmartVID

The SmartVID feature compensates for process variation by narrowing the process distribution through voltage adaptation. Instead of running at a constant voltage, SmartVID-enabled devices opportunistically adjust the device voltage for optimal power while still meeting performance goals. To save power, voltage is reduced on devices whose performance exceeds what is required to meet specifications.

SmartVID allows a power regulator to provide the Stratix® 10 devices with lower VCC and VCCP voltage levels while maintaining the performance of the specific device speed grade. When SmartVID is used, Stratix® 10 devices must be powered up to a default voltage level for both VCC and VCCP. After the VID value in the Stratix® 10 device is determined and propagated to the external voltage regulator, both the VCC and VCCP voltages are regulated based on the VID value. SmartVID voltages can vary between 0.8V and 0.94V, in 10mV increments. For more information, refer to the Stratix® 10 Power Management User Guide.
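The voltage range above implies a fixed set of discrete regulator setpoints. The sketch below enumerates them in millivolts and gives a rough idea of the dynamic power effect of a lower setpoint. The function names are hypothetical, and the quadratic scaling is the common first-order CMOS approximation (dynamic power proportional to V²), an assumption that is not stated in this document and that ignores static power.

```python
# Range and step from the text above, expressed in millivolts to avoid
# floating-point rounding when stepping.
VID_MIN_MV = 800
VID_MAX_MV = 940
VID_STEP_MV = 10


def smartvid_levels_mv():
    """Return every voltage level in the SmartVID range, in millivolts."""
    return list(range(VID_MIN_MV, VID_MAX_MV + 1, VID_STEP_MV))


def is_valid_smartvid_mv(millivolts: int) -> bool:
    """Check that a setpoint lies in the range and on a 10 mV step."""
    in_range = VID_MIN_MV <= millivolts <= VID_MAX_MV
    return in_range and (millivolts - VID_MIN_MV) % VID_STEP_MV == 0


def relative_dynamic_power(v_new_mv: int, v_ref_mv: int) -> float:
    """First-order estimate (assumption): dynamic power scales with V squared."""
    return (v_new_mv / v_ref_mv) ** 2
```

Under that assumption, a device regulated at 800 mV would draw roughly 72% of the dynamic power it would draw at 940 mV. Actual savings depend on the design and should be taken from the vendor's power estimation tools.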

DSP Power Gating

Stratix® 10 devices support static power gating for the DSP blocks, which eliminates their static power consumption when they are not used. The Quartus® Prime software automatically applies static power gating to unused DSP blocks. Power gating of the DSP blocks is enabled through Configuration RAM (CRAM) bits.

Stratix® 10 devices also support DSP partial reconfiguration. The Quartus® Prime software generates a bitstream that powers up DSP blocks as required during partial reconfiguration.

Intel recommends using the built-in DSP registers whenever possible for optimal power savings. A study comparing designs that used the built-in DSP registers with designs that used none showed up to a 50% decrease in power consumption.

M20K Power Gating

The Stratix® 10 M20K memory block also supports static power gating. Each half of the memory array can be powered down through the PMOS sleep devices that supply it. The Quartus® Prime software uses this feature to shut down the power supply to the parts of a memory array that are not used.

The Quartus® Prime software generates a bitstream that powers up M20K memory blocks as required during partial reconfiguration.

The mode of an M20K block can influence its power consumption. As Figure 1 shows, for the same number of memory blocks (8500 M20K blocks) and the same toggle rate (40%), power consumption depends on the memory type.

Figure 1. M20K Power Consumption Comparison For Different Configurations

Clock Gating

Clock gating can reduce dynamic power consumption. When an application is idle, its clock can be gated temporarily and ungated based on wake-up events. You can achieve dynamic power reduction by gating the clock signals to any circuitry that is determined to be inactive as per your design requirements. Clock gating can be performed at the following levels:
  • Root Clock Gate

    There is one root clock gate per I/O bank and transceiver bank. This gate is part of the periphery DCM (Distributed Clock Multiplexer) and is located close to the clock buffer. The Stratix® 10 root clock gate is intended for limited clock gating scenarios where high insertion delays can be tolerated. When you enable a root clock gate, expect a delay of several clock cycles between the assertion of the clock gate and the corresponding change on the output clock signal. For a high-frequency clock, use SCLK (sector clock) gating instead. For more information, refer to the Stratix® 10 Clocking and PLL User Guide.

  • Sector Clock Gate

    Every Stratix® 10 FPGA is divided into sectors. Each sector has its own clock network, which provides more flexibility. Sector clock gating is done at the SCLK multiplexer level. There are 32 SCLKs in each sector of the device. Each SCLK has a clock gate and a bypassable clock gate path. The SCLK gates are controlled by clock enable inputs from the core logic. The Quartus® Prime software can route up to eight different clock enable signals to the 32 SCLKs in a sector. The clock signal going into the SCLK network in a sector can only reach the core logic in that sector.

    When you instantiate an SCLK gate in your design, the Quartus® Prime software automatically duplicates the SCLK gate to create a clock gate in every sector to which the clock signal is routed. The SCLK gate is suitable for cycle-specific clock gating of high-frequency clocks. The Quartus® Prime software analyzes the timing of the path to the SCLK gate.

  • I/O PLL Clock Gate

    Each output counter of the Stratix® 10 I/O PLL can be dynamically gated. This provides a useful alternative to the root clock gate, because the root clock gate can gate only one of the nine output counters.

    However, the I/O PLL clock gate is not cycle-specific. When you use the I/O PLL clock gate, expect a delay of several clock cycles between the assertion or deassertion of the clock gate and the corresponding change on the clock signal. The number of delay cycles is non-deterministic because the enable signal must be synchronized into the clock domain of the output clock. This synchronization ensures a glitch-free gate. For more information, refer to the Stratix® 10 Clocking and PLL User Guide.
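The sector clock gate limits described above (32 SCLKs per sector, at most eight distinct clock enable signals per sector) can be checked early with simple bookkeeping. The sketch below is a planning aid only, under the assumption that you track per sector which enable signal gates each SCLK. The function name and data layout are hypothetical, and the code does not reflect how the Quartus® Prime software performs its own checks.

```python
# Limits taken from the sector clock gate description above.
SCLKS_PER_SECTOR = 32
MAX_CLOCK_ENABLES_PER_SECTOR = 8


def check_sector_gating(sclk_enables):
    """Check one sector's planned gating against the limits above.

    sclk_enables maps an SCLK index (0..31) to the name of the clock enable
    signal that gates it, or to None when the gate is bypassed (ungated).
    Returns a list of problem descriptions; an empty list means none found.
    """
    problems = []
    for index in sorted(sclk_enables):
        if not 0 <= index < SCLKS_PER_SECTOR:
            problems.append(
                f"SCLK index {index} is outside 0..{SCLKS_PER_SECTOR - 1}"
            )
    distinct_enables = {en for en in sclk_enables.values() if en is not None}
    if len(distinct_enables) > MAX_CLOCK_ENABLES_PER_SECTOR:
        problems.append(
            f"{len(distinct_enables)} distinct clock enables exceed the limit "
            f"of {MAX_CLOCK_ENABLES_PER_SECTOR} per sector"
        )
    return problems
```

Sharing one enable signal across several SCLKs counts once against the limit, so grouping clocks that go idle together keeps a sector within its eight enables.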

Power Savings while Using Transceivers

Stratix® 10 devices feature power-efficient, high-bandwidth, low-latency transceivers. For optimal static and dynamic power savings, Intel recommends using the lowest transceiver voltage (VCCR_GXB and VCCT_GXB) that supports your data rate and protocol requirements.
