2.7.6. Retiming Restrictions and Workarounds
In the diagram of a simple critical chain that follows, the red line represents the critical chain. Timing restrictions prevent register A from retiming forward. Timing restrictions also prevent register B from retiming backward. A loop occurs when register A and register B are the same register.
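The loop case can be made concrete with a small model. Retiming moves registers but does not change how many registers sit in a loop, so the loop's total delay divided by its register count is a lower bound on the clock period. The sketch below is an illustration only, not the Compiler's analysis: the function name `loop_period_bound` and the delay values (in nanoseconds) are hypothetical, and the model assumes logic that can be divided at any point.

```python
def loop_period_bound(segment_delays_ns):
    """Lower bound on the clock period for a register loop.

    Each entry is the combinational delay of one register-to-register
    segment, so the number of entries equals the number of registers.
    """
    if not segment_delays_ns:
        raise ValueError("a loop needs at least one register")
    return sum(segment_delays_ns) / len(segment_delays_ns)


# Hypothetical loop with two registers: one long and one short segment.
loop = [3.0, 1.0]
print(max(loop))                 # 3.0: period limited by the long segment
print(loop_period_bound(loop))   # 2.0: best case if retiming could balance the loop

# When A and B are the same register, there is nothing left to balance.
print(loop_period_bound([3.0]))  # 3.0
```

In this model, the gap between the two numbers for the two-register loop is the most that retiming alone could recover. Any further gain requires changing the logic in the loop.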
Fast Forward recommendations for the critical chain include:
- Reduce the delay of ‘Long Paths’ in the chain. Use standard timing closure techniques to reduce delay. Combinational logic, sub-optimal placement, and routing congestion are among the reasons for path delay.
- Insert more pipeline stages in ‘Long Paths’ in the chain. Long paths are the parts of the critical chain that have the most delay between registers.
- Increase the delay of (or add pipeline stages to) ‘Short Paths’ in the chain.
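The effect of the second recommendation can be sketched with a toy model of a critical chain as a list of register-to-register delays. The helper names and the delay values (in nanoseconds) are hypothetical, and the model assumes that a new pipeline stage splits the slowest segment into two equal halves, which real logic rarely allows exactly.

```python
def limiting_delay(segment_delays_ns):
    """The slowest register-to-register segment sets the clock period."""
    return max(segment_delays_ns)


def add_pipeline_stage(segment_delays_ns):
    """Return a new chain with the slowest segment split into two halves."""
    delays = list(segment_delays_ns)
    slowest = delays.index(max(delays))
    half = delays[slowest] / 2
    delays[slowest:slowest + 1] = [half, half]
    return delays


# Hypothetical chain: short path, long path, short path.
chain = [1.0, 4.0, 1.5]
pipelined = add_pipeline_stage(chain)

print(limiting_delay(chain))      # 4.0
print(pipelined)                  # [1.0, 2.0, 2.0, 1.5]
print(limiting_delay(pipelined))  # 2.0
```

The extra stage adds one clock cycle of latency to the path, so the surrounding RTL must tolerate that change.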
Particular registers in critical chains can limit performance for many other reasons. The Compiler classifies the reasons that limit further optimization by retiming into the following types:
- Insufficient Registers
- Short path/long path
- Path limit
After understanding why a particular critical chain limits your design’s performance, you can then make RTL changes to eliminate that bottleneck and increase performance.
|Design Condition||Hyper-Register Support|
|Initial conditions that cannot be preserved||Hyper-Registers support initial conditions. However, some retiming operations (that is, the merging and duplicating of Hyper-Registers) cannot preserve the initial condition state of all registers. If this condition occurs in the design, the Fitter does not retime those registers. This retiming limit ensures that register retiming does not affect design functionality.|
|Register has an asynchronous clear||Hyper-Registers support only data and clock inputs. Hyper-Registers do not have control signals such as asynchronous clears, presets, or enables. The Fitter cannot retime any register that has an asynchronous clear. Use asynchronous clears only when necessary, such as in state machines or control logic. Often, you can avoid or remove asynchronous clears from large parts of a datapath.|
|Register drives an asynchronous signal||This design condition is inherent to any design that uses asynchronous resets. Focus on reducing the number of registers that are reset with an asynchronous clear.|
|Register has don’t touch or preserve attributes||The Compiler does not retime registers with these attributes. If you use the preserve attribute to manage register duplication for high fan-out signals, try removing the preserve attribute. The Compiler may be able to retime the high fan-out register along each of the routing paths to its destinations. Alternatively, use the dont_merge attribute. The Compiler retimes registers in ALMs, DDIOs, single port RAMs, and DSP blocks.|
|Register is a clock source||This design condition is uncommon, especially for performance-critical parts of a design. If this retiming restriction prevents you from achieving the required performance, consider whether a PLL can generate the clock, rather than a register.|
|Register is a partition boundary||This condition is inherent to any design that uses design partitions. If this retiming restriction prevents you from achieving the required performance, add registers inside the partition boundary for Hyper-Retiming.|
|Register is a block type modified by an ECO operation||This restriction is uncommon. Avoid the restriction by making the functional change in the design source and recompiling, rather than performing an ECO.|
|Register location is an unknown block||This restriction is uncommon. You can often work around this condition by adding extra registers adjacent to the specified block type.|
|Register is described in the RTL as a latch||Hyper-Registers cannot implement latches. The Compiler infers latches because of RTL coding issues, such as incomplete assignments. If you do not intend to implement a latch, change the RTL.|
|Register location is at an I/O boundary||All designs contain I/O, but you can add pipeline stages next to the I/O boundary for Hyper-Retiming.|
|Combinational node is fed by a special source||This condition is uncommon, especially for performance-critical parts of a design.|
|Register is driven by a locally routed clock||Only the dedicated clock network clocks Hyper-Registers. Using the routing fabric to distribute clock signals is uncommon, especially for performance-critical parts of a design. Consider implementing a small clock region instead.|
|Register is a timing exception end-point||The Compiler does not retime registers that are sources or destinations of .sdc constraints.|
|Register with inverted input or output||This condition is uncommon.|
|Register is part of a synchronizer chain||The Fitter optimizes synchronizer chains to increase the mean time between failures (MTBF), and the Compiler does not retime registers that are detected or marked as part of a synchronizer chain. Add more pipeline stages at the clock domain boundary adjacent to the synchronizer chain to provide flexibility for retiming. Alternatively, you can reduce the Synchronization Register Chain Length setting (default is 3) for that particular synchronizer chain. In some cases, a synchronizer chain is not necessary and should not be inferred.|
|Register with multiple period requirements for paths that start or end at the register (cross-clock boundary)||This situation occurs at any cross-clock boundary, where a register latches data on a clock at one frequency, and fans out to registers running at another frequency. The Compiler does not retime registers at cross-clock boundaries. Consider adding pipeline stages on either side of the clock domain boundary to provide flexibility for retiming.|
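The first row of the table, initial conditions that cannot be preserved, is easiest to see in the merging case. When retiming would replace several registers on parallel branches with a single register, that single register can hold only one power-up value. The sketch below is a simplified model of that check, not the Fitter's implementation, and the function name `merged_initial_state` is hypothetical.

```python
def merged_initial_state(branch_states):
    """Initial state for one register that replaces several branch registers.

    Returns the shared state, or None when the branch registers power up
    to different values and no single register can stand in for them.
    """
    states = set(branch_states)
    return states.pop() if len(states) == 1 else None


print(merged_initial_state([0, 0]))  # 0: the merge preserves power-up behavior
print(merged_initial_state([0, 1]))  # None: the merge would change power-up behavior
```

In the second case, the only way to keep the design's power-up behavior is to leave the registers where they are, which matches the restriction that the table describes.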