Improving Quality of Results with Design Assistant
Optimizing your design’s source code is typically the first and most effective technique for improving the quality of results. Design Assistant is a convenient tool that allows you to identify potential issues earlier. Design Assistant runs targeted sanity checks and provides guidance at each stage, thereby reducing the total number of iterations for design closure.
Design Assistant identifies potential design issues, including issues with circuit functionality and timing performance. Identifying and fixing issues early in the design cycle results in fewer and faster iterations. You save the time of running a full compilation if you solve issues in synthesis or in the plan and place stages.
Quartus® Prime Pro Edition provides more DRC rules and in more stages. Design Assistant allows the flexibility to choose which rules to run, at what compile stages, and to select and filter rules of interest.
Figure 1. Design Assistant Report
Design Assistant provides reports in each of the compilation stages when you run it.
Some of the issues that Design Assistant can detect include:
Metastability because of clock and reset domain crossings
Excessive logic levels
High fan-out nets that can cause congestion
Potential problems with the design's power-on strategy
Retiming restrictions that prevent the Hyper-Retimer from making optimizations
For faster iterations, run the rules on early-compilation snapshots. For fewer iterations, run as many rules as possible on the available snapshots to catch and address a broad range of issues.
About the Design Assistant Design Example
Common user errors can result in on-board failures. This design example demonstrates those mistakes and how to catch them with Design Assistant.
The design contains two modules, each in a different clock domain. The design contains high fan-out signals and some congestion, and it lacks Synopsys Design Constraints (SDCs) and proper clock domain crossing (CDC) logic. Fixing these issues and adding some hyper-pipelining allows the design to meet timing.
Download the design (an919.zip) from www.intel.com.
The design example consists of the following directories:
base – the original design
intermediate – the original design and fixes for SDC, CDC, RDC
final – the intermediate design and additional optimizations
Each directory has the following files:
top.qpf – project file
top.qsf – Intel Quartus Prime settings file
top.sv – top-level Verilog HDL file
top_clk1.sv – submodule on one of the two clock domains
Add synchronization logic for data going between clk1 and clk2 and vice versa by changing top.sv and top_clk2.sv. The following code shows the CDC for going from clk1 to clk2.
// generate a pulse in one clock domain
always@(posedge clk1)
    en_q <= en;
// detect the positive edge of the pulse and change edges
// this edge crosses the clock domain
always@(posedge clk1 or posedge srst_clk1)
    if(srst_clk1)
        en_pulse_det0q <= 1'b0;
    else if(en & ~en_q)
        en_pulse_det0q <= !en_pulse_det0q;
// send en_pulse_det0q into the en input of top_clk2.sv
logic en0q, en1q, en2q;
// synchronize the detected edge into the clk2 domain and use it
// to generate a pulse
always@(posedge clk2) begin
    en0q <= en;
    en1q <= en0q;
    en2q <= en1q;
    en2_pulse <= en2q ^ en1q;
end
// use the pulse to enable data capture
always@(posedge clk2) begin
    if(en2_pulse)
        idat_q <= idat;
end
Add SDCs to constrain the CDC correctly. Rules CDC-50002 and CDC-50003 detect missing SDCs.
Refer to Clock Domain Crossing and Reset Domain Crossing Rule for more information about these and other related rules.
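One way to supply the missing SDCs is sketched below. It assumes that the two clocks enter the design on ports named clk1 and clk2 (matching the RTL) and uses placeholder periods of 4 ns and 5 ns, which are not taken from the example design.

```tcl
# Assumed clock definitions; the periods are placeholders.
create_clock -name clk1 -period 4.000 [get_ports {clk1}]
create_clock -name clk2 -period 5.000 [get_ports {clk2}]

# Declare the two domains asynchronous so that the Timing Analyzer does not
# apply setup and hold requirements to the synchronized crossings between them.
set_clock_groups -asynchronous -group [get_clocks {clk1}] -group [get_clocks {clk2}]
```

Multi-bit crossings additionally need skew and net delay constraints, as rule CDC-50003 describes.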
Remove the inferred latch in the RTL. Rule TMC-20018 catches the latch. Rule CLK-30026 catches that the design is using the en signal as a clock for the inferred latch.
// TMC-20018 inferred latch; CLK-30026 flags the en signal as a clock
always@(*) begin
    if(en)
        inferred_lat <= math1;
end
The fix is:
always@(posedge clk1) begin
    inferred_lat <= math1;
end
In the base design, check for any TMC-20200 and TMC-20201 rule failures, which may be due to incorrect SDCs and improper CDCs. However, the fixes made in the previous steps should resolve them in this example design.
Refer to Intrinsic Margin for more information on these and other related rules.
Fixing Other Faults in the Design Assistant Design Example
After you correctly constrain the design and add CDC logic, you see fewer rule violations, but the design still needs further optimizations.
For this section, refer to the final RTL, which contains many optimizations. Compile the design in the final directory to see the effects of all the changes in this section.
Because of seed variation, results may differ from the screenshots in this document.
Open the design in Quartus® Prime Pro Edition: click File > Open Project and open the top.qpf project file from the intermediate directory.
Under Design Assistant Rule Settings, turn on Enable Design Assistant execution during compilation.
Compile the design.
Analyze the Design Assistant results.
The report shows fewer rule violations. The missing input delay is on an asynchronous pin and is safe.
Figure 6. Design Assistant (Signoff) Results
Observe the failing rules during the fitter plan stage. These failing rules point to high fan-out and congestion issues.
Figure 7. Fitter Plan Stage Failing Rules
Reduce the high fan-out nets in the design (refer to Reducing High Fan-out Nets). Design Assistant can identify high fan-out nets that are candidates for duplication.
Figure 8. TMC-20601 Registers with High Immediate Fan-out
Figure 9. TMC-20602 Registers with High Timing Path Endpoint
Divide the memory into stripes to allow for duplication of control signals.
Figure 10. Memory Control Signals
TMC-20551 identifies some control signals that may need further duplication. In this design, the flagged signal is the write address of a large memory.
Explicitly duplicate registers in the RTL.
Apply the Manual Register Duplication assignments on some of these high fan-out registers.
Duplicate the synchronous reset signal to reduce fan-out.
Observe that rule TMC-20204 reports the design's setup-failing endpoints with retiming restrictions. Fast-forward compilation can provide recommendations.
Figure 12. Fast-forward Compilation
Refer to Hyper-Timing Restrictions for rules that indicate where hyper-retiming is limited. In the design, the timing endpoints are in M20K blocks, which you cannot retime.
The final RTL removes asynchronous resets and creates a synchronous reset tree. It also adds pipeline stages throughout the design, including on the inputs to memories and on the synchronous reset tree.
Remove initial power-up conditions.
The design has a signal that is never reset but controls other logic, which can lead to functional issues. Refer to the Initial Power-Up Conditions Rules for rules that detect initial conditions in the design.
RES-30132 detects registers that may not be reset. Review the results to see whether the design resets each reported signal, and determine whether it should.
Figure 13. RES-30132
Figure 14. Example of Register to Reset
The register can start at any value and should be reset.
Add a Reset Release IP.
Figure 15. Reset Release
The output of the Reset Release IP can gate internal clocks and resets to prevent race conditions. Refer to Reset Release IP for more information. The final RTL includes an instance of the reset release IP.
Run the following rules on the planned snapshot to detect paths with extreme negative slack. The rules catch SDC issues that create excessive timing requirements as early as possible. Look at these rules to save compilation time on seeds that never pass timing.
TMC-20001 – Timing Paths with Hold Slack Exceeding Threshold
TMC-20002 – Timing Paths with Removal Slack Exceeding Threshold
TMC-20004 – Timing Paths with Setup Slack Exceeding Threshold
TMC-20005 – Timing Paths with Recovery Slack Exceeding Threshold
Table 1. High-Priority Timing Closure Rules
Design Assistant runs these timing closure rules on the final netlist, which helps identify incorrect SDC assignments, latches, or combinational loops in the design. These rules are in timing analyzer under
Description | Possible course of action
Incomplete I/O delay assignment | Add the missing options to the delay assignment or modify the clock source
Partial multicycle assignment | Verify that each setup multicycle assignment has a corresponding hold multicycle assignment and vice versa
Invalid reference pin | Modify the -reference_pin option of the delay assignment to be the direct fan-out of the clock that you specify in the same assignment
Inconsistent min/max delay | Modify the delay values to ensure that the min delay does not exceed the max delay
Partial output delay | Verify that output delays have the rise-min, fall-min, rise-max, and fall-max specification
Partial input delay | Verify that input delays have the rise-min, fall-min, rise-max, and fall-max specification
Missing output delay | Verify that every output port has an output delay
Missing input delay | Verify that every input port has an input delay
Checks for inferred latches in synthesis | Remove any unintended inferred latches from the design
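As a hedged illustration of the complete forms that several of these actions ask for, the following SDC sketch gives one input and one output port both -max and -min delays and pairs a setup multicycle with its hold multicycle. The clock name clk1 appears in the design example, but the port names din and dout, the register names src_q and dst_q, and all numeric values are hypothetical.

```tcl
# Hypothetical ports, registers, and values. clk1 is assumed to be a clock
# that is already defined with create_clock.

# Complete I/O delays: with -rise and -fall omitted, -max and -min together
# cover the rise-max, fall-max, rise-min, and fall-min specifications.
set_input_delay  -clock clk1 -max 2.0 [get_ports {din}]
set_input_delay  -clock clk1 -min 0.5 [get_ports {din}]
set_output_delay -clock clk1 -max 1.5 [get_ports {dout}]
set_output_delay -clock clk1 -min 0.2 [get_ports {dout}]

# Paired multicycle: a setup multicycle of 2 with its matching hold multicycle.
set_multicycle_path -setup -from [get_registers {src_q[*]}] -to [get_registers {dst_q[*]}] 2
set_multicycle_path -hold  -from [get_registers {src_q[*]}] -to [get_registers {dst_q[*]}] 1
```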
Clock Domain Crossing and Reset Domain Crossing Rules
Improperly crossing clock domains can result in functional failures that are difficult to trace and debug. Additionally, incorrectly setting constraints when crossing clock domains can result in long run times and impossible timing results.
The following CDC rules detect whether the datapath signals go through correct synchronization logic and are properly constrained.
CDC-50001 – single-bit asynchronous transfer is not synchronized
This rule checks that a single-bit asynchronous transfer has the proper synchronization circuitry.
Figure 19. Unsynchronized 1-bit Asynchronous Transfer
To prevent a CDC-50001 violation, the blue register in the figure must be followed by at least one other register latched by the same clock. In the example, the output of the blue register should feed another blue register to better protect against metastability.
CDC-50002 – single-bit asynchronous transfer is missing timing constraint
To prevent Quartus® Prime Pro Edition from analyzing the paths between clock domains, relax the setup and hold requirements on this path. Use set_false_path, set_clock_groups (asynchronous), or a large set_max_delay and a large negative set_min_delay.
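A minimal SDC sketch of two of these options, applied to the toggle crossing from the design example. The register names en_pulse_det0q and en0q come from the example RTL, but the wildcard paths and the 100 ns and -100 ns values are assumptions. Use one option per path, not both.

```tcl
# Option 1: cut the single-bit crossing entirely.
set_false_path -from [get_registers {*en_pulse_det0q}] -to [get_registers {*en0q}]

# Option 2 (alternative to option 1): keep the path in the analysis but relax
# it with a large max delay and a large negative min delay.
# set_max_delay -from [get_registers {*en_pulse_det0q}] -to [get_registers {*en0q}] 100
# set_min_delay -from [get_registers {*en_pulse_det0q}] -to [get_registers {*en0q}] -100
```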
CDC-50003 – multi-bit data transfer is missing skew and delay constraints
Multi-bit data transfers need skew and delay constraints. The skew constraints ensure that related data arrives together. The recommendation is to set set_max_skew to less than one launch clock period.
To satisfy CDC-50002, the setup and hold requirements on these paths are relaxed. To ensure that data still arrives in a reasonable amount of time, apply a set_net_delay constraint on these paths. Intel recommends that you set set_net_delay to less than one latch clock period.
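A sketch of the skew and net delay constraints for a multi-bit crossing. It assumes a hypothetical source bus src_data_q in the launch domain and a destination bus dst_data_q in the latch domain, with launch and latch clock periods of 4 ns and 5 ns. The 0.8 factor is an arbitrary choice that keeps each value under one period.

```tcl
# Hypothetical register names and clock periods.
# Relax setup and hold first (CDC-50002) with a large max delay and a large
# negative min delay.
set_max_delay -from [get_registers {*src_data_q[*]}] -to [get_registers {*dst_data_q[*]}] 100
set_min_delay -from [get_registers {*src_data_q[*]}] -to [get_registers {*dst_data_q[*]}] -100

# Skew below one launch clock period (0.8 x 4 ns) so related bits arrive together.
set_max_skew -from [get_registers {*src_data_q[*]}] -to [get_registers {*dst_data_q[*]}] 3.2

# Net delay below one latch clock period (0.8 x 5 ns) so data arrives in a
# reasonable amount of time.
set_net_delay -from [get_registers {*src_data_q[*]}] -to [get_registers {*dst_data_q[*]}] -max 4.0
```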
RES-50001 and RES-50002 Reset Synchronization Rules
Deassert asynchronous resets synchronously to prevent metastability. The following reset domain crossing rules help detect improper reset synchronization:
RES-50001 – asynchronous reset is not synchronized
RES-50002 – asynchronous reset is insufficiently synchronized
Example code for a reset synchronizer:
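module safe_reset_sync (input  external_reset,
                        input  clock,
                        output internal_reset); // output port name assumed
    logic q1, q2;
    // assert asynchronously, deassert synchronously to clock
    always@(posedge clock or negedge external_reset)
        if(external_reset == 1'b0) begin
            q1 <= 1'b0;
            q2 <= 1'b0;
        end else begin
            q1 <= 1'b1;
            q2 <= q1;
        end
    // synchronized, active-low reset for downstream registers
    assign internal_reset = q2;
endmodule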
The output of this module can drive the resets of registers. The
design needs constraints to cut the timing path from the asynchronous reset to the
synchronizer reset pins.
RES-50003 – asynchronous reset is missing timing constraint
Similar to rule CDC-50002, the reset synchronizer should have timing constraints that prevent the Timing Analyzer from analyzing these paths. Use set_false_path, set_clock_groups (asynchronous), or a set_max_delay larger than the latch clock period on the transfer from the asynchronous reset source to the registers’ asynchronous reset pins.
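For the safe_reset_sync example, a false path is one way to cut this transfer. The sketch assumes that external_reset is a top-level port and that the synchronizer instance is named u_reset_sync; both names are hypothetical.

```tcl
# Hypothetical hierarchy: instance u_reset_sync of module safe_reset_sync.
# Cut the path from the asynchronous reset source to the synchronizer registers.
set_false_path -from [get_ports {external_reset}] \
               -to   [get_registers {*u_reset_sync|q1 *u_reset_sync|q2}]
```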
RES-50004 – multiple asynchronous resets within reset synchronizer chain
Ensure that all asynchronous resets in a reset synchronizer chain have a common source.
Design Assistant can flag paths with impossible requirements. Any given datapath in the design has an intrinsic margin that does not depend on place and route, but that the design’s RTL and SDCs determine. The clock relationship between the launch and latch clocks, the clock uncertainty, and the uTsu and uTco times determine this intrinsic margin.
The intrinsic margin rules help diagnose why a path may be failing setup timing, for example because of too much clock skew, too much routing delay, or too many logic levels. Not every timing violation violates these rules, but when one does, the rule provides guidance on how to fix it.
Table 2. Intrinsic Margin Rules
Rule | Description | Possible course of action
Setup-Failing Paths with Impossible Requirements | These paths have a tight clock relationship. Large differences in uTsu, uTco, and/or significant clock source uncertainty cause failures before any additional delay is added. | Fix SDC constraints. Ensure that hard blocks are registered.
Setup-Failing Paths with High Clock Skew | These paths have such high clock skew that they fail without any contribution from the datapath. | Apply clock region assignments.
Setup-Failing Paths with High Cell and Local | These paths have such high cell delay that they fail without any contribution from the clock network or interconnect. | Reduce logic levels.
Setup-Failing Paths with High Fabric Interconnect Delay | The path has such high interconnect delay that it fails timing without any contribution from the clock network or datapath logic. | Restructure RTL to ensure
Combinatorial Logic Levels
The number of combinatorial logic levels between registers can increase the path delay and limit fMAX. To reach higher clock speeds, reduce the levels of combinatorial logic.
To reduce the levels of combinatorial logic:
Add pipeline stages. With additional pipelining resources, the retimer can reduce the logic levels by balancing the register chain.
Look for design optimizations that can reduce logic levels, for example precomputing values or computing more in parallel.
TMC-20010 detects logic levels above a given threshold on the worst timing paths.
Reducing High Fan-out Nets
High fan-out nets are difficult to place optimally and can reduce the fMAX of your design. They often lead to congestion and can result in long-path and short-path imbalances that limit retiming in critical chains.
While these signals may not be critical, they can span large distances and warp the optimization of other paths around them.
Ensure that you duplicate high fan-out driver registers:
Use the DUPLICATE_REGISTER and DUPLICATE_HIERARCHY_DEPTH assignments for automated solutions, or
Edit the RTL to create duplicate copies.
If you edit the RTL, apply the preserve_syn_only attribute to the duplicate registers, and assign the duplicates to individual instances in the fan-out hierarchy.
The compiler automatically promotes recognized high fan-out nets to the global clock network. It also makes a higher optimization effort during the place and route stages to duplicate registers.
You can direct Quartus® Prime to automatically create duplicate register trees based on estimated physical proximity or based on hierarchy.
Apply the following QSF assignment to duplicate registers based on estimated physical proximity:
Register name is the last register in the chain that fans out to multiple hierarchies
Level number is the number of registers in the chain to duplicate.
Typically, set up a synchronous reset tree. Resets are often a significant source of high fan-out nets. The registers in the chain must satisfy the following conditions to be included in duplication:
Each register must be fed only by another register.
Combinatorial logic must not feed registers.
Registers must not be part of a synchronizer chain.
Registers must not have any secondary signals.
Registers must not have a preserve attribute or a PRESERVE_REGISTER assignment.
All registers in the chain except the last one must have only one fan-out.
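The two automated duplication assignments named in this section take roughly the following form in the QSF. This is a sketch: the register paths, the duplicate count of 4, and the depth of 3 are hypothetical values.

```tcl
# Hypothetical register names and values.
# Proximity-based duplication: request 4 duplicates of a high fan-out register.
set_instance_assignment -name DUPLICATE_REGISTER -to "u_core|wen_q" 4

# Hierarchy-based duplication: -to names the last register in the chain that
# fans out to multiple hierarchies; 3 is the number of registers in the chain
# to duplicate.
set_instance_assignment -name DUPLICATE_HIERARCHY_DEPTH -to "u_core|srst_tree_q3" 3
```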
The following rules detect an issue with hierarchical tree duplication:
TMC-20500 – Hierarchical Tree Duplication was Shallower Than
TMC-20501 – Hierarchical Tree Duplication was Shallower than
Refer to the Intel Quartus Prime Help for more information about these rules.
If you are familiar with the design, you can make the duplication decisions yourself. Duplicate the register directly in the RTL, and use dont_merge or preserve_syn_only to prevent synthesis from optimizing the duplicates away. For example, duplicate high fan-out broadcast signals by module.
The following are high fan-out rules:
HRR-10115 – Nets with fan out exceeding threshold
HRR-10003 – Registers with high fan-out non-globals
TMC-20051 – High fan-out net drives RAM control signals.
To reduce fan-out on a RAM control signal, divide the memory into
smaller chunks and send each duplicated control signal to its own memory. This
technique works on wide memories.
Figure 20. A 256x2048 Memory with Large Fan-out on a Write
Figure 21. Duplicated Write Enable and Split Memory to Reduce
Table 3. Duplication Rejection Rules
The following rules report if the tool rejects duplication or if the design requires further duplication. The tool can reject duplication, for example because of a placement constraint or for other connectivity reasons. You can identify issues with duplication in the planned netlist.
Rule | Description
TMC-20550 – Duplicate Candidate Rejected for | Registers with a tight placement constraint, such as a Logic Lock region, clock region, or location assignment, cannot be automatically duplicated. Relax the constraint to include the register’s fan-outs, or use the duplication techniques.
TMC-20551 – Automatically Discovered Duplication Candidates Likely Requires More Duplication | The compiler duplicates candidate registers automatically. If the fan-out is still large, apply further duplication techniques to reduce fan-out.
Tension and span rules help identify areas of the design that may be difficult for the Fitter to place, even if they are meeting timing.
These rules identify registers whose sinks pull them in various directions, making it difficult to find an optimal placement. Registers that fail these rules are good candidates for register duplication. These are low-priority rules: your design can still meet timing even if it violates them, but they help identify areas of the design that can contribute to congestion and are good starting points when you look for further optimization.
Figure 23. Tension
Tension is the sum over each sink (either immediate fan-out or timing endpoint, depending on the rule) of the distance of the sink from the centroid of all the sinks.
Figure 24. Span
Span is the maximum one-dimensional delta between the left-bottom-most sink and the right-top-most sink.
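The two definitions can be written out as follows. This is an interpretation rather than the tool's documented formula: each sink is taken to have placement coordinates (x_i, y_i), and the distance metric d is an assumption, since the text does not name one.

```latex
% Sinks s_i = (x_i, y_i), i = 1..n; c is the centroid of all sinks
c = \left( \frac{1}{n}\sum_{i=1}^{n} x_i,\ \frac{1}{n}\sum_{i=1}^{n} y_i \right)

% Tension: sum of sink-to-centroid distances (metric d assumed)
\text{tension} = \sum_{i=1}^{n} d(s_i, c)

% Span: largest one-dimensional extent of the sinks
\text{span} = \max\left( \max_i x_i - \min_i x_i,\ \max_i y_i - \min_i y_i \right)
```

For example, with four sinks at (0,0), (4,0), (0,4), and (4,4), the centroid is (2,2). Assuming Manhattan distance, each sink is 4 units from the centroid, so the tension is 16 and the span is 4.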
The rules are:
TMC-20601 – Registers with High
TMC-20602 – Registers with High Endpoint Tension
TMC-20603 – Registers with High
TMC-20604 – Registers with High Endpoint Span
Duplicating these signals where possible allows
Quartus® Prime to find a better solution.
Figure 25. High Tension Fanout Signal
The Hyper-Retimer balances register chains by retiming ALM registers into Hyper-Registers in the routing fabric.
Design Assistant rules identify retiming restrictions that prevent the Hyper-Retimer from making optimizations in Intel Agilex and Intel Stratix 10 devices, thus limiting performance. Such restrictions include asynchronous resets, high fan-out nets, timing exceptions, preserve register attributes, and initial conditions. The Design Assistant rules are:
TMC-20204 – nodes with retiming restrictions that may restrict retiming
TMC-20205 – nodes with initial conditions that may restrict retiming
HRR-10003 – registers with high fan-out non-globals
Note: TMC-20204 and TMC-20205 only look at the worst setup-failing paths in the design. They do not report on retiming restrictions or initial conditions in off-critical logic.
Fast forward compile identifies the critical chain and points out the next steps to take to remove the retiming restrictions, if possible. For more information about hyper-retiming restrictions, refer to the Intel Hyperflex Architecture High-Performance Design Handbook.
The initial condition of the design at power-up represents the state of the design at clock cycle 0. This state is transitional rather than functional because the design cannot return to this state.
Table 4. Initial Conditions Rules
The following Design Assistant rules identify initial conditions in the design and provide you guidance.
Description | Possible course of action
Catches registers whose initial conditions drop because of IGNORE_REGISTER_POWER_UP_INITIALIZATION ON | Verify that the design is still functionally correct without these initial conditions.
Registers might not be properly reset | Check to see if you need these registers.
Catches memories that may have spurious writes because of initial conditions | Check to see that these registers are reset.
Power-up don’t care synthesis setting might | Remove power-up don’t care setting
Setup-failing path endpoints with explicit power-up states that might restrict retiming | Use QSFs to ignore initial conditions. Remove initial conditions and use reset.
In the Design Assistant example design, count_q is not reset, which triggers rule RES-30132. Do not assume that Quartus® Prime sets count_q to 0 at power-up.
always@(posedge clk) begin // clock name assumed
    count_q <= count_q + 1'd1;
    if(count_q == 4'hf)
        wen_q <= 1'b1;
    else
        wen_q <= 1'b0;
end
Applying a reset to count_q so that it is set to 0 at reset fixes the violation.
Intel Agilex and Intel Stratix 10 devices do not power up uniformly across all sectors. They power up sector by sector. Some parts of the device may have their initial conditions before other parts of the device, which can lead to race conditions or spurious writes to memory during power-up.
Intel recommends that you do not use initial power-up conditions in your RTL. Always reset the design into a known state. The following code specifies an initial condition in the RTL:
reg q = 1'b1; // q has a default value of '1'
always@(posedge clk) begin
    q <= d;
end
Registers without resets may also have initial conditions. By default, Quartus® Prime determines their power-up values automatically. You can turn this setting off and force Quartus® Prime to set all uninitialized registers to 0 at power-up by using the following global assignment:
set_global_assignment -name ALLOW_POWER_UP_DONT_CARE OFF
Generally, Intel does not recommend this setting because it can hinder retiming.
Instead of using a global assignment, target areas of the design to drop automatically generated initial conditions with:
set_instance_assignment -name IGNORE_REGISTER_POWER_UP_INITIALIZATION ON -to <instance name>
Reset Release Intel FPGA IP
The Reset Release Intel FPGA IP generates a signal that indicates when device configuration has finished. It is then safe to release reset throughout the device.
Designs can use this IP to gate clocks and write enables or to synchronize resets. The Reset Release IP solves race conditions and spurious writes during power-up. Intel Stratix 10 and Intel Agilex devices require one instance of the Reset Release IP in the design. HRR-10204 runs during synthesis and checks that there is exactly one instance of the Reset Release IP in the design.