In a typical design flow, the early stages of development concentrate
on meeting timing, area and power goals. Once the design meets those goals, the efforts
focus on improving performance. This chapter introduces techniques
and tools in the
Quartus® Prime software that you can use
to achieve the highest design performance.
Optimization of an FPGA design requires a multi-dimensional approach that meets the design goals
while reducing area, critical path delay, power consumption, and runtime. The
Quartus® Prime software includes advisors to address each of
these issues. By implementing the advisor's suggestions, you can reduce the time spent
on design iterations.
Intel® FPGAs have a unique timing model that contains delay
information for all physical elements in the device, such as combinational adaptive logic
modules, memory blocks, interconnects, and registers.
The delays encompass all
valid combinations of operating conditions for the target FPGA. Additionally, the device size
and package determine pin-out and the resource availability.
Optimization results can vary significantly depending on the assignments and settings that you choose. In the
Quartus® Prime software, the default values for settings
and options provide the best trade-off between compilation time, resource utilization, and
timing performance.
Before compiling a design in the
Quartus® Prime software, consider the following guidelines.
Guidelines for I/O Assignments
In an FPGA design, I/O standards and
drive strengths affect I/O timing.
When specifying I/O assignments, make sure that the
Quartus® Prime software is using an accurate I/O timing delay for timing
analysis and Fitter optimizations.
If the PCB layout does not indicate pin locations, then leave the pin locations
unconstrained. This technique allows the Compiler to search for the best layout.
Otherwise, make pin assignments to constrain the compilation appropriately.
For best results, use the design's real timing requirements. Applying more demanding timing
requirements than the design needs can cause the Compiler to trade off by increasing
resource usage, power utilization, or compilation time.
Comprehensive timing requirement settings
achieve the best results for the following reasons:
Correct timing assignments enable the software to work
hardest to optimize the performance of the timing-critical parts of the design
and make trade-offs for performance. This optimization can also save area or
power utilization in non-critical parts of the design.
If enabled, the
Quartus® Prime software performs physical synthesis optimizations based
on timing requirements.
The Quartus® Prime Timing Analyzer determines if the design implementation meets
the timing requirement. The Compilation Report shows whether the design meets the
timing requirements, while the timing analysis reporting commands provide detailed
information about the timing paths.
Many optimization goals can conflict
with one another, so you might need to resolve conflicting goals.
Table 1. Examples of Trade-offs in Design Optimization
Resource usage and critical path timing.
Certain techniques (such as logic duplication) can improve timing
performance at the cost of increased area.
Power requirements can result in area and timing trade-offs.
For example, reducing the number of available high-speed tiles,
or attempting to shorten high-power nets at the expense of critical paths.
System cost and time-to-market considerations can affect the
choice of device.
For example, a device with a higher speed grade or more clock
networks can facilitate timing closure at the expense of higher
power consumption and system cost.
Finally, constraints that are too severe can limit design
feasibility to the point that no solution is possible for the selected device. If the Fitter
cannot resolve a design due to resource limitations, timing constraints, or power
constraints, consider rewriting parts of the HDL code.
By default, the
Quartus® Prime Fitter
might physically spread a design over the entire device to meet the set timing
constraints. If you prefer to optimize your design to use the smallest area,
you can change this behavior. If you require reduced area, you can enable
certain physical synthesis options to modify your netlist to create a more
area-efficient implementation, but at the cost of increased runtime and
decreased performance.
To meet complex timing requirements involving
multiple clocks, routing resources, and area constraints, the
Quartus® Prime software offers a close interaction between synthesis, floorplan
editing, place-and-route, and timing analysis processes.
By default, the
Quartus® Prime Fitter works to meet the timing requirements, and
stops when the requirements are met. Therefore, realistic constraints are
crucial for timing closure.
Under-constrained designs can lead to
sub-optimal results. For over-constrained designs, the Fitter might
over-optimize non-critical paths at the expense of true critical paths. In
addition, area and compilation time may also increase.
For designs with high resource usage, the
Quartus® Prime Fitter might have trouble finding a
legal placement. In such circumstances, the Fitter automatically modifies
settings to try to trade off performance for area.
The Quartus® Prime Fitter
offers advanced options that can help improve the design performance when
you properly set constraints. Use the Timing Optimization Advisor to
determine which options are best suited for the design.
In high-density FPGAs, routing accounts for a major part
of critical path timing. Because of this, duplicating or retiming logic can
allow the Fitter to reduce delay on critical paths. The
Quartus® Prime software offers push-button netlist
optimizations and physical synthesis options that can improve design
performance at the expense of considerable increases of compilation time and
area. Turn on only those options that help you keep reasonable compilation
times and resource usage. Alternately, you can modify the HDL to manually
duplicate or adjust the timing logic.
Many Fitter settings influence compilation time. Most of the default settings in the
Quartus® Prime software are set for reduced compilation time. You can modify these settings based on your project requirements.
The Quartus® Prime software
supports parallel compilation in computers with multiple processors. This can reduce
compilation times by up to 15%.
The Quartus® Prime software includes several advisors to help you optimize your
design and reduce compilation time.
The advisors provide recommendations based on the project settings and design constraints. Those recommendations can help you to fit the project, meet timing or power requirements, or improve the design performance.
The advisors organize the recommendations from general to specific. Where applicable, the categories are divided into stages ordered by complexity.
The Design Space Explorer II tool (DSE
II) provides an easy and efficient way for you to run experiments on your design
settings. You can run a single compilation locally on your PC or remotely using a compute
farm. The Design Space Explorer II tool
(Tools > Launch Design Space Explorer II) allows you to find optimal project settings for resource,
performance, or power optimization goals. DSE II processes
a design using combinations of settings
and constraints, and reports the best settings for the design. You can take advantage of
the DSE II parallelization abilities to compile on multiple computers.
If a design is
close to meeting timing or area requirements, you can try different seeds with
the DSE II, and find one seed that meets timing or area requirements.
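The seed sweep described above can be sketched in Python. This is only an illustration and does not show how DSE II is implemented or invoked: the slack values in ILLUSTRATIVE_SLACK_NS are made-up numbers standing in for real compilation results, and the function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Made-up worst-case slack (ns) per seed. These are NOT real compilation results.
ILLUSTRATIVE_SLACK_NS: Dict[int, float] = {
    1: -0.120, 2: -0.045, 3: 0.010, 4: -0.080, 5: 0.032,
}

def passing_seeds(results: Dict[int, float]) -> List[int]:
    """Return the seeds whose worst-case slack is non-negative (timing met)."""
    return sorted(seed for seed, slack in results.items() if slack >= 0.0)

def best_seed(results: Dict[int, float]) -> Optional[Tuple[int, float]]:
    """Return (seed, slack) for the seed with the largest slack, or None if empty."""
    if not results:
        return None
    seed = max(results, key=results.get)
    return seed, results[seed]

print(passing_seeds(ILLUSTRATIVE_SLACK_NS))  # [3, 5]
print(best_seed(ILLUSTRATIVE_SLACK_NS))      # (5, 0.032)
```

In this made-up data set, three of the five seeds miss timing by a small margin, which is the situation in which trying additional seeds is worthwhile.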
Figure 1. Design Space Explorer II
You can run DSE II at any step in the design process. However,
because large changes in a design can neutralize gains achieved from optimizing settings,
Intel recommends that
you run DSE II late in the design cycle.
In DSE II, an exploration point is a collection of
Analysis & Synthesis, Fitter, and placement settings, and a group of exploration points is
a design exploration. A design exploration can also include different
seeds. DSE II
compiles the design using the settings corresponding to each exploration point.
When the compilation finishes, DSE II evaluates the performance data against
an optimization goal that you specify. You can direct DSE II to optimize for
timing, area, or power.
You can configure DSE II to take advantage of
your computing resources to run the design explorations.
In the DSE II GUI, the
Setup page contains the job launch options, and the
Status page allows you to monitor and control
the jobs. DSE II supports running compilations on your
local computer or a remote host through LSF, SSH, or Torque. For SSH, you
can also define a comma-separated list of remote hosts.
If you have a laptop
or standard computer, you can use the single compilation feature to compile your design on a
workstation with higher computing performance and memory capacity.
When running on a compute farm, you can direct DSE II to
exit after submitting all the jobs while the compilations continue to run
until completion. Optionally, you can receive an e-mail when the compilations are
complete.
If you launch jobs using SSH,
the remote host must enable public and private key authentication. If the private key is
encrypted with a passphrase, the remote host must run the ssh key agent
to decrypt the private
key, so the
quartus_dse executable can access the key.
Note: Windows remote hosts require Cygwin's sshd
server and PuTTY.
DSE II provides a collection of predefined exploration spaces that focus on
what you want to optimize. Additionally, you can define a set of compilation seeds. The number
of exploration points is the number of seeds multiplied by the number of exploration
modes.
Note: The availability of predefined spaces
depends on the device family that the design targets.
In the DSE GUI, you specify these settings in the Exploration page.
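The multiplication described above can be illustrated with a short Python sketch. The setting-group names below are hypothetical placeholders, not actual DSE II exploration spaces, and the sketch assumes every seed is paired with every group of settings.

```python
from itertools import product
from typing import List, Tuple

def exploration_points(setting_groups: List[str], seeds: List[int]) -> List[Tuple[str, int]]:
    """Pair every group of settings with every seed (Cartesian product)."""
    return list(product(setting_groups, seeds))

groups = ["settings_a", "settings_b", "settings_c"]  # hypothetical names
seeds = [1, 2, 3, 4]

points = exploration_points(groups, seeds)
print(len(points))  # 12
print(points[0])    # ('settings_a', 1)
```

Because the count grows multiplicatively, adding one more seed to this example adds three more compilations, one per group of settings.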
DSE II compares the compilation results to determine the best
Quartus® Prime software settings for the design. The
Report page displays a summary of results.
In an exploration, DSE II selects the best worst-case slack value from
among all timing corners across all exploration points. If you want to optimize for
worst-case setup slack or hold slack, specify timing constraints in the
Quartus® Prime software.
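One reading of this selection rule is sketched below in Python: take the worst slack across timing corners for each exploration point, then pick the point whose worst slack is largest. This is an interpretation of the sentence above, not the DSE II implementation, and the point names, corner names, and slack values are made up for illustration.

```python
from typing import Dict, Tuple

def worst_case_slack(corner_slacks: Dict[str, float]) -> float:
    """Worst (smallest) slack across all timing corners of one exploration point."""
    return min(corner_slacks.values())

def best_exploration_point(points: Dict[str, Dict[str, float]]) -> Tuple[str, float]:
    """Return the point whose worst-case slack is the largest, with that slack."""
    name = max(points, key=lambda p: worst_case_slack(points[p]))
    return name, worst_case_slack(points[name])

# Made-up slack values in ns; corner and point names are hypothetical.
example = {
    "point_1": {"slow_corner": -0.050, "fast_corner": 0.210},
    "point_2": {"slow_corner": 0.015, "fast_corner": 0.180},
    "point_3": {"slow_corner": 0.040, "fast_corner": -0.020},
}
print(best_exploration_point(example))  # ('point_2', 0.015)
```

Note that point_3 has the best slow-corner slack but still loses, because its fast corner fails.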
By default, DSE II saves all the compilation data. You can save disk space
by limiting the type of files that you want to save after a compilation finishes. These
settings are in the Exploration page, Results section.
DSE II has reporting tools that help you quickly determine important design
metrics, such as worst-case slack, across all exploration points.
DSE II provides a performance data report for all points it explores and
saves the information in a project-name.dse.rpt file in the project directory. DSE II
archives the settings of the exploration points in
Quartus® Prime Archive Files (.qar).
Performing a Design Exploration with the DSE II Utility
Note: Before running DSE II, specify the timing
constraints for the design.
This description covers the type of settings that you need to define when you want to
run a design exploration. For details about all the options available in the GUI,
refer to the
Quartus® Prime Help.
To perform a design exploration with the DSE II tool:
Start the DSE II tool.
If you have an open project in the
Quartus® Prime software and
launch DSE II, a dialog box appears asking if you want to close the
Quartus® Prime software. Click Yes.
In the Project page, specify the project and revision
that you want to explore.
In the Setup page, specify whether you want to perform a local or a remote exploration, and set up the job launch.
In the Exploration page, specify optimization settings
The following revision history applies to this
Quartus® Prime Version
General topic reorganization.
how DSE II works, and the main steps to follow when
performing a design exploration.
Added mention to the Design Partition
Planner in Design Analysis topic.
Implemented Intel rebranding.
Removed statements about serial equivalence when
using multiple processors.
Changed instances of Quartus
II to Quartus Prime.
Updated location of Fitter Settings,
Analysis & Synthesis Settings, and Physical Synthesis
Optimizations to Compiler Settings.
Updated DSE II content.
Minor changes for
Added the information about
initial compilation requirements. This section was moved from the
Area Optimization chapter of the
Quartus® Prime Handbook. Minor updates to delineate division
of Timing and Area optimization chapters.
Removed survey link.
Changed to new document template.
No change to content.
Initial release. Chapter based on
topics and text in Section III of volume 2.
This chapter describes how you can use the
Quartus® Prime Netlist Viewers to analyze and debug your designs.
As FPGA designs grow in size and complexity, the ability to analyze, debug, optimize, and constrain your design is
critical. With today’s advanced designs, several design engineers are involved in coding and synthesizing different design blocks, making it difficult to
analyze and debug the design. The
Quartus® Prime RTL Viewer and Technology Map Viewer provide powerful ways to view your
initial and fully mapped synthesis results during the debugging, optimization, and constraint entry processes.
When to Use the Netlist Viewers: Analyzing Design Problems
You can use the Netlist Viewers to analyze and debug your design. The following simple examples show how to use the
RTL Viewer and Technology Map Viewer to analyze problems encountered in the design process.
Using the RTL Viewer is a good way to view your initial synthesis results to determine whether you have created the
necessary logic, and that the logic and connections have been interpreted correctly by the software. You can use the RTL Viewer to check your design
visually before simulation or other verification processes. Catching design errors at this early stage of the design process can save you valuable time.
If you see unexpected behavior during verification, use the RTL Viewer to trace through the netlist and ensure that
the connections and logic in your design are as expected. Viewing your design helps you find and analyze the source of design problems. If your design
looks correct in the RTL Viewer, you know to focus your analysis on later stages of the design process and investigate potential timing violations or
issues in the verification flow itself.
You can use the Technology Map Viewer to look at the results at the end of Analysis and Synthesis. If you have compiled your design
through the Fitter stage, you can view your post‑mapping netlist in the Technology Map Viewer (Post-Mapping) and your post‑fitting netlist in the
Technology Map Viewer. If you perform only Analysis and Synthesis, both the Netlist Viewers display the same post‑mapping netlist.
In addition, you can use the RTL Viewer or Technology Map Viewer to locate the source of a particular signal, which can help you debug
your design. Use the navigation techniques described in this chapter to search easily through your design. You can trace back from a point of interest
to find the source of the signal and ensure the connections are as expected.
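Tracing back from a point of interest amounts to walking the netlist graph against the signal direction. The Python sketch below illustrates the idea on a toy netlist. The dictionary format (node name mapped to its driver names) and the node names are hypothetical, and the Netlist Viewers do not expose such an API.

```python
from typing import Dict, List, Set

def trace_sources(fanin: Dict[str, List[str]], start: str) -> Set[str]:
    """Walk backward from `start` and return the nodes that have no drivers."""
    sources: Set[str] = set()
    seen: Set[str] = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in seen:          # guard against feedback loops
            continue
        seen.add(node)
        drivers = fanin.get(node, [])
        if drivers:
            stack.extend(drivers)
        else:
            sources.add(node)     # nothing drives this node: it is a source
    return sources

# Toy netlist: q is driven by and1, which is driven by input a and register reg_b.
toy_fanin = {"q": ["and1"], "and1": ["a", "reg_b"], "reg_b": ["b"]}
print(sorted(trace_sources(toy_fanin, "q")))  # ['a', 'b']
```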
The Technology Map Viewer can help you locate post‑synthesis nodes in your netlist and make assignments when optimizing your design.
This functionality is useful when making a multicycle clock timing assignment between two registers in your design. Start at an I/O port and trace
forward or backward through the design and through levels of hierarchy to find nodes of interest, or locate a specific register by visually inspecting the schematic.
Throughout your FPGA design, debug, and optimization stages, you can use all of the netlist viewers in many ways to increase your
productivity while analyzing a design.
Intel Quartus Prime Design Flow with the Netlist Viewers
When you first open one of the Netlist Viewers after compiling
the design, a preprocessor stage runs automatically before the Netlist Viewer opens.
Click the link in the preprocessor process box to go to the Settings > Compilation Process Settings page where you can turn on the Run Netlist Viewers
preprocessing during compilation option. If you turn this option on, the
preprocessing becomes part of the full project compilation flow and the Netlist Viewer opens
immediately without displaying the preprocessing dialog box.
Quartus® Prime Design Flow Including the RTL Viewer and Technology Map Viewer
This figure shows how Netlist Viewers fit into the basic
Quartus® Prime design flow.
Before the Netlist Viewer can run the preprocessor stage, you must compile your design:
To open the RTL Viewer, first perform Analysis and Elaboration.
To open the Technology Map Viewer (Post-Fitting) or the Technology Map Viewer (Post‑Mapping),
first perform Analysis and Synthesis.
The Netlist Viewers display the results of the last successful compilation.
Therefore, if you make a design change that causes an error during Analysis and Elaboration, you cannot view the netlist for the new design files,
but you can still see the results from the last successfully compiled version of the design files.
If you receive an error during compilation and you have not yet successfully run the appropriate compilation stage for your project, the Netlist
Viewer cannot be displayed; in this case, the
Quartus® Prime software issues an error message when you try to open the Netlist Viewer.
Note: If the Netlist Viewer is open when you start a new compilation, the Netlist Viewer closes
automatically. You must open the Netlist Viewer again to view the new design netlist after compilation completes successfully.
RTL Viewer Overview
The RTL Viewer allows you to view a register transfer level (RTL) graphical representation of
Quartus® Prime Pro Edition synthesis results or third-party netlist files in the
Quartus® Prime software.
You can view results after Analysis and Elaboration for designs that use any supported
Quartus® Prime design entry method, including Verilog HDL Design Files (.v), SystemVerilog Design Files (.sv), VHDL Design Files (.vhd), AHDL Text Design Files (.tdf), or schematic Block Design Files (.bdf).
You can also view the hierarchy of atom primitives (such as device logic cells and I/O ports) for designs that generate Verilog Quartus Mapping File (.vqm) or Electronic Design Interchange Format (.edf) files through a synthesis tool.
The RTL Viewer displays a schematic view of the design netlist after Analysis and Elaboration or after the
Quartus® Prime software performs netlist extraction, but before technology mapping and synthesis or Fitter optimizations. This view shows a preliminary pre-optimization design structure and closely represents the original source design.
For designs synthesized with
Quartus® Prime Pro Edition synthesis, this view shows how the
Quartus® Prime software interprets the design files.
For designs synthesized with a third-party synthesis tool, this view shows the netlist that the synthesis tool generates.
To run the RTL Viewer for a
Quartus® Prime project,
first analyze the design to generate an RTL netlist. To analyze the design and
generate an RTL netlist, click Processing > Start > Start Analysis & Elaboration. You can also perform a full compilation or any process that includes
the initial Analysis and Elaboration stage of the
Quartus® Prime compilation flow.
To open the RTL Viewer, click Tools > Netlist Viewers > RTL Viewer.
While displaying a design, the RTL Viewer optimizes the netlist to maximize readability:
Removes logic with no fan-out (unconnected output) or fan-in (unconnected inputs) from the display.
Hides default connections such as VCC and GND.
Groups pins, nets, wires, module ports, and certain logic into buses where appropriate.
Groups constant bus connections.
Displays values in hexadecimal format.
Converts NOT gates into bubble inversion symbols in the schematic.
Merges chains of equivalent combinational gates into a single gate; for example, a 2-input AND gate feeding a 2-input AND gate is converted to a single 3-input AND gate.
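The last display optimization in the list above, merging a chain of equivalent gates, can be sketched in Python. This is a simplified illustration on a tree-shaped expression and is not the RTL Viewer's algorithm: the Gate class is hypothetical, and the sketch assumes each inner gate feeds only one other gate and that only AND, OR, and XOR gates are merged.

```python
from dataclasses import dataclass
from typing import List, Union

MERGEABLE = {"AND", "OR", "XOR"}  # associative gate types assumed safe to merge

@dataclass
class Gate:
    kind: str
    inputs: List[Union["Gate", str]]  # nested gates or signal names

def merge_chain(gate: Gate) -> Gate:
    """Flatten nested gates of the same kind into a single wider gate."""
    merged: List[Union[Gate, str]] = []
    for inp in gate.inputs:
        if isinstance(inp, Gate):
            inp = merge_chain(inp)  # simplify the inner gate first
            if inp.kind == gate.kind and gate.kind in MERGEABLE:
                merged.extend(inp.inputs)  # absorb the inner gate's inputs
                continue
        merged.append(inp)
    return Gate(gate.kind, merged)

# A 2-input AND feeding a 2-input AND becomes one 3-input AND.
chain = Gate("AND", [Gate("AND", ["a", "b"]), "c"])
print(merge_chain(chain))  # Gate(kind='AND', inputs=['a', 'b', 'c'])
```

Gates of different kinds are left alone, so an OR gate feeding an AND gate stays a two-level structure.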
Running the RTL Viewer
To run the RTL Viewer for a
Quartus® Prime project:
Analyze the design to generate an RTL netlist by clicking Processing > Start > Start Analysis & Elaboration.
You can also perform a full compilation or any process that includes the initial Analysis and Elaboration stage of the
Quartus® Prime compilation flow.
Open the RTL Viewer by clicking Tools > Netlist Viewers > RTL Viewer.
Technology Map Viewer Overview
The Quartus® Prime Technology Map Viewer provides a technology‑specific, graphical representation of FPGA designs after Analysis and Synthesis or after the Fitter maps the design into the target device.
The Technology Map Viewer shows the hierarchy of atom primitives (such as device logic cells and I/O ports) in the design. For supported device families, you can also view internal registers and look-up tables (LUTs) inside logic cells (LCELLs), and registers in I/O atom primitives.
Where possible, the
Quartus® Prime software maintains the port names of each hierarchy throughout synthesis. However, the software may change or remove port names from the design. For example, the software removes ports that are unconnected or driven by GND or VCC during synthesis. If a port name changes, the software assigns a related user logic name in the design or a generic port name such as IN1 or OUT1.
You can view
Quartus® Prime technology-mapped results after synthesis, fitting, or timing analysis. To run the Technology Map Viewer for a
Quartus® Prime project, on the Processing menu, point to Start and click Start Analysis & Synthesis to synthesize and map the design to the target technology. At this stage, the Technology Map Viewer shows the same post-mapping netlist as the Technology Map Viewer (Post‑Mapping). You can also perform a full compilation, or any process that includes the synthesis stage in the compilation flow.
For designs that completed the Fitter stage, the Technology Map Viewer shows how the Fitter changed the netlist through physical synthesis optimizations, while the Technology Map Viewer (Post‑Mapping) shows the post-mapping netlist. If you have completed the Timing Analysis stage, you can locate timing paths from the Timing Analyzer report in the Technology Map Viewer.
To open the Technology Map Viewer, click Tools > Netlist Viewers > Technology Map Viewer (Post-Fitting) or Technology Map Viewer (Post-Mapping).
The Netlist Viewer is a graphical user interface for viewing and
manipulating nodes and nets in the netlist.
The RTL Viewer and Technology Map Viewer each consist of these main parts:
The Netlist Navigator pane—displays a representation of
the project hierarchy.
The Find pane—allows you to find and locate specific
design elements in the schematic view.
The Properties pane displays the properties of the
selected block when you select Properties
from the shortcut menu.
The schematic view—displays
a graphical representation of the internal structure of the design.
Figure 3. RTL Viewer
Netlist Viewers also contain a toolbar that provides tools to use in the schematic view.
Use the Back and Forward buttons to switch between schematic views. You can go
forward only if you have not made any changes to the view since going back. These
commands do not undo an action, such as selecting a node. The Netlist Viewer caches
up to ten actions including filtering, hierarchy navigation, netlist navigation, and
zoom actions.
Click the Refresh button to restore the schematic view and
optimize the layout. Refresh does not reload
the database if you change the design and recompile.
Click the Find button to open or close the Find pane.
Click the Selection Tool and Zoom
Tool buttons to alternate between the selection mode and zoom
mode.
Click the Fit in Page button to reset the schematic view to
encompass the entire design.
Use the Hand Tool to change
the focus of the viewer without changing the perspective.
Click the Area Selection Tool
to drag a selection box around ports, pins, and nodes in an area.
Click the Netlist Navigator button to open or close the
Netlist Navigator pane.
Click the Color Settings button to open the Colors pane where you can customize the Netlist
Viewer color scheme.
Click the Display Settings
button to open the Display pane where you can
specify the following settings:
Show full name or
Show only <n> characters. You can specify this
separately for Node name, Port name, Pin
name, or Bus
Turn Show timing info
on or off.
Turn Show node type
on or off.
Turn Show constant
value on or off.
Turn Show flat nets
on or off.
Figure 4. Display Settings
The Bird's Eye View button opens the Bird's Eye View window which displays a miniature
version of the design and allows you to navigate within the design and adjust the
magnification in the schematic view quickly.
The Show/Hide Instance Pins button can alternate the
display of instance pins not displayed by functions such as cross-probing between a
Netlist Viewer and Timing Analyzer. You can also use this button to hide unconnected
instance pins when filtering a node results in large numbers of unconnected or
unused pins. The Netlist Viewer hides Instance pins by default.
If the Netlist Viewer
display encompasses several pages, the Show Netlist on
One Page button resizes the netlist view to a single page. This
action can make netlist tracing easier.
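The Back and Forward behavior described in the toolbar list above (a bounded cache of views, with Forward available only until the view changes again) resembles a browser-style history. The Python sketch below illustrates that behavior. The ViewHistory class is hypothetical and the limit of ten is taken from the description above; the sketch does not represent the Netlist Viewer's actual implementation.

```python
from typing import List, Optional

class ViewHistory:
    """Browser-style back/forward history with a bounded number of cached views."""

    def __init__(self, limit: int = 10) -> None:
        self._back: List[str] = []      # views reachable with Back, oldest first
        self._forward: List[str] = []   # views reachable with Forward
        self._current: Optional[str] = None
        self._limit = limit

    def visit(self, view: str) -> None:
        """Show a new view; this discards any Forward history."""
        if self._current is not None:
            self._back.append(self._current)
            if len(self._back) > self._limit:
                self._back.pop(0)       # drop the oldest cached view
        self._current = view
        self._forward.clear()

    def back(self) -> Optional[str]:
        if self._back:
            self._forward.append(self._current)
            self._current = self._back.pop()
        return self._current

    def forward(self) -> Optional[str]:
        if self._forward:
            self._back.append(self._current)
            self._current = self._forward.pop()
        return self._current

history = ViewHistory()
for view in ("top", "alu", "adder"):
    history.visit(view)
print(history.back())      # alu
print(history.forward())   # adder
history.back()
history.visit("mux")       # a change after going back clears Forward
print(history.forward())   # mux
```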
You can have only one RTL Viewer, one Technology Map
Viewer (Post-Fitting), and one Technology Map Viewer (Post-Mapping) window open at the
same time, although each window can show multiple pages, each with multiple tabs. For
example, you cannot have two RTL Viewer windows open at the same time.
The Netlist Navigator pane displays the entire netlist in a tree format based on the hierarchical levels of the design. Each level groups similar elements into subcategories.
The Netlist Navigator pane allows you to traverse through the design hierarchy to view the logic schematic for each level. You can also select an element in the Netlist Navigator to highlight in the schematic view.
Note: The Netlist Navigator pane does not list nodes inside atom primitives.
For each module in the design hierarchy, the Netlist Navigator pane displays the applicable elements listed in the
following table. Click the “+” icon to expand an element.
Table 3. Netlist Navigator Pane Elements
Modules or instances in the design that can be expanded to lower hierarchy levels.
Low-level nodes that cannot be expanded to any lower hierarchy
level. These primitives include:
Registers and gates
that you can view in the RTL Viewer when using
Quartus® Prime Pro Edition synthesis.
Logic cell atoms in
the Technology Map Viewer or in the RTL Viewer when using a VQM or EDIF from
third-party synthesis software
In the Technology Map Viewer, you can view the internal
implementation of certain atom primitives, but you cannot traverse into a
lower-level of hierarchy.
The I/O ports in the current level of hierarchy.
Pins are device I/O
pins when viewing the top hierarchy level and are I/O ports of the design when
viewing the lower-levels.
When a pin represents
a bus or an array of pins, expand the pin entry in the list view to see
individual pin names.
You can view the properties of an instance or primitive with the Properties pane.
Figure 5. Properties Pane
To view the properties of an instance or primitive in the RTL Viewer or
Technology Map Viewer, right-click the node and click Properties.
The Properties pane contains tabs
with the following information about the selected node:
The Fan-in tab displays the Input port
and Fan-in Node.
The Fan-out tab displays the Output port
and Fan-out Node.
The Parameters tab displays the Parameter
Name and Values of an instance.
The Ports tab displays the Port Name and Constant value (for example, VCC or GND). The following table lists the possible values of a port:
Table 4. Possible Port Values
The port is not connected and has VCC value (tied to VCC)
The port is not connected and has GND value (tied to GND)
The port is connected and has value (other than VCC or GND)
The port is not connected and has no value (hanging)
If the selected node is an atom primitive, the Properties pane displays a schematic of the internal logic.
Netlist Viewers Find Pane
You can narrow the range of the search process by setting the
following options in the Find pane:
Click Browse in the
Find pane to specify the hierarchy level of
the search. In the
Select Hierarchy Level dialog box, select the
particular instance you want to search.
Turn on the
Include subentities option to include child
hierarchies of the parent instance during the search.
Click Options to open the
Find Options dialog box. Turn on Instances, Nodes,
Ports, or any combination of the three to
further refine the parameters of the search.
When you click the
List button, a progress bar appears below the Find box.
All results that match the criteria you set are listed in a table.
When you double‑click an item in the table, the related node is highlighted in
red in the schematic view.
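The search options above (a starting hierarchy level, optional descent into child hierarchies, and element-type filters) can be sketched in Python as a filtered tree walk. The Element class, the kind labels, the "|" path separator, and the use of shell-style wildcards are all assumptions made for this illustration and do not describe the Find pane's internals.

```python
import fnmatch
from dataclasses import dataclass, field
from typing import List, Set

@dataclass
class Element:
    name: str
    kind: str  # hypothetical labels: "instance", "node", or "port"
    children: List["Element"] = field(default_factory=list)

def find_elements(start: Element, pattern: str, kinds: Set[str],
                  include_subentities: bool = True) -> List[str]:
    """Return hierarchical paths of elements under `start` that match the filters."""
    matches: List[str] = []

    def walk(parent: Element, path: str) -> None:
        for child in parent.children:
            child_path = path + "|" + child.name
            if child.kind in kinds and fnmatch.fnmatchcase(child.name, pattern):
                matches.append(child_path)
            if include_subentities:   # descend into child hierarchies
                walk(child, child_path)

    walk(start, start.name)
    return matches

top = Element("top", "instance", [
    Element("clk", "port"),
    Element("u1", "instance", [Element("clk", "port"), Element("reg_a", "node")]),
])
print(find_elements(top, "clk", {"port"}))                             # ['top|clk', 'top|u1|clk']
print(find_elements(top, "clk", {"port"}, include_subentities=False))  # ['top|clk']
```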
The schematic view is shown on the right side of the RTL Viewer
and Technology Map Viewer. The schematic view contains a schematic representing
the design logic in the netlist. This view is the main screen for viewing your
gate‑level netlist in the RTL Viewer and your technology‑mapped netlist in the
Technology Map Viewer.
The RTL Viewer and Technology Map Viewer attempt to display the schematic in
a single page view by default. If the schematic crosses over to several pages,
you can highlight a net and use connectors to trace the signal in a single
page.
Display Schematics in Multiple Tabbed View
The RTL Viewer and Technology Map Viewer support multiple tabbed
views.
With multiple tabbed view, schematics can be displayed in different
tabs. Selection is independent between tabbed views, but selection in the tab
in focus is synchronous with the Netlist Navigator pane.
To create a new blank tab, click the
New Tab button at the end of the tab row. You
can now drag a node from the
Netlist Navigator pane into the schematic view.
Right-click in a tab to see a shortcut menu to perform the following actions:
Create a blank view with New Tab
Create a Duplicate Tab of the tab in focus
Choose to Tile Tabs
Close Tab to close the tab in focus
Close Other Tabs to close all tabs except the
tab in focus
The symbols for nodes in the schematic represent elements of your
design netlist. These elements include input and output ports, registers, logic gates,
Intel primitives, high-level operators, and hierarchical instances.
Note: The logic gates and operator
primitives appear only in the RTL Viewer. Logic in the Technology Map Viewer is represented
by atom primitives, such as registers and LCELLs.
Table 5. Symbols in the Schematic View
This table lists and describes the primitives and basic symbols that
you can display in the schematic view of the RTL Viewer and Technology Map Viewer.
An input, output, or bidirectional port in
the current level of hierarchy. A device input, output, or bidirectional pin when
viewing the top‑level hierarchy. The symbol can also represent a bus. Only one
wire is shown connected to the bidirectional symbol, representing the input and
output paths.
Input symbols appear on the left-most side of the schematic.
Output and bidirectional symbols appear on the right‑most side of the
schematic.
An input or output connector, representing a net that comes
from another page of the same hierarchy. To go to the page that contains the source
or the destination, double-click the connector to jump to the appropriate page.
OR, AND, XOR Gates
An OR, AND, or XOR gate primitive (the number of ports can
vary). A small circle (bubble symbol) on an input or output port indicates the port is inverted.
A multiplexer primitive with a selector port that selects
between port 0 and port 1. A multiplexer with more than two inputs is displayed as an
A buffer primitive. The figure shows the tri-state buffer, with
an inverted output enable port. Other buffers without an enable port include LCELL,
SOFT, and GLOBAL. The NOT gate and EXP expander
buffers use this symbol without an enable port and with an inverted output.
A latch/DFF (data flipflop) primitive. A DFF has the same ports
as a latch and a clock trigger. The other flipflop primitives are similar:
DFFEA (data flipflop
with enable and asynchronous load) primitive with additional ALOAD asynchronous load and ADATA data signals
DFFEAS (data flipflop
with enable and synchronous and asynchronous load), which has ASDATA as the secondary data port
An atom primitive. The symbol displays the atom name, the port
names, and the atom type. The blue shading indicates an atom primitive for which you
can view the internal details.
Any primitive that does not fall into the previous categories.
Primitives are low-level nodes that cannot be expanded to any lower hierarchy. The
symbol displays the port names, the primitive or operator type, and its
An instance in the design that does not correspond to a
primitive or operator (a user‑defined hierarchy block). The symbol displays the port
name and the instance name.
A user-defined encrypted instance in the design. The symbol
displays the instance name. You cannot open the schematic for the lower-level
hierarchy, because the source design is encrypted.
A synchronous memory instance with registered inputs and
optionally registered outputs. The symbol shows the device family and the type of
memory block. This figure shows a true dual-port memory block in a Stratix M-RAM
A constant signal value that is highlighted in gray and
displayed in hexadecimal format by default throughout the schematic.
Table 6. Operator Symbols in the RTL Viewer Schematic View
The following lists and describes the additional higher level operator symbols in the RTL Viewer schematic view.
An adder operator:
OUT = A + B
A multiplier operator:
OUT = A × B
A divider operator:
OUT = A / B
A left shift operator:
OUT = (A << COUNT)
A right shift operator:
OUT = (A >> COUNT)
A modulo operator:
OUT = (A % B)
A less than comparator:
OUT = (A < B : A > B)
A multiplexer:
OUT = DATA[SEL]
The data range size is 2^(SEL range size).
A multiplexer with one-hot select input and more than two input signals.
To select an item in the schematic view, ensure that the Selection Tool is enabled in the Netlist Viewer toolbar. Click an item in the schematic view to highlight it in red.
Select multiple items by pressing the Shift key while selecting with the mouse.
Items selected in the schematic view are automatically selected in the Netlist Navigator pane. The folder expands automatically if required to show the selected entry; however, the folder does not collapse automatically when you deselect the entries.
When you select a hierarchy box, node, or port in the schematic view, the Schematic View highlights the item in red, but not the connecting nets. When you select a net (wire or bus) in the schematic view, the Schematic View highlights all connected nets in red.
Once you select an item, you can perform different actions on it based on the contents of the shortcut menu which appears when you right-click your selection.
When you right-click a selected instance or primitive in the schematic view, the Netlist Viewer displays a shortcut menu.
If the selected item is a node, you see the following options:
Click Expand to Upper Hierarchy to display the parent hierarchy of the node in focus.
Click Copy ToolTip to copy the selected item name to the clipboard. This command does not work on nets.
Click Hide Selection to remove the selected item from the schematic view. This command does not delete the item from the design; it only masks the item in the current view.
Click Filtering to display a sub-menu with options for filtering the selection.
Filtering in the Schematic View
Filtering allows you to filter out nodes and nets in a netlist to view only the logic elements of interest to you.
You can filter a netlist by selecting hierarchy boxes, nodes, or ports of a node that are part of the path you want to see. The following filter commands are available:
Sources—displays the sources of the selection.
Destinations—displays the destinations of the selection.
Sources & Destinations—displays the sources and destinations of the selection.
Selected Nodes—displays only the selected nodes.
Between Selected Nodes—displays nodes and connections in the path between the selected nodes.
Bus Index—Displays the sources or destinations for one or more indexes of an output or input bus port.
Filtering Options—Displays the Filtering Options dialog box:
Stop filtering at register—Turning on this option directs the Netlist Viewer to filter out to the nearest register boundary.
Filter across hierarchies—Turning on this option directs the Netlist Viewer to filter across hierarchies.
Maximum number of hierarchy levels—Sets the maximum number of hierarchy levels that the schematic view can display.
To filter a netlist, select a hierarchy box, node, port, net, or state node, right-click in the window, point to Filter and click the appropriate filter command. The Netlist Viewer generates a new page showing the netlist that remains after filtering.
View Contents of Nodes in the Schematic View
In the RTL Viewer and the Technology Map Viewer, you can view the contents of nodes to see their underlying implementation details.
You can view LUTs, registers, and logic gates. In certain devices, you can also view the implementation of RAM and DSP blocks. In the Technology Map Viewer, you can also view the contents of primitives.
Figure 6. Wrapping and Unwrapping Objects
If you can unwrap the contents of an instance, a plus symbol appears in the upper right corner of the object in the schematic view.
To wrap the contents (and revert to the compact format), click the minus symbol in the upper right corner of the unwrapped instance.
Note: In the schematic view, the internal details in an atom instance cannot be selected as
individual nodes. Any mouse action on any of the internal details is treated as a mouse action on the atom instance.
Figure 7. Nodes with Connections Outside the Hierarchy
In some cases, the selected instance connects to something outside the visible level of the hierarchy in the schematic view. In this case, the net appears as a dotted line. Double-click the dotted line to expand the view to display the destination of the connection.
Figure 8. Display Nets Across Hierarchies
In cases where the net connects to an instance outside the hierarchy, you can select the net, and unwrap the node to see the destination ports.
Figure 9. Show Connectivity Details
You can select a bus port or bus pin and click Connectivity Details in the context menu for that object.
You can double-click objects in the Connectivity Details window to navigate to them quickly. If the plus symbol appears, you can further unwrap objects in the view. This can be very useful when tracing a signal in a complex netlist.
Moving Nodes in the Schematic View
Rearrange items in the schematic view by dragging them to a new location.
To move a node from one area of the netlist to another, select the node and hold down the Shift key. Legal placements appear as shaded areas within the hierarchy. Click to drop the selected node.
Figure 10. Legal Placement when Moving Nodes
To restore the schematic view to its default arrangement, right-click and click Refresh.
View LUT Representations in the Technology Map Viewer
You can view different representations of a LUT by right-clicking the selected LUT and clicking Properties.
You can view the LUT representations in the following tabs of the Properties dialog box:
Schematic tab—the equivalent gate representations of the LUT.
Truth Table tab—the truth table representation of the LUT.
Use the Zoom commands on the View menu, the Zoom Tool in the toolbar, or mouse gestures to control the magnification of your schematic.
By default, the Netlist Viewer displays most pages sized to fit in the window. If the schematic page is very large, the schematic is displayed at the minimum zoom level, and the view is centered on the first node. Click Zoom In to view the image at a larger size, and Zoom Out to view the image (when the entire image is not displayed) at a smaller size. The Zoom command allows you to specify a magnification percentage (100% is considered the normal size for the schematic symbols).
When you select the Zoom Tool on the Netlist Viewer toolbar, clicking in the schematic zooms in and centers the view on the location you clicked. Right‑click in the schematic to zoom out and center the view on the location you clicked. With the Zoom Tool selected, you can also zoom in to a portion of the schematic by dragging a rectangular box around that area. The schematic is enlarged to show the selected area.
Within the schematic view, you can also use the following mouse gestures to control the zoom:
zoom in—Dragging a box around an area starting in the upper-left and dragging to the lower right zooms in on that area.
zoom -0.5—Dragging a line from lower-left to upper-right zooms out 0.5 levels of magnification.
zoom 0.5—Dragging a line from lower-right to upper-left zooms in 0.5 levels of magnification.
zoom fit—Dragging a line from upper-right to lower-left fits the schematic view in the page.
To open the Bird’s Eye View, on the View menu, click Bird’s Eye View, or click the Bird’s Eye View icon in the toolbar.
Viewing the entire schematic can be useful when debugging and tracing
through a large netlist. The
Quartus® Prime software allows you to quickly navigate
to a specific section of the schematic using the Bird’s Eye View feature, which
is available in the RTL Viewer and Technology Map Viewer.
The Bird’s Eye View shows the current area of interest:
Select an area by clicking and dragging the indicator or right-clicking to form a rectangular box around an area.
Click and drag the rectangular box to move around the schematic.
Resize the rectangular box to zoom in or zoom out in the schematic view.
Partition the Schematic into Pages
For large design hierarchies, the RTL Viewer and Technology Map
Viewer partition your netlist into multiple pages in the schematic view.
When a hierarchy level is partitioned into multiple pages, the title bar for the schematic window indicates which page is displayed and how many total pages exist for this level of hierarchy. The schematic view displays this as Page <current page number> of <total number of pages>.
Input and output connector symbols indicate nodes that connect
across pages of the same hierarchy. Double‑click a connector to trace the net
to the next page of the hierarchy.
Note: After you double-click to follow a connector port, the Netlist Viewer opens a new page, which centers the view on the particular source
or destination net using the same zoom factor as the previous page. To trace a specific net to the new page of the hierarchy, Intel recommends that you
first select the necessary net, which highlights it in red, before you double‑click to navigate across pages.
Cross-Probing to a Source Design File and Other Intel Quartus Prime Windows
The RTL Viewer and Technology Map Viewer allow you to cross‑probe to the source design file and to various
other windows in the
Quartus® Prime software.
You can select one or more hierarchy boxes, nodes, state nodes, or state transition arcs that interest you in the Netlist Viewer and locate the corresponding items in another applicable Quartus® Prime software window. You can then view and make changes or assignments in the appropriate editor or floorplan.
To locate an item from the Netlist Viewer in another window, right-click
the items of interest in the schematic or state diagram, point to
Locate, and click the appropriate command. The
following commands are available:
Locate in Assignment Editor
Locate in Pin Planner
Locate in Chip Planner
Locate in Resource Property Editor
Locate in Technology Map Viewer
Locate in RTL Viewer
Locate in Design File
The options available for locating an item depend on the type of node
and whether it exists after placement and routing. If a command is enabled in
the menu, it is available for the selected node. You can use the
Locate in Assignment Editor command for all nodes,
but assignments might be ignored during placement and routing if they are
applied to nodes that do not exist after synthesis.
The Netlist Viewer automatically opens another window for the
appropriate editor or floorplan and highlights the selected node or net in the
newly opened window. You can switch back to the Netlist Viewer by selecting it
in the Window menu or by closing, minimizing, or moving the new window.
Cross-Probing to the Netlist Viewers from Other Intel Quartus Prime Windows
You can cross-probe to the RTL Viewer and Technology Map Viewer
from other windows in the
Quartus® Prime software. You can select one or more nodes
or nets in another window and locate them in one of the Netlist Viewers.
You can locate nodes between the RTL Viewer and Technology Map Viewer, and you can locate nodes in the RTL Viewer and
Technology Map Viewer from the following
Quartus® Prime software windows:
Timing Closure Floorplan
Resource Property Editor
Timing Analyzer (supports the Technology Map Viewer only)
To locate elements in the Netlist Viewer from another
Quartus® Prime window, select the node or nodes in
the appropriate window; for example, select an entity in the Entity list on the Hierarchy tab in the Project Navigator, or select nodes in the Timing Closure Floorplan, or select node names in the From or To column in the Assignment Editor. Next, right-click the selected object,
point to Locate, and click Locate in RTL Viewer or Locate in Technology Map Viewer. After you click this command, the Netlist Viewer opens, or comes to the foreground if it is already open.
Note: The first time the window opens after a compilation, the preprocessor stage runs before the
Netlist Viewer opens.
The Netlist Viewer shows the selected nodes and, if applicable, the connections between the nodes. The display is similar to what you
see if you right‑click the object, then click Filter > Selected Nodes using Filter across hierarchy. If the nodes cannot be found in the Netlist Viewer, a message box
displays the message: Can’t find requested location.
Viewing a Timing Path
After completing a full design compilation, including the timing analyzer stage, you can cross-probe from a timing report to see a visual representation of a timing path.
For details about generating the timing report, refer to the Quartus® Prime Pro Edition User Guide: Timing Analyzer.
When you locate the timing path from the Timing Analyzer to the
Technology Map Viewer, the interconnect and cell delay associated with each node
appears on top of the schematic symbols. The total slack of the selected timing path
appears in the Page Title section of the schematic.
To open the report from the Compilation Report Table of Contents, click Timing Analyzer GUI > Report Timing, and double-click the timing corner.
To open the report from the Timing Analyzer, open the Report Timing folder in the Report pane, and double-click the timing corner.
In the Summary of Paths tab, right-click a row in the table and select Locate Path > Locate in Technology Map Viewer. In the Technology Map Viewer, the schematic page displays the nodes along the timing path with a summary of the total delay.
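The Report Timing panel that this procedure starts from can also be generated from a script. The following is a minimal sketch, assuming it runs in the Timing Analyzer shell after a full compilation. The project name my_project and the panel name are hypothetical placeholders, not names from this guide.

```tcl
# Sketch only: run with  quartus_sta -t report_paths.tcl  after a full compile.
# "my_project" and the panel name are hypothetical placeholders.
project_open my_project

# Build the timing netlist, read the project's SDC constraints, and update timing.
create_timing_netlist
read_sdc
update_timing_netlist

# Report the 10 worst setup paths with full path detail.
report_timing -setup -npaths 10 -detail full_path \
    -panel_name "Worst setup paths"

# Clean up.
delete_timing_netlist
project_close
```

After the script finishes, the generated panel should be available for the Locate Path command described above when you open the project in the GUI.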
The Quartus® Prime software offers netlist and physical synthesis optimizations that can improve the performance of your design. You can enable physical synthesis options that take effect during fitting.
This chapter also provides guidelines for applying netlist and physical synthesis options, and for preserving compilation results through back-annotation.
Table 7. Netlist Optimization and Physical Synthesis Options
Enable physical synthesis options.
Assignments > Settings > Compiler Settings > Advanced Settings (Fitter). Physical synthesis optimizations apply at different stages of the compilation flow, either during synthesis, fitting, or both.
Enable netlist optimization options.
Assignments > Settings > Compiler Settings > Advanced Settings (Synthesis). Netlist optimizations operate with the atom netlist of your design, which describes a design in terms of specific primitives. An atom netlist file can be an Electronic Design Interchange Format (.edf) file generated by a third-party synthesis tool. Quartus® Prime synthesis generates the atom netlist and uses it internally.
Note: Because the node names for primitives in the design can change when you use physical synthesis optimizations, you should evaluate whether your design depends on fixed node names. If you use a verification flow that might require fixed node names, such as the Signal Tap Logic Analyzer, formal verification, or the Logic Lock based optimization flow (for legacy devices), disable physical synthesis options.
Physical Synthesis Optimizations
The Quartus® Prime Fitter places and routes the logic cells to
ensure critical portions of logic are close together and use the fastest possible
routing resources. However, routing delays are often a significant part of the typical
critical path delay.
Physical synthesis optimizations take into
consideration placement information, routing delays, and timing information to determine the
optimal placement. The Fitter then focuses timing-driven optimizations at those critical
parts of the design. The tight integration of the synthesis and fitting processes is known
as physical synthesis.
The following sections describe the physical synthesis optimizations available in the
Quartus® Prime software, and how they can help improve performance and
fitting for the selected device.
Physical synthesis optimization improves circuit performance by performing combinational and sequential optimization and register duplication.
To enable physical synthesis options:
Click Assignments > Settings > Compiler Settings.
To enable retiming, combinational optimization, and register duplication, click Advanced Settings (Fitter). Next, enable Physical Synthesis.
View physical synthesis results in the Netlist Optimizations report.
Physical Synthesis Options
The Quartus® Prime software provides physical synthesis optimization options to improve fitting results.
To access these options, click Assignments > Settings > Compiler Settings > Advanced Settings (Fitter).
Note: To disable global physical synthesis optimizations for specific elements of your design, set the Netlist Optimizations logic option to Never Allow for the specific nodes or entities.
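As a hedged illustration of the note above, the same exclusion can be written as an instance assignment in a Tcl script or in the .qsf file. The variable name ADV_NETLIST_OPT_ALLOWED is the settings-file name commonly associated with the Netlist Optimizations logic option; treat it as an assumption and confirm it in the Help browser for your software version. The instance name my_block_inst is hypothetical.

```tcl
# Sketch only: "my_block_inst" is a hypothetical node or entity name.
# Assumed variable name for the Netlist Optimizations logic option;
# verify it against the Help browser for your release.
set_instance_assignment -name ADV_NETLIST_OPT_ALLOWED "NEVER ALLOW" -to my_block_inst
```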
Table 8. Physical Synthesis Options
Advanced Physical Synthesis
Uses the physical synthesis engine to perform combinational and sequential optimization during fitting to improve circuit performance.
You can use the Assignment Editor to apply the Netlist Optimizations logic option. Use this option to disable physical synthesis optimizations for parts of your design.
Allow Register Duplication
Allows the Compiler to duplicate registers to improve design performance. When you enable this option, the Compiler copies registers and moves some fan-out to the new node. This optimization improves routability and can reduce the total routing wire in nets with many fan-outs.
If you disable this option, the Compiler also disables optimizations that retime registers.
This setting affects Analysis & Synthesis and the Fitter.
Allow Register Merging
Allows the Compiler to remove registers that are identical to other registers in the design. When you enable this option, in cases where two registers generate the same logic, the Compiler deletes one register, and the remaining register fans out to the deleted register's destinations. Turn off this option if you want to prevent the Compiler from removing registers that you duplicated intentionally.
If you disable register merging, the Compiler disables optimizations that retime registers.
This setting affects Analysis & Synthesis and the Fitter.
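The register options in Table 8 can also be set from a script. The sketch below assumes that the settings-file variable names mirror the option names (ALLOW_REGISTER_DUPLICATION and ALLOW_REGISTER_MERGING). This guide does not list those names, so confirm them in the Help browser before relying on them.

```tcl
# Sketch only: variable names are assumptions that mirror the GUI option names.
# Both options interact with register retiming, as described in Table 8.

# Let the Compiler duplicate registers to reduce fan-out on long nets.
set_global_assignment -name ALLOW_REGISTER_DUPLICATION ON

# Keep intentionally duplicated registers by turning merging off.
set_global_assignment -name ALLOW_REGISTER_MERGING OFF
```

Turning merging off, as shown, trades the retiming optimizations for preserved duplicates. Leave it on if you have no hand-duplicated registers.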
Applying Netlist Optimizations
The improvement in performance when
using netlist optimizations is design dependent. If you have restructured your
design to balance critical path delays, netlist optimizations might yield
minimal improvement in performance.
You may have to experiment with available
options to see which combination of settings works best for a particular
design. Refer to the messages in the compilation report to see the magnitude of
improvement with each option, and to help you decide whether you should turn on
a given option or specific effort level.
Turning on more netlist optimization options can result in more changes to the node names in the design. Bear this in mind if you are using a verification flow that requires fixed or known node names, such as the Signal Tap Logic Analyzer or formal verification.
To find the best results, you can use the
Quartus® Prime Design Space
Explorer II (DSE) to apply various sets of netlist optimization options.
For designs synthesized with a third-party tool, the Perform WYSIWYG primitive resynthesis option allows you to apply optimizations to the synthesized netlist.
The Perform WYSIWYG primitive resynthesis option directs the
Quartus® Prime software to un-map the logic elements (LEs) in an atom netlist to logic gates, and then re-map the gates back to Intel-specific primitives. Third-party synthesis tools generate either an .edf or .vqm atom netlist file using Intel-specific primitives. When you turn on the Perform WYSIWYG primitive resynthesis option, the
Quartus® Prime software uses device-specific techniques during the re-mapping process. This feature re-maps the design using the Optimization Technique specified for your project (Speed, Area, or Balanced).
The Perform WYSIWYG primitive resynthesis option unmaps and remaps only logic cells, also referred to as LCELL or LE primitives, and regular I/O primitives (which may contain registers). Double data rate (DDR) I/O primitives, memory primitives, digital signal processing (DSP) primitives, and logic cells in carry chains are not remapped. This process does not process logic specified in an encrypted .vqm file or an .edf file, such as third-party intellectual property (IP).
The Perform WYSIWYG primitive resynthesis option can change node names in the .vqm file or .edf file from your third-party synthesis tool, because the primitives in the atom netlist are broken apart and then re-mapped by the Quartus® Prime software. The re-mapping process removes duplicate registers. Registers that are not removed retain the same name after re-mapping.
Any nodes or entities that have the Netlist Optimizations logic option set to Never Allow are not affected during WYSIWYG primitive resynthesis. You can use the Assignment Editor to apply the Netlist Optimizations logic option. This option disables WYSIWYG resynthesis for parts of your design.
Note: Primitive node names are specified during synthesis. When netlist optimizations are
applied, node names might change because primitives are created and removed. HDL
attributes applied to preserve logic in third-party synthesis tools cannot be
maintained because those attributes are not written into the atom netlist, which the
Quartus® Prime software reads.
If you use the Quartus® Prime software to synthesize your design, you can use the Preserve Register (preserve) and Keep Combinational Logic (keep) attributes to maintain certain nodes in the design.
Figure 11. Quartus® Prime Flow for WYSIWYG Primitive Resynthesis
You can run procedures and make settings described in this chapter in a Tcl script. You can also run
some procedures at a command prompt. For detailed information about scripting command options, refer to the
Quartus® Prime Command-Line and Tcl API Help browser. To run the Help browser, type the following command at the command prompt:
You can specify many of the options described in this section on either an instance or global level, or both.
Use the following Tcl command to make a global assignment:
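The commands that these paragraphs refer to are not reproduced in this copy. The sketch below shows the general forms as they are commonly documented, plus one concrete use. The project name my_project, the instance name my_block_inst, and the choice of the WYSIWYG resynthesis variable as the example are assumptions made for illustration.

```tcl
# Sketch only. "my_project" and "my_block_inst" are hypothetical names.
#
# To open the Command-Line and Tcl API Help browser, type this at a system
# command prompt (not inside a Tcl script):
#   quartus_sh --qhelp
#
# General forms of the two assignment commands:
#   set_global_assignment   -name <QSF variable name> <value>
#   set_instance_assignment -name <QSF variable name> <value> -to <instance name>

# Run this script with:  quartus_sh -t apply_settings.tcl
project_open my_project

# Global level: turn on WYSIWYG primitive resynthesis for the whole design.
set_global_assignment -name ADV_NETLIST_OPT_SYNTH_WYSIWYG_REMAP ON

# Instance level: turn the same setting off for one block.
set_instance_assignment -name ADV_NETLIST_OPT_SYNTH_WYSIWYG_REMAP OFF -to my_block_inst

# Write the assignments to the .qsf file and close the project.
export_assignments
project_close
```

An instance assignment overrides the global value only for the named node or entity, which is why the two levels can be combined.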
The project .qsf file
preserves the settings that you specify in the GUI. Alternatively, you can edit the
.qsf directly. The .qsf file supports the
following synthesis netlist optimization commands. The Type column indicates whether the setting is
supported as a global setting, an instance setting, or both.
Table 9. Synthesis Netlist Optimizations and Associated Settings
The .qsf file also supports physical synthesis optimization commands. The Type column indicates whether the setting is supported as a global setting, an instance setting, or both.
Table 10. Physical Synthesis Optimizations and Associated Settings
Quartus® Prime Settings File Variable Name
Advanced Physical Synthesis
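The variable name for this table row did not survive in this copy. The line below is a sketch that assumes the settings-file variable follows the option name (ADVANCED_PHYSICAL_SYNTHESIS). Confirm the name and its allowed values in the Help browser before using it.

```tcl
# Sketch only: the variable name is an assumption derived from the option name.
# Turns on Advanced Physical Synthesis for the whole project.
set_global_assignment -name ADVANCED_PHYSICAL_SYNTHESIS ON
```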
Netlist Optimizations and Physical Synthesis Revision History
The following revision history applies to this chapter:
Quartus® Prime Version
Removed reference to unsupported CASCADE buffer from "Optimize IOC Register Placement for Timing Logic Option"
Isolating a Partition
Removed reference to .vqm
Added topic: Isolating a Partition Netlist.
Updated physical synthesis options and
Removed information about deprecated physical synthesis options.
Changed instances of Quartus II to
Updated location of Fitter Settings, Analysis & Synthesis Settings, and Physical Synthesis Optimizations Settings to Compiler Settings.
Updated DSE II content.
Removed HardCopy device
Removed survey link.
Added links to Help in several sections.
Removed Referenced Documents section.
Reformatted Document Revision History
Added information to “Physical Synthesis for
Added information to “Applying Netlist Optimization
Made minor editorial updates
chapter 11 in the 8.1.0 release.
Updated the “Physical Synthesis for Registers—Register Retiming” and “Physical Synthesis Options for Fitting”
This chapter describes techniques to reduce resource usage when designing for Intel® devices.
Resource Utilization Information
Determining device utilization provides useful information regardless of whether the design achieved a successful fit. If the compilation results in a no-fit error, resource utilization information helps to analyze the fitting problems in the design. If the fitting is successful, this information allows you to determine if design changes introduce fitting difficulties. Additionally, you can determine the impact of resource utilization on timing performance.
The Compilation Report provides information
about resource usage.
The Flow Summary section of the compilation report
indicates whether the design exceeds the available device resources, and reports resource
utilization, including pins, memory bits, digital signal processing (DSP) blocks, and
phase-locked loops (PLLs).
Figure 12. Flow Summary Report
The Fitter can spread logic throughout the device,
which may lead to higher overall utilization.
As the device fills up, the Fitter automatically
searches for logic functions with common inputs to place in one ALM. The number of packed
registers also increases. Therefore, a design that has high overall utilization might still
have space for extra logic if the logic and registers can be packed together more tightly.
In those cases, a report with more detail is helpful.
In the Fitter section of the compilation report, the reports under Resource Section provide detailed resource information.
The Fitter Resource Usage Summary report breaks down the logic utilization information and provides additional resource information, including the number of bits in each type of memory block. This panel also contains a summary of the usage of global clocks, PLLs, DSP blocks, and other device-specific resources.
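The same report can be read from a script, which is convenient when comparing utilization across many compilations. The following is a minimal sketch using the report package of the Quartus Tcl API. The project name my_project is hypothetical, and the script searches panel names by pattern because the full panel path is an assumption that can differ between software versions.

```tcl
# Sketch only: run with  quartus_sh -t print_usage.tcl  after the Fitter has run.
# "my_project" is a hypothetical project name.
load_package report
project_open my_project
load_report

# Search by name instead of hard-coding the full panel path,
# because panel paths can differ between software versions.
foreach panel [get_report_panel_names] {
    if {[string match "*Fitter Resource Usage Summary*" $panel]} {
        set id [get_report_panel_id $panel]
        set rows [get_number_of_rows -id $id]
        # Print every row of the panel, columns separated by " | ".
        for {set i 0} {$i < $rows} {incr i} {
            puts [join [get_report_panel_row -id $id -row $i] " | "]
        }
    }
}

unload_report
project_close
```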
For designs synthesized with the Quartus® Prime synthesis engine, you can see reports describing optimizations that occurred during compilation. For example, in the Analysis & Synthesis section, Optimization Results folder, you can find a list of registers removed during synthesis. When you compile a partial design to estimate resource utilization, use this report to make sure that registers were not removed due to missing connections with other parts of the design.
If the reports show routing resource usage lower than 100% but the design does not fit, either routing resources are insufficient or the design contains invalid assignments. In either case, the Compiler generates a message in the Processing tab of the Messages window describing the problem.
If the Fitter finishes unsuccessfully and runs much faster than on similar designs, a resource might be over-utilized or there might be an illegal assignment. If the Quartus® Prime software takes much longer to run than on similar designs, the Compiler possibly cannot find a valid placement or route. In the Compilation Report, look for errors and warnings that indicate these types of problems.
The Chip Planner can help you find areas of the device that have routing congestion for specific types of routing resources. If you find areas with very high congestion, analyze the cause of the congestion. Issues such as high fan-out nets not using global resources, an improperly chosen optimization goal (speed versus area), very restrictive floorplan assignments, or the coding style can cause routing congestion. After you identify the cause, modify the source or settings to reduce routing congestion.
Resource utilization issues can be divided into three categories:
Issues relating to I/O pin utilization or placement, including dedicated I/O blocks such as PLLs or LVDS transceivers.
Issues relating to logic utilization or placement, including logic cells containing registers and LUTs as well as dedicated logic, such as memory blocks and DSP blocks.
Issues relating to routing.
I/O Pin Utilization or Placement
Resolve I/O resource problems with these guidelines.
Guideline: Modify Pin Assignments or Choose a Larger Package
If a design that has pin assignments fails to fit, compile the design without the pin assignments to determine whether a fit is possible for the design in the specified device and package. You can use this approach if a Quartus® Prime error message indicates fitting problems due to pin assignments.
If the design fits when all pin assignments are ignored or when several pin assignments are ignored or moved, you might have to modify the pin assignments for the design or select a larger package.
If the design fails to fit because insufficient I/O pins are available, a larger device package (which can be the same device density) that has more available user I/O pins can result in a successful fit.
Resolve logic resource problems,
including logic cells containing registers and LUTs, as well as dedicated logic
such as memory blocks and DSP blocks, with these guidelines.
Guideline: Optimize Source Code
If your design does not fit because of logic utilization, then evaluate and modify the design at the source. You can often improve logic utilization significantly by making design-specific changes to your source code. This is typically the most effective technique for improving the quality of your results.
If your design does not fit into available
logic elements (LEs) or ALMs, but you have unused memory or DSP blocks, check if you
have code blocks in your design that describe memory or DSP functions that are not
being inferred and placed in dedicated logic. You might be able to modify your
source code to allow these functions to be placed into dedicated memory or DSP
resources in the target device.
Ensure that your state machines are
recognized as state machine logic and optimized appropriately in your synthesis
tool. State machines that are recognized are generally optimized better than if the
synthesis tool treats them as generic logic. In the
Quartus® Prime software, you can check for the State Machine report under
Synthesis in the Compilation Report. This report provides details,
including the state encoding for each state machine that was recognized during
compilation. If your state machine is not being recognized, you might have to change
your source code to enable it to be recognized.
If the Fitter cannot resolve a design due to limitations in logic resources, resynthesize the design to improve the area utilization.
First, ensure that the device and timing constraints are set correctly in the synthesis tool. Particularly when area utilization of the design is a concern, ensure that you do not over-constrain the timing requirements for the design. Synthesis tools try to meet the specified requirements, which can result in higher device resource usage if the constraints are too aggressive.
If resource utilization is an important concern,
optimize for area instead of speed.
If you are using
Quartus® Prime synthesis, click Assignments > Settings > Compiler Settings > Advanced Settings (Synthesis) and select Balanced or Area for the Optimization Technique.
If you want to reduce area for specific modules in the design using the Area or Speed setting while leaving the default Optimization Technique setting at Balanced, use the Assignment Editor.
You can also turn on the Speed Optimization Technique for Clock Domains logic option to optimize for speed all combinational logic in or between the specified clock domains.
In some synthesis tools, not specifying an fMAX requirement can result in less resource utilization.
Optimizing for area or speed can affect the register-to-register timing
performance.
Note: In the
Quartus® Prime software, the Balanced setting typically produces utilization results that are
very similar to those produced by the Area setting, with better performance results. The Area setting can give better results in
some cases.
The Quartus® Prime software provides additional attributes and options that can help improve the quality of the synthesis results.
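As a minimal sketch, the Optimization Technique choice described above might be scripted as follows. The variable name OPTIMIZATION_TECHNIQUE and the instance name u_filter are assumptions for illustration; confirm the exact variable name for your software edition and device family in the Tcl API Help before relying on it.

```tcl
# Assumed .qsf variable name; verify it for your Quartus Prime edition.
# Project-wide: keep the default Balanced technique.
set_global_assignment -name OPTIMIZATION_TECHNIQUE BALANCED

# Hypothetical instance "u_filter": optimize only this module for area.
set_instance_assignment -name OPTIMIZATION_TECHNIQUE AREA -to u_filter
```

Keeping the global setting at Balanced and overriding only the modules that dominate utilization limits the performance cost of the Area setting to those modules.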
Guideline: Perform WYSIWYG Primitive Resynthesis with Balanced or Area Setting
The Perform WYSIWYG Primitive
Resynthesis logic option specifies whether to perform WYSIWYG
primitive resynthesis during synthesis. This option uses the setting specified
in the Optimization Technique logic option. The
Perform WYSIWYG Primitive
Resynthesis logic option is useful for resynthesizing some or all
of the WYSIWYG primitives in your design for better area or performance.
However, WYSIWYG primitive resynthesis can be done only when you use
third-party synthesis tools.
The Balanced setting
typically produces utilization results that are very similar to the
Area setting with
better performance results. The
Area setting can give
better results in some cases. Performing WYSIWYG resynthesis for area in this
way typically reduces register-to-register timing performance.
The Auto Packed Registers
option implements the functions of two cells in one logic cell by combining
the register of one cell in which only the register is used with the LUT of
another cell in which only the LUT is used.
Synthesis tools typically provide the
option of preserving hierarchical boundaries, which can be useful for
verification or other purposes. However, the
Quartus® Prime software optimizes
across hierarchical boundaries so as to perform the most logic minimization,
which can reduce area in a design with no design partitions.
Guideline: Re-target Memory Blocks
If the Fitter cannot resolve a design due to memory resource limitations, the design may require a type of memory that the device does not have.
For memory blocks created with the Parameter Editor, edit the RAM block type to target a new memory block size.
The Compiler can also infer ROM and RAM memory blocks from the HDL code, and the synthesis engine can place large shift registers into memory blocks by inferring the Shift register (RAM-based) IP core. When you turn off this inference in the synthesis tool, the synthesis engine places the memory or shift registers in logic instead of memory blocks. Also, turning off this inference prevents registers from being moved into RAM, improving timing performance.
Depending on the synthesis tool, you can also set the RAM block type for inferred memory blocks. In
Quartus® Prime synthesis, set the ramstyle attribute to the desired memory type for the inferred RAM blocks. Alternatively, set the option to logic to implement the memory block in standard logic instead of a memory block.
Consider the Resource Utilization by Entity report in the report file and determine whether there is an unusually high register count in any of the modules. Some coding styles prevent the
Quartus® Prime software from inferring RAM blocks from the source code because of the blocks’ architectural implementation, forcing the software to implement the logic in flip-flops. For example, an asynchronous reset on a register bank might make the register bank incompatible with the RAM blocks in the device architecture, so the Compiler implements the register bank in flip-flops. It is often possible to move a large register bank into RAM by slight modification of associated logic.
Guideline: Use Physical Synthesis Options to Reduce Area
The physical synthesis options available
at Assignments > Settings > Compiler Settings > Advanced Settings (Fitter) help you decrease resource usage. When you enable physical synthesis, the
Quartus® Prime software makes placement-specific
changes to the netlist that reduce resource utilization for a specific Intel device.
Note: Physical synthesis increases compilation time. To reduce the impact on
compilation time, you can apply physical synthesis options to specific
instances.
A design might not fit because it
requires too many DSP blocks. You can implement all DSP block functions with logic
cells, so you can retarget some of the DSP blocks to logic to obtain a fit.
If the DSP function was created with the
parameter editor, open the parameter editor and edit the function so it targets
logic cells instead of DSP blocks. The
Quartus® Prime software uses the DEDICATED_MULTIPLIER_CIRCUITRY IP core parameter to control the
implementation.
DSP blocks also can be inferred from your
HDL code for multipliers, multiply-adders, and multiply-accumulators. You can turn
off this inference in your synthesis tool. When you are using
Quartus® Prime synthesis, you can disable
inference by turning off the Auto DSP Block Replacement logic option for your entire project.
Click Assignments > Settings > Compiler Settings > Advanced Settings (Synthesis). Turn off Auto
DSP Block Replacement. Alternatively, you can disable the option for
a specific block with the Assignment Editor.
The Quartus® Prime software also offers the DSP Block Balancing logic option, which
implements DSP block elements in logic cells or in different DSP block modes. The
Auto setting allows DSP block balancing to convert the DSP block slices automatically as
appropriate to minimize the area and maximize the speed of the design. You can use
other settings for a specific node or entity, or on a project-wide basis, to control how the
Quartus® Prime software converts DSP functions
into logic cells and DSP blocks. Using any value other than Auto or Off overrides the DEDICATED_MULTIPLIER_CIRCUITRY parameter used in IP core variations.
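The two DSP-related logic options might be applied from a script as sketched below. The .qsf names AUTO_DSP_RECOGNITION (for Auto DSP Block Replacement) and DSP_BLOCK_BALANCING are given as assumptions to verify in the Tcl API Help, and u_mult_small is a hypothetical instance name.

```tcl
# Assumed .qsf names; check them against the Tcl API Help for your edition.

# Stop inferring a DSP block for one hypothetical multiplier instance,
# so that it is implemented in logic cells instead.
set_instance_assignment -name AUTO_DSP_RECOGNITION OFF -to u_mult_small

# Let the Compiler balance DSP block usage automatically, project-wide.
set_global_assignment -name DSP_BLOCK_BALANCING AUTO
```

Disabling inference per instance, rather than for the whole project, keeps the timing-critical multipliers in DSP blocks while freeing blocks used by less critical ones.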
Resolve routing resource problems with these guidelines.
Guideline: Set Auto Packed Registers to Sparse or Sparse Auto
The Auto Packed Registers
option reduces LE or ALM count in a design. You can set this option by clicking
Assignments > Settings > Compiler Settings > Advanced Settings (Fitter).
Guideline: Set Fitter Aggressive Routability Optimizations to Always
The Fitter Aggressive Routability
Optimizations option is useful if your design does not fit due to
excessive routing wire utilization.
If there is a significant imbalance between
placement and routing time (during the first fitting attempt), it might be
because of high wire utilization. Turning on the
Fitter Aggressive Routability
Optimizations option can reduce your compilation time.
On average, this option can save up to 6%
wire utilization, but can also reduce performance by up to 4%, depending on the
device.
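A minimal sketch of forcing this option on from a script is shown below. The .qsf name FITTER_AGGRESSIVE_ROUTABILITY_OPTIMIZATION is an assumption to confirm in the Tcl API Help.

```tcl
# Assumed .qsf name for the Fitter Aggressive Routability Optimizations option.
# ALWAYS trades some performance for lower routing wire utilization.
set_global_assignment -name FITTER_AGGRESSIVE_ROUTABILITY_OPTIMIZATION ALWAYS
```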
The Router Effort Multiplier controls how quickly the router tries to find a valid solution.
The default value is 1.0 and legal values must be greater than 0.
Numbers higher than 1 help designs that are difficult to route by increasing
the routing effort.
Numbers closer to 0 (for example, 0.1) can reduce router runtime, but
usually reduce routing quality slightly.
Experimental evidence shows that a multiplier of 3.0 reduces overall wire usage
by approximately 2%. Using a Router Effort Multiplier higher than the default value
can benefit designs with complex datapaths with more than five levels of logic.
However, congestion in a design is primarily due to placement, and increasing the
Router Effort Multiplier does not necessarily reduce congestion.
Note: For Router Effort Multiplier values greater than 4, the effective multiplier increases by only 0.1 for every
additional 1. For example, a value of 10 is effectively 4.6.
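For a design that is difficult to route, the multiplier could be raised from a script as sketched here. The .qsf name ROUTER_EFFORT_MULTIPLIER is an assumption to check in the Tcl API Help, and 3.0 is simply the value cited in the text above, not a recommendation for every design.

```tcl
# Assumed .qsf name for the Router Effort Multiplier setting.
# Default is 1.0; values above 4 yield diminishing increases in effort.
set_global_assignment -name ROUTER_EFFORT_MULTIPLIER 3.0
```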
Guideline: Remove Fitter Constraints
A design with conflicting constraints or
constraints that are difficult to meet may not fit in the targeted device.
For example, a design might fail to fit if the location or Logic Lock assignments are too strict and not enough routing
resources are available on the device.
To resolve routing congestion caused by
restrictive location constraints or Logic Lock
region assignments, use the Routing Congestion task in the Chip Planner to locate routing
problems in the floorplan, then remove any internal location or Logic Lock region assignments in that area. If your
design still does not fit, the design is over-constrained. To correct the problem,
remove all location and Logic Lock assignments
and run successive compilations, incrementally constraining the design before each
compilation. You can delete specific location assignments in the Assignment Editor
or the Chip Planner. To remove Logic Lock
assignments in the Chip Planner, in the Logic Lock Regions Window, or on the Assignments menu, click Remove Assignments. Turn on the assignment
categories you want to remove from the design in the Available assignment categories list.
Guideline: Optimize Synthesis for Area, Not Speed
In some cases, resynthesizing the design to improve the area utilization can also improve the routability of the design.
First, ensure that you have set your device and timing constraints correctly in your synthesis tool. Ensure that you do not over constrain the timing requirements for the design, particularly when the area utilization of the design is a concern. Synthesis tools generally try to meet the specified requirements, which can result in higher device resource usage if the constraints are too aggressive.
If resource utilization is an important concern,
optimize for area instead of speed.
If you are using Quartus® Prime synthesis, click
Assignments > Settings > Compiler Settings > Advanced Settings (Synthesis) and select Balanced or Area for the Optimization Technique.
If you want to reduce area for specific modules in your design
using the Area
setting while leaving the default Optimization Technique setting at
Balanced, use the Assignment Editor.
You can also use the Speed Optimization Technique for Clock
Domains logic option to specify that all combinational logic in
or between the specified clock domain(s) is optimized for speed.
In some synthesis tools, not specifying an fMAX requirement can result in lower resource utilization.
Optimizing for area or speed can affect the register-to-register
timing performance.
Note: In the
Quartus® Prime software, the Balanced
setting typically produces utilization results that are very similar to those
produced by the Area setting, with better performance results. The Area setting can give
better results in some cases.
The Quartus® Prime software provides additional attributes and options that can
help improve the quality of your synthesis results.
If your design does not fit because of routing problems and the methods described in the preceding sections do not sufficiently improve the routability of the design, modify the design at the source to achieve the desired results. You can often improve results significantly by making design-specific changes to your source code, such as duplicating logic or changing the connections between blocks that require significant routing resources.
Guideline: Use a Larger Device
If a successful fit cannot be achieved because of a shortage of routing resources, you might require a larger device.
You can run procedures and assign settings described in this chapter in a Tcl script. You can also run procedures at a command prompt.
For detailed information about scripting command options, refer to the
Quartus® Prime command-line and Tcl API Help browser.
To run the Help browser, type the following command at the command prompt:
quartus_sh --qhelp
You can specify many of the options described in this
section at the instance level, at the global level, or both.
Use the following Tcl command to make a global assignment:
set_global_assignment -name <.qsf variable name> <value>
Use the
Quartus® Prime Settings File
(.qsf) variable name in the Tcl
assignment to make the setting along with the appropriate value. The
Type column indicates
whether the setting is supported as a global setting, an instance setting, or
both.
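To illustrate how global and instance assignments fit into a script, the following is a minimal sketch of a Tcl file run with quartus_sh -t. The project name, hierarchy path, and fan-out limit are hypothetical, and the setting names FITTER_EFFORT and MAX_FANOUT should be checked against the Tcl API Help for your software edition.

```tcl
# Run with: quartus_sh -t apply_settings.tcl
# "my_project" and the hierarchy path below are hypothetical names.
package require ::quartus::project

project_open my_project

# Global form: applies to the whole project.
# A value that contains spaces must be enclosed in double quotes.
set_global_assignment -name FITTER_EFFORT "STANDARD FIT"

# Instance form: the same syntax plus a -to target.
set_instance_assignment -name MAX_FANOUT 100 -to "u_core|ctrl_enable_reg"

# Write the assignments back to the .qsf file, then close the project.
export_assignments
project_close
```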
Table 11. Advanced Compilation Settings
.qsf File Variable
Resource Utilization Optimization Techniques
This table lists QSF assignments and applicable values for Resource Utilization Optimization settings:
This section describes techniques to improve timing performance when designing for Intel devices.
The application techniques vary between designs.
Applying each technique does not always improve results.
Default settings and options in the
Quartus® Prime software provide the best trade-off between compilation
time, resource utilization, and timing performance. You can adjust these settings to determine
whether other settings provide better results for your design.
Note: Some techniques are
Optimize Multi Corner Timing
Process variations and changes in operating conditions can result in path delays that are significantly smaller than those in the slow corner timing model. As a consequence, the design can present hold time violations on those paths, and in rare cases, additional setup time violations.
In addition, designs targeting newer device families (with smaller process geometry) do not always present the slowest circuit performance at the highest operating temperature. The temperature at which the circuit is slowest depends on the selected device, the design, and the compilation results. The
Quartus® Prime software manages this new dependency by providing newer device families with three different timing corners—Slow 85°C corner, Slow 0°C corner, and Fast 0°C corner. For other device families, two timing corners are available—Fast 0°C and Slow 85°C corner.
The Optimize multi-corner timing option directs the Fitter to meet timing requirements at all process corners and operating conditions. The resulting design implementation is more robust across process, temperature, and voltage variations. This option is on by default, and increases compilation time by approximately 10%.
When this option is off, the Fitter optimizes designs considering only slow-corner delays from the slow-corner timing model (slowest manufactured device for a given speed grade, operating in low-voltage conditions).
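A short sketch of controlling this option from a script follows. The .qsf name OPTIMIZE_MULTI_CORNER_TIMING is an assumption to verify in the Tcl API Help.

```tcl
# Assumed .qsf name for the Optimize multi-corner timing option.
# ON (the default) asks the Fitter to meet timing at all corners.
set_global_assignment -name OPTIMIZE_MULTI_CORNER_TIMING ON
```

Setting the value to OFF restricts Fitter optimization to the slow-corner model, which shortens compilation at the cost of robustness across operating conditions.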
Critical paths are timing paths in your
design that have a negative slack.
These timing paths can span from device I/Os to
internal registers, registers to registers, or from registers to device I/Os.
The slack of a path determines its criticality; slack appears in the timing analysis report,
which you can generate using the Timing Analyzer.
Design analysis for timing closure is a fundamental requirement for optimal
performance in highly complex designs. The analytical capability of the Chip Planner helps you
close timing on complex designs.
Viewing critical paths in the Chip Planner
shows why a specific path is failing.
You can see if any modification in the placement can reduce the negative
slack. To display paths in the floorplan, perform a timing analysis and display results on the
Timing Analyzer.
The Stratix® 10 device family uses the Hyper-Aware design flow to shorten design cycles and optimize
performance.
The Hyper-Aware design flow maximizes use of Hyper-Registers by
combining automated register retiming with implementation of targeted timing closure
recommendations (Fast Forward compilation). This combination of techniques drives the highest performance in
Stratix® 10 designs.
A critical chain reports the design paths that limit further register
retiming optimization. The
Quartus® Prime Pro Edition software
provides the Hyper-Retimer critical chain reports to help you improve design performance. You
can focus on higher level optimization, because the Hyper-Retimer uses Hyper-Registers to
evenly balance slacks on all the registers in a critical chain.
For more information about improving design performance using the
Hyper-Retimer critical chain reports, refer to the Interpreting Critical
Chain Reports topic in the
Stratix® 10 High-Performance Design Handbook.
Looking at the critical chain shows the exact
logic that limits retiming operations in your design.
For example, you can
see if the retiming is limited by your RTL code, or by the constraints you applied on the
design. The Quartus® Prime Pro Edition software reports one critical chain per
clock domain and clock domain crossing.
The critical chain is available
at two different stages in the Hyper-Aware design flow:
In the Retiming Limit Details Report:
This report is associated with the retiming stage in the
Hyper-Aware design flow, and is enabled by default.
In the Fast Forward Compilation Report:
The Fast Forward Compilation stage is optional, and disabled by default.
You enable this stage from the Compilation Dashboard. Alternatively, start the
task directly by clicking Fast Forward Timing
Closure Recommendations in the Compilation tasks.
You can also graphically visualize the critical chains in the
Technology Map Viewer. For more details, refer to Locate
Critical Chains in the
Stratix® 10 High-Performance Design Handbook
Use the
guidelines in this section when you encounter timing failures in a design.
The guidelines show you how to evaluate compilation results of a design and how to address
problems. While the guidelines do not cover specific examples of restructuring RTL to improve
design speed, the analysis techniques help you to evaluate changes to RTL that can help you to
close timing.
Review Compilation Results
After compiling your design, review
the messages in each section of the compilation report.
Most designs that fail timing start out with other problems that the Fitter reports as warning
messages during compilation. Determine what causes a warning message, and whether to fix
or ignore the warning.
After reviewing the warning messages, review the informational
messages. Take note of anything unexpected, for example, unconnected ports,
ignored constraints, missing files, and assumptions or optimizations that the
software made.
Evaluate Fitter Netlist Optimizations
The Fitter can also perform optimizations
to the design netlist.
Major changes include register packing, duplicating or
inverting signals, or modifying nodes in a general way such as moving an input from one logic
cell to another. Find and review these reports in the Netlist Optimizations results of the
Fitter section.
Evaluate Optimization Results
After checking what optimizations were
done and how they improved performance, evaluate the runtime it took to get the
extra performance. To reduce compilation time, review the physical synthesis
and netlist optimizations over a couple of compilations, and edit the RTL to
reflect the changes that physical synthesis performed.
If a particular set of registers consistently gets retimed,
edit the RTL to retime the registers the same way. If the changes are made to
match what the physical synthesis algorithms did, the physical synthesis
options can be turned off to save compile time while getting the same type of
performance improvement.
Evaluate Resource Usage
Evaluate a variety of resources used
in the design, including global and non-global signal usage, routing
utilization, and clustering difficulty.
Global and Non-Global Usage
For designs that contain many clocks, evaluate global and non-global signals to determine whether global resources are used effectively, and if not, consider making changes.
You can find these reports in the Resource section under Fitter in the Compilation Report panel.
The figure shows an example of inefficient use of a global clock. The highlighted line
has a single fan-out from a global clock.
Figure 13. Inefficient Use of a Global Clock
If you assign these resources to a Regional Clock, the Global Clock becomes available for another signal. You can ignore signals with an empty value in the Global Line Name column as the signal uses dedicated routing, and not a clock buffer.
The Non-Global High Fan-Out Signals report lists the
highest fan-out nodes not routed on global signals.
Reset and enable signals appear at the top of the list.
If there is routing congestion in the design,
and there are high fan-out non-global nodes in the congested area, consider using global
or regional signals to fan-out the nodes, or duplicate the high fan-out registers so
that each of the duplicates can have fewer fan-outs.
Use the Chip Planner to locate high fan-out
nodes, to report routing congestion, and to determine whether the alternatives are
viable.
Review routing usage reported in the Fitter Resource Usage Summary
report.
Figure 14. Fitter Resource Usage Summary Report
Average interconnect usage reports the average
amount of interconnect that is used, out of what is available on the device.
Peak interconnect usage reports the largest amount of
interconnect used in the most congested areas.
Designs with an average value below 50% typically do not have any
problems with routing. Designs with an average between 50-65% may have difficulty
routing. Designs with an average over 65% typically have difficulty meeting timing
unless the RTL tolerates a highly utilized chip. Peak values at or above 90% are
likely to have problems with timing closure; a 100% peak value indicates that all
routing in an area of the device has been used, so there is a high possibility of
degradation in timing performance.
The figure shows the Report Routing
Figure 15. Routing Usage Summary Report
Wires Added for Hold
During routing the Fitter may add wire
between register paths to increase delay to meet hold time requirements. The Fitter
reports how much routing delay was added in the Estimated Delay Added for Hold Timing report.
Excessive additional wire can indicate an error with the constraint. The cause
of such errors is typically incorrect multicycle transfers between multi-rate clocks, and
between different clock networks.
Review the specific register paths in the Estimated Delay Added for Hold Timing
report to determine whether the Fitter adds excessive wire to meet hold timing.
Figure 16. Estimated Delay Added for Hold
An example of an incorrect
constraint that can cause the router to add wire for hold requirements occurs when
data transfers from a 1x clock to a 2x clock. Assume the design intent is to allow
two cycles per transfer. Adding a multicycle setup constraint allows data to arrive
at any time within the two destination clock cycles, as shown in the example:
set_multicycle_path -from 1x -to 2x -setup -end 2
The timing requirement is relaxed by one
2x clock cycle, as shown in the black line in the waveform in the figure.
Figure 17. Timing Requirement Relaxed
The default hold requirement, shown with the dashed blue line, can force the
router to add wire to guarantee that data is delayed by one cycle. To correct the hold
requirement, add a multicycle constraint with a hold option.
The orange dashed line in the figure above
represents the hold relationship, and no extra wire is required to delay the data.
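The setup and hold multicycle pair discussed above can be sketched in SDC as follows. The clock names clk_1x and clk_2x are hypothetical stand-ins for the 1x and 2x clocks, and the hold value shown assumes the two-cycle intent stated in the text.

```tcl
# Hypothetical clock names standing in for the 1x and 2x clocks.

# Setup: capture on the second destination (2x) clock edge instead of the first.
set_multicycle_path -from [get_clocks clk_1x] -to [get_clocks clk_2x] -setup -end 2

# Hold: move the hold check back by one destination clock cycle, so that it
# returns to the launch edge and the router does not need to add delay.
set_multicycle_path -from [get_clocks clk_1x] -to [get_clocks clk_2x] -hold -end 1
```

Without the second constraint, the hold check follows the relaxed setup edge and lands one 2x cycle after launch, which is what forces the extra wire.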
The router can also add wire for hold timing
requirements when data transfers in the same clock domain, but between clock
branches that use different buffering. Transferring between clock network types
happens more often between the periphery and the core. In the following figure,
data comes into a device, a periphery clock drives the source register, and a
global clock drives the destination register. A global clock buffer has larger
insertion delay than a periphery clock buffer. The clock delay to the destination
register is much larger than to the source register, so extra delay is necessary
on the data path to ensure that it meets its hold requirement.
Figure 18. Clock Delay
To identify cases where a path has
different clock network types, review the path in the Timing Analyzer, and check
nodes along the source and destination clock paths. Also, check the source and
destination clock frequencies to see whether they are the same, or multiples, and
whether there are multicycle exceptions on the paths. Finally, ensure that all
cross-domain paths that are false by intent have an associated false path
exception.
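As an illustration, a cross-domain path that is false by intent might be excluded with either of the SDC forms below. The clock names clk_a and clk_b are hypothetical, and which form fits depends on whether the domains are unrelated in one direction or both.

```tcl
# Hypothetical clock names.

# Cut paths in one direction between two domains that are false by
# design intent (for example, a crossing handled by a synchronizer).
set_false_path -from [get_clocks clk_a] -to [get_clocks clk_b]

# Alternative: declare the two domains asynchronous in both directions.
set_clock_groups -asynchronous -group [get_clocks clk_a] -group [get_clocks clk_b]
```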
If you suspect that routing is added to fix real
hold problems, you can disable the Optimize hold timing advanced Fitter
setting (Assignments > Settings > Compiler Settings > Advanced Settings (Fitter) > Optimize hold
timing). Recompile the design with Optimize hold timing disabled, and then
rerun timing analysis to identify and correct any paths that fail hold time
requirements.
Figure 19. Optimize Hold Timing Option
Note: Disable the Optimize hold
timing option only when debugging your design. Make sure to enable the
option (default state) during normal compiles. Wire added for hold is a normal part
of timing optimization during routing and is not always a problem.
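For a debug compile, the option could be toggled from a script as sketched below. The .qsf name OPTIMIZE_HOLD_TIMING and the value strings are assumptions to confirm in the Tcl API Help.

```tcl
# Assumed .qsf name and values for the Optimize hold timing option.

# Debug compile only: stop the Fitter from adding delay to fix hold.
set_global_assignment -name OPTIMIZE_HOLD_TIMING OFF

# Normal compiles: restore hold optimization on all paths.
# set_global_assignment -name OPTIMIZE_HOLD_TIMING "ALL PATHS"
```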
Evaluate Other Reports and Adjust Settings Accordingly
Difficulty Packing Design
In the Fitter Resource Section, under
the Resource Usage
Summary, review the Difficulty Packing Design report. The Difficulty Packing Design report details the effort level (low, medium,
or high) of the Fitter to fit the design into the device, partition, and Logic Lock region.
As the effort level of Difficulty Packing Design increases, timing closure gets harder.
Going from medium to high can result in a significant drop in performance or increase
in compile time. Consider reducing logic to reduce packing difficulty.
Review Ignored Assignments
The Compilation Report includes details of any assignments ignored by the Fitter. Assignments typically get ignored if design names change, but assignments are not updated. Make sure any intended assignments are not being ignored.
Review Non-Default Settings
The reports from Synthesis and Fitter show non-default settings used in a compilation. Review the non-default settings to ensure the design benefits from the change.
Use the Chip Planner for reviewing placement.
You can use the Chip Planner to locate hierarchical entities, using colors for each
located entity in the floorplan. Look for logic that seems out of place, based on where
you expect it to be.
For example, logic that interfaces with I/Os should be close to the I/Os,
and logic that interfaces with an IP or memory should be close to the IP or memory.
Figure 20. Floorplan with Color-Coded Entities
The following notes describe how you can use the visualization in Floorplan with
Color-Coded Entities to check timing paths:
The green block is spread apart. Check to see if those paths are
failing timing, and if so, what connects to that module that could affect
placement.
The blue and aqua blocks are spread out and mixed together. Check if
connections between the two modules contribute to this.
The pink logic at the bottom must interface with I/Os at the bottom
edge. Check fan-in and fan-out of a highlighted module by using the buttons on the
task bar.
Look for signals that go a
long way across the chip and see if they are contributing to timing failures.
Check global signal usage for signals that affect logic placement,
and verify if the Fitter placed logic feeding a global buffer close to the buffer
and away from related logic. Use settings like high fan-out on non-global resource
to pull logic together.
Check for routing congestion. The Fitter spreads out logic in highly
congested areas, making the design harder to route.
Evaluate Placement and Routing
Review duration of parts of compile time in Fitter messages. If routing takes much more time than placement, then meeting timing may be more difficult than the placer predicted.
Adjust Placement Effort
You can increase the Assignments > Settings > Compiler Settings > Advanced Settings (Fitter) > Placement Effort
Multiplier value to spend additional compilation time and effort in the Place stage of
the Fitter.
Adjust the multiplier after reviewing and optimizing other settings and RTL. Try an
increased value, up to 4, and reset to default if performance or compile time does not
improve.
Figure 21. Placement Effort Multiplier
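The same adjustment might be made from a script, as in this minimal sketch. The .qsf name PLACEMENT_EFFORT_MULTIPLIER is an assumption to verify in the Tcl API Help, and 2.0 is an arbitrary trial value within the range suggested above.

```tcl
# Assumed .qsf name for the Placement Effort Multiplier setting.
# Default is 1.0; try values up to 4 and revert if results do not improve.
set_global_assignment -name PLACEMENT_EFFORT_MULTIPLIER 2.0
```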
Adjust Fitter Effort
Fitter Optimization mode settings allow you to specify whether the Compiler
focuses optimization efforts for performance, resource utilization, power, or compile
time.
By default, the Fitter Optimization mode is set to Balanced (Normal
flow) mode, which reduces Fitter effort and compilation time as soon as
timing requirements are met. You can optionally select another Optimization mode to target performance, area,
routability, power, or compile time.
To increase Fitter effort further, you can also enable
the Assignments > Settings > Compiler Settings > Advanced Settings (Fitter) > Fitter Effort option. The default Auto Fit setting reduces Fitter effort once timing requirements are
met. The Standard Fit (highest
effort) setting uses maximum effort regardless of the design's
requirements, leading to higher compilation time and more timing margin.
Figure 22. Fitter Effort
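A sketch of selecting a non-default optimization mode from a script is shown below. The .qsf name OPTIMIZATION_MODE and the value string are assumptions; the set of available modes varies by software edition and version, so check the Tcl API Help.

```tcl
# Assumed .qsf name and value string for the Optimization mode setting.
# The default is BALANCED; this targets performance at the cost of runtime.
set_global_assignment -name OPTIMIZATION_MODE "HIGH PERFORMANCE EFFORT"
```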
Review Timing Constraints
Ensure that clocks are constrained
with the correct frequency requirements. Using the
keeps generated clock settings updated. The Timing Analyzer can be useful in reviewing
constraints.
For example, under
Diagnostic in the Task pane, the
Report Ignored Constraints
report shows any incorrect names in the design, most commonly caused by changes
in the design hierarchy. Use the
Report Unconstrained Paths
report to locate unconstrained paths. Add constraints as
necessary so that the design can be optimized.
Evaluate Clustering Difficulty
You can evaluate clustering difficulty to help reach timing
closure.
You can monitor clustering difficulty whenever you add logic and recompile. Use the
clustering information to gauge how much timing closure difficulty is inherent in
your design.
If your design is full but clustering difficulty is low or medium, your
design itself, rather than clustering, is likely the main cause of
congestion.
Conversely, congestion occurring after adding a small amount of logic to the
design can be due to clustering. If clustering difficulty is high, this
contributes to congestion regardless of design size.
Review Details of Timing Paths
Show Timing Path Routing
Showing routing for a path can help uncover unusual routing
delays.
In the Timing Analyzer Report Timing
dialog box, enable the Report panel
name and Show
routing options, and click Report Timing.
Figure 23. Report Pane and Show Routing
The Extra Fitter Information tab shows a miniature
floorplan with the path highlighted. The
Extra Fitter Information tab is not
available for Stratix® 10 devices.
You can also locate the path in the Chip
Planner to examine routing congestion, and to view whether nodes in a path are placed
close together or far apart.
Routing paths allow you to identify global network buffers that fail timing. Buffer locations are named according to the network they drive.
CLK_CTRL_Gn: Global driver
CLK_CTRL_Rn: Regional driver
Buffers to access the global networks are
located in the center of each side of the device. Buffering to route a core logic signal
on a global signal network causes insertion delay. Tradeoffs to consider for global and
non-global routing are source location, insertion delay, fan-out, distance a signal
travels, and possible congestion if the signal is demoted to local routing.
If the register feeding the global buffer cannot be moved
closer, then consider changing either the design logic or the routing type.
If a global signal is required, consider adding half a cycle to
timing by using a negative-edge triggered register to generate the signal (top
figure) and use a multicycle setup constraint (bottom figure).
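A minimal SDC sketch of that combination, assuming a hypothetical negative-edge register named sync_rst_neg that generates the global signal. The register name and the multicycle value are illustrative and are not taken from the figures.

```tcl
# Hypothetical register name; replace it with the negative-edge register
# that generates the global signal in your design.
# Move the setup latch edge one clock cycle later for paths from that register.
set_multicycle_path -setup -end 2 -from [get_registers {sync_rst_neg}]

# A setup multicycle also shifts the default hold check, so review both
# relationships before relying on the constraint.
report_timing -setup -from [get_registers {sync_rst_neg}] -npaths 10 \
    -detail full_path -panel_name "neg-edge source (setup)"
report_timing -hold -from [get_registers {sync_rst_neg}] -npaths 10 \
    -detail full_path -panel_name "neg-edge source (hold)"
```

If the hold report shows an unintended relationship, a matching hold multicycle may be needed.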
Nodes with very high fan-out that use local routing tend to pull
logic that they drive close to the source node. This can make other paths fail
timing. Duplicating registers can help reduce the impact of high fan-out paths.
Consider manually duplicating and preserving these registers.
A MAX_FANOUT assignment may make
arbitrary groups of fan-out nodes, whereas a designer can make more intelligent groupings.
You can use the Global Signal assignment to control the global signal
usage on a per-signal basis. For example, if a signal needs local routing, you set the
Global Signal assignment to OFF.
Figure 26. Global Signal Assignment
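The same assignment can be written in the project's .qsf file. This is a sketch only; ctrl_enable is a hypothetical signal name.

```tcl
# QSF fragment: keep a hypothetical signal named ctrl_enable off the
# global networks so that it uses local routing.
set_instance_assignment -name GLOBAL_SIGNAL OFF -to ctrl_enable
```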
Resets and Global Networks
Reset signals are often routed on global networks. Sometimes, the use of a global network causes recovery failures. Consider reviewing the placement of the register that generates the reset and the routing path of the signal.
Suspicious setup failures include paths with
very small or very large requirements.
One typical cause is math precision error. For example, dividing a 100 ns period (10 MHz) by 3
gives 33.333 ns per period after rounding. In three cycles, the time is 99.999 ns instead of 100.000 ns,
so the Timing Analyzer can see edges that are only 0.001 ns apart. Setting a maximum delay
can provide an appropriate setup relationship.
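A hedged SDC sketch of such a maximum delay, assuming two hypothetical clocks: clk_fast with the rounded 33.333 ns period and clk_slow with the 100.000 ns period. It also assumes that the intended setup relationship is one period of the fast clock.

```tcl
# Hypothetical clock names. Replace the rounding-induced 0.001 ns
# relationship with the intended one-period relationship.
set_max_delay -from [get_clocks {clk_fast}] -to [get_clocks {clk_slow}] 33.333
```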
Other failures come from paths that must be
false by design intent, such as:
Asynchronous paths handled through FIFOs,
Slow asynchronous paths that rely on
handshaking for data that remain available for multiple clock cycles.
To prevent the Fitter from having to meet unnecessarily restrictive timing requirements, consider adding false or multicycle path statements.
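The following SDC fragment sketches both kinds of exception. All register names are hypothetical, and the multicycle values assume data that stays stable for three clock cycles.

```tcl
# Asynchronous crossing handled inside a FIFO: cut the pointer path.
# The hierarchy and register names are placeholders.
set_false_path -from [get_registers {*fifo_inst|*wr_ptr*}] \
               -to   [get_registers {*fifo_inst|*rd_sync*}]

# Handshake-protected data that remains valid for three cycles:
# relax setup to three cycles and move the hold check back accordingly.
set_multicycle_path -setup -end 3 -from [get_registers {cfg_data_reg*}] \
                                  -to   [get_registers {cfg_capture_reg*}]
set_multicycle_path -hold  -end 2 -from [get_registers {cfg_data_reg*}] \
                                  -to   [get_registers {cfg_capture_reg*}]
```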
The Statistics tab in the Timing Analyzer path report shows the levels of logic in a path. If the path fails timing and the number of logic levels is high, consider adding pipelining in that part of the design.
Auto Shift Register Replacement
During Synthesis, the Compiler can convert shift
registers or register chains into RAMs to save area. However, conversion to RAM often
reduces speed. The names of the converted registers include "altshift_taps".
If paths that fail timing begin or end in shift registers, consider disabling the
Auto Shift Register
Replacement option. Do not convert registers that are intended for pipelining.
For shift registers that are converted to a chain, evaluate the area/speed trade-off of
implementing in RAM or logic cells.
If a design is close to full, you can save area by shifting register conversion to RAM,
benefiting non-critical clock domains. You can change the settings from the default
AUTO to OFF globally, or on a register
or hierarchy basis.
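As a sketch, the setting can be changed with QSF assignments like the following. The hierarchy path fast_core|delay_line is hypothetical, and you would normally use either the global or the per-instance form, not both.

```tcl
# QSF fragment: turn off shift register replacement for the whole project ...
set_global_assignment -name AUTO_SHIFT_REGISTER_RECOGNITION OFF

# ... or only for one hypothetical hierarchy that sits on a critical path.
set_instance_assignment -name AUTO_SHIFT_REGISTER_RECOGNITION OFF -to "fast_core|delay_line"
```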
For better timing results, place
registers driven by a regional clock in one quadrant of the chip. You can review the
clock region boundaries using the Chip Planner.
Timing failure can occur when the I/O interface at the
top of the device connects to logic driven by a regional clock which is in one quadrant
of the device, and placement restrictions force long paths to and from I/Os to logic.
Use a different type of clock source to drive the
logic: global, which covers the whole device, or dual-regional, which covers half the
device. Alternatively, you can reduce the frequency of the I/O interface to accommodate
the long path delays. You can also redesign the pinout of the device to place all the
specified I/Os adjacent to the regional clock quadrant. This issue can happen when
register locations are restricted, such as with Logic Lock regions, clocking resources, or hard blocks (such as memories and DSPs).
The Extra Fitter
Information tab in the Timing Analyzer timing report informs you when placement
is restricted for nodes in a path. The Extra Fitter
Information tab is not available for Stratix® 10 devices.
The Report Timing Closure Recommendations task in the Timing Analyzer analyzes paths and provides specific recommendations based on path characteristics.
Adjusting and Recompiling
Look for obvious
problems that you can fix with minimal effort. To identify where the Compiler had trouble
meeting timing, perform seed sweeping with about five compiles. Doing so shows consistently
failing paths. Consider recoding or redesigning that part of the design.
To reach timing closure, well-written RTL can be
more effective than changing your compilation settings. Seed sweeping can also be useful if
the timing failure is very small, and the design has already been optimized for performance
improvements and is close to final release. Additionally, seed sweeping can be used for
evaluating changes to compilation settings. Compilation results vary due to the random
nature of Fitter algorithms. If a compilation setting change produces lower average
performance, undo the change.
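One simple way to sweep seeds is to change the Fitter seed in the .qsf file between compiles. The value shown is arbitrary; for a sweep of about five compiles you might use the values 1 through 5 and compare the worst slack of each run.

```tcl
# QSF fragment: Fitter seed for this compile. Change the value between
# compiles and record the timing results of each run.
set_global_assignment -name SEED 2
```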
Sometimes, settings or constraints can cause more
problems than they fix. When significant changes to the RTL or design architecture have
been made, compile periodically with default settings and without Logic Lock regions, and re-evaluate paths that fail timing.
Partitioning often does not help timing closure,
and must be done at the beginning of the design process. Adding partitions can increase
logic utilization if it prevents cross-boundary optimizations, making timing closure harder
and increasing compile times.
Using Partitions to Achieve Timing Closure
One technique to achieve timing closure is confining failing paths within
individual design partitions, such that there are no failing paths passing between
partitions. You can then make changes incrementally as necessary to correct the
failing paths, and recompile only the affected partitions.
To use this technique:
In the Design Partition Planner, load timing data by clicking
View > Show Timing Data.
Entities containing nodes on failing paths appear in red in
the Design Partition Planner.
Extract the entity containing failing paths by dragging it
outside of the top-level entity window.
If there are no failing paths between the extracted
entity and the top-level entity, right-click the extracted entity, and then
click Create Design Partition to
place that entity in its own partition.
Keep failing paths within a partition, so that there are no
failing paths crossing between partitions.
If you are unable to isolate the failing paths from an
extracted entity so that none are crossing partition boundaries, return the
entity to its parent without creating a partition.
Find the partition having the worst slack value. For all the
other partitions, preserve the contents and set as
For information about preserving the contents of a partition,
refer to Incremental Block-Based Compilation Flow in the
Quartus® Prime Pro Edition User Guide: Block-Based Design.
Adjust the logic in the partition and rerun the Fitter as
necessary until the partition meets the timing requirements.
Repeat the process for all other design partitions with failing paths.
The initial compilation establishes whether the design achieves a successful fit and meets the specified timing requirements. This section describes how to analyze your design results in the
Quartus® Prime software.
Ignored Timing Constraints
The Quartus® Prime software ignores illegal, obsolete, and conflicting constraints.
You can view a list of ignored constraints in
the Timing Analyzer GUI by clicking Reports > Report Ignored
Constraints or by typing the following command to generate a list of ignored timing
constraints:
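A sketch of such a command, assuming the report_sdc command with its -ignored option. The panel name is an arbitrary label.

```tcl
# Run in the Timing Analyzer Tcl console after the timing netlist is created
# and the SDC files are read.
report_sdc -ignored -panel_name "Ignored Constraints"
```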
The Timing Analyzer supports the Synopsys®
Design Constraints (SDC) format for constraining your design. When
using the Timing Analyzer for timing analysis, use the
set_input_delay constraint to
specify the data arrival time at an input port with respect to a given clock.
For output ports, use the
set_output_delay command to
specify the data arrival time at an output port’s receiver with respect to a
given clock. You can use the
report_timing Tcl command to
generate the I/O timing reports.
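The following SDC and Tcl fragment sketches these constraints and reports. The port names, clock name, period, and delay values are all hypothetical; take the real values from the data sheets of the external devices and from the board design.

```tcl
# Hypothetical 100 MHz clock arriving on port clk.
create_clock -name sys_clk -period 10.000 [get_ports {clk}]

# Data arrival window at the input ports, relative to sys_clk.
set_input_delay -clock sys_clk -max 3.0 [get_ports {data_in[*]}]
set_input_delay -clock sys_clk -min 1.0 [get_ports {data_in[*]}]

# Requirements of the external receiver on the output ports.
set_output_delay -clock sys_clk -max 2.5 [get_ports {data_out[*]}]
set_output_delay -clock sys_clk -min 0.5 [get_ports {data_out[*]}]

# I/O timing reports for the constrained ports.
report_timing -setup -from [get_ports {data_in[*]}] -npaths 10 \
    -detail full_path -panel_name "Input setup"
report_timing -setup -to [get_ports {data_out[*]}] -npaths 10 \
    -detail full_path -panel_name "Output setup"
```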
The I/O paths that do not meet the required
timing performance are reported as having negative slack and are highlighted in
red in the Timing Analyzer
Report pane. In cases
where you do not apply an explicit I/O timing constraint to an I/O pin, the
Quartus® Prime timing analysis software still reports the
Actual number, which is
the timing number that must be met for that timing parameter when the device
runs in your system.
Your design meets timing requirements when you do not have negative
slack on any register-to-register path on any of the clock domains. When timing requirements are not met, a report on the failed paths can uncover more detail.
Displaying Path Reports with the Timing Analyzer
The Timing Analyzer generates reports with information about all
valid register-to-register paths.
To view all timing summaries, double-click
Report All Summaries in the Tasks pane.
If any clock domains have failing paths (highlighted in red in the
Report pane), right-click the clock name
listed in the Clocks Summary pane and select
Report Timing to get more details.
When you select a path in the Summary of
Paths tab, the path detail pane displays all the path information. The
Extra Fitter Information tab offers visual
representation of the path location on the physical device. This can reveal whether the
timing failure is distance related, due to the source and destination node being too
close or too far. The Extra
Fitter Information tab is not available for
Stratix® 10 devices.
The Data Path tab displays the
Data Arrival Path and the Data Required Path. You can determine the path segments
contributing the most to the timing violations with the incremental information. The
Waveform tab shows the signals in the time
domain, and plots the slack between arrival data and required data.
The Technology Map Viewer provides schematic, technology-mapped representations of the design netlist, and can help you to assess which areas in a design can benefit from reducing the number of logic levels. To locate a timing path in one of the viewers, right-click a path in the timing report, point to Locate Path, and select Locate in Technology Map Viewer. You can also investigate the physical layout of a path in detail with the Chip Planner.
When you are analyzing failing paths,
examine the reports and waveforms to determine if the correct constraints are
being applied, and add timing exceptions as appropriate. A multicycle
constraint relaxes setup or hold relationships by the specified number of clock
cycles.
A false path constraint specifies paths that can be ignored
during timing analysis. Both constraints allow the Fitter to work harder on
other paths.
Focus on improving the paths that show the
worst slack. The Fitter works hardest on paths with the worst slack. If you fix
these paths, the Fitter might be able to improve the other failing timing paths in
the design.
Check for nodes that appear in many failing paths.
These nodes are at the top of the list in a timing report panel, along with their
minimum slacks. Look for paths that have common source registers, destination
registers, or common intermediate combinational nodes. In some cases, the registers
are not identical, but are part of the same bus.
In the timing analysis report panels, click
the From or To
headings to sort the paths by source or destination registers. If you see common
nodes, these nodes indicate areas of your design that might be improved through
source code changes or
Quartus® Prime optimization
settings. Constraining the placement for just one of the paths might decrease the
timing performance for other paths by moving the common node further away in the
device.
Tips for Analyzing Failing Clock Paths that Cross Clock Domains
When analyzing clock path failures:
Check whether these paths cross two clock domains.
In paths that cross two clock domains, the From Clock and To Clock in the timing analysis report are different.
Figure 27. Different Value in From Clock and To Clock Field
Check if the design contains paths that involve a different
clock in the middle of the path, even if the source and destination register
clock are the same.
Check whether failing paths between these clock domains need to
be analyzed synchronously.
Set failing paths that are not to be analyzed synchronously as false paths.
When you run report_timing on a design, the report shows the launch clock and latch clock for each failing path. Check whether the relationship between the launch clock and latch clock is realistic and what you expect from your knowledge of the design.
For example, the path can start at a rising edge and end at a falling edge, which reduces the setup relationship by one half clock cycle.
Review the clock skew that appears in the Timing Report:
A large skew may indicate a problem in the design, such as a gated clock, or a problem in the physical layout (for example, a clock using local routing instead of dedicated clock routing). When you have made sure the paths are analyzed synchronously and that there is no large skew on the path, and that the constraints are correct, you can analyze the data path. These steps help you fine tune your constraints for paths across clock domains to ensure you get an accurate timing report.
Check if the PLL phase shift is reducing the setup
requirement. You might adjust this by using PLL parameters and
settings.
Ignore paths that cross clock domains for logic protected with synchronization logic (for example, FIFOs or double-data synchronization registers), even if the clocks are related.
Set false path constraints on all unnecessary paths:
Attempting to optimize unnecessary paths can prevent the
Fitter from meeting the timing requirements on timing paths that are critical to
the design.
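When two whole clock domains only communicate through synchronization logic, one hedged way to express this in SDC is a clock group. The clock names clk_a and clk_b are hypothetical.

```tcl
# Treat the two hypothetical clocks as asynchronous to each other, which
# cuts every path between them in both directions.
set_clock_groups -asynchronous -group {clk_a} -group {clk_b}
```

Use this form only when every crossing between the two domains is protected. Otherwise, set false paths on the individual protected crossings.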
Tips for Analyzing Paths from/to the Source and Destination of Critical Path
When analyzing the failing paths in a design, it is often helpful to
get a fuller picture of the interactions around the paths.
To understand what may be pulling on a critical path, the following
can be useful.
In the project directory, run the report_timing command to find the nodes
in a critical path.
Copy the code below in a .tcl file, and replace the first two
variables with the node names from the From Node and To Node columns of the
worst path. The script analyzes the path between the worst source and
destination registers.
Execute this script in the Timing Analyzer after every compilation,
and add new report_timing
commands as new critical paths appear.
This helps you monitor paths that consistently fail and paths that are only marginal,
so you can prioritize effectively.
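A sketch of the kind of script these steps describe. The two node names are hypothetical placeholders, and the path count and panel names are arbitrary choices.

```tcl
# Replace these two values with the From Node and To Node of the worst path.
set wrst_src {core|stage3_reg}
set wrst_dst {core|stage4_reg}

# Paths that start at the worst source or end at the worst destination.
report_timing -setup -npaths 50 -detail path_only -from $wrst_src \
    -panel_name "Worst Path: wrst_src -> *"
report_timing -setup -npaths 50 -detail path_only -to $wrst_dst \
    -panel_name "Worst Path: * -> wrst_dst"

# Paths that feed the worst source or leave the worst destination.
report_timing -setup -npaths 50 -detail path_only -to $wrst_src \
    -panel_name "Worst Path: * -> wrst_src"
report_timing -setup -npaths 50 -detail path_only -from $wrst_dst \
    -panel_name "Worst Path: wrst_dst -> *"
```

Together, the four reports show which neighboring paths compete with the critical path for the placement of its source and destination registers.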
Global Routing Resources
Global routing resources are designed to
distribute high fan-out, low-skew signals (such as clocks) without consuming regular
routing resources. Depending on the device, these resources can span the entire chip or
a smaller portion, such as a quadrant.
The Quartus® Prime software attempts to assign signals to global routing resources
automatically, but you might be able to make more suitable assignments manually.
For details about the number and types of
global routing resources available, refer to the relevant device handbook.
Check the global signal utilization in your
design to ensure that the appropriate signals have been placed on the global
routing resources. In the Compilation Report, open the Fitter report and click
Analyze the Global & Other Fast Signals and Non-Global High Fan-out Signals
reports to determine whether any changes are required.
You might be able to reduce skew for high
fan-out signals by placing them on global routing resources. Conversely, you
can reduce the insertion delay of low fan-out signals by removing them from
global routing resources. Doing so can improve clock enable timing and control
signal recovery/removal timing, but increases clock skew. Use the
Global Signal setting in
the Assignment Editor to control global routing resources.
Use the following guidelines if your design does
not meet its timing requirements.
Displaying Timing Closure Recommendations for Failing Paths
Use the Timing Closure
Recommendations report to get specific recommendations about failing
paths in your design and changes that you can make to potentially fix the failing paths.
For Stratix® 10 devices, use the Fast Forward Timing Closure Recommendations report.
In the Tasks pane of the
Timing Analyzer, select the Report Timing Closure
Recommendations task to open the Report Timing Closure Recommendations dialog box.
Select paths based on the clock domain, filter by nodes on
path, and choose the number of paths to analyze.
After running the Report Timing
Closure Recommendations task in the Timing Analyzer, examine the
reports in the Report Timing Closure
Recommendations folder in the Report pane of the Timing Analyzer GUI. Each recommendation has
star symbols (*) associated with it. Recommendations with more stars are more
likely to help you close timing on your design.
The reports give you the most probable causes of failure for each
analyzed path, and show recommendations that may help you fix the failing paths.
The reports are organized into sections, depending on the type of
issues found in the design, such as large clock skew, restricted optimizations,
unbalanced logic, skipped optimizations, coding style that has too many levels of
logic between registers, or region or partition constraints specific to your
project.
For detailed analysis of the critical paths, run the report_timing command on specified paths. In the
Extra Fitter Information tab of the
Path report panel, you can see detailed
fitter-related information that may help you visualize the issue. The Extra Fitter
Information tab is not available for
Stratix® 10 devices.
While the Timing Analyzer Report Timing Closure Recommendations task
gives specific recommendations to fix failing paths, the Timing Optimization Advisor
gives more general recommendations to improve timing performance for a design.
The Timing Optimization Advisor guides you in making
settings that optimize your design to meet your timing requirements. To run the Timing
Optimization Advisor, click Tools > Advisors > Timing Optimization Advisor.
The advisor describes many of the suggestions made in this section.
When you open the Timing Optimization Advisor after
compilation, you can find recommendations to improve the timing performance of your
design. If suggestions in these advisors contradict each other, evaluate these options
and choose the settings that best suit the given requirements.
The example shows the Timing Optimization
Advisor after compiling a design that meets its frequency requirements, but
requires setting changes to improve the timing.
Figure 29. Timing Optimization Advisor
When you expand one of the categories in the Timing
Optimization Advisor, such as Maximum Frequency (fmax) or I/O Timing (tsu, tco, tpd), the
recommendations appear in stages. These stages show the order in which to apply the
recommended settings.
The first stage contains the options that are
easiest to change, make the least drastic changes to your design optimization, and have
the least effect on compilation time.
Icons indicate whether each recommended setting has
been made in the current project. In the figure, the checkmark icons in the list of
recommendations for Stage 1 indicate recommendations that are already implemented. The
warning icons indicate recommendations that are not followed for this compilation. The
information icons indicate general suggestions. For these entries, the advisor does not
report whether these recommendations were followed, but instead explains how you can
achieve better performance. For a legend that provides more information for each icon,
refer to the “How to use” page in the Timing Optimization Advisor.
Each recommendation provides a link to the
appropriate location in the
Quartus® Prime GUI where you
can change the settings. For example, consider the Synthesis Netlist Optimizations page of
the Settings dialog
box or the Global Signals category
in the Assignment Editor. This approach provides the most control over which
settings are made and helps you learn about the settings in the software. When
available, you can also use the Correct the Settings button to automatically make the suggested change
to global settings.
For some entries in the Timing Optimization Advisor,
a button allows you to further analyze your design and see more information. The advisor
provides a table with the clocks in the design, indicating whether they have been
assigned a timing constraint.
Optional Fitter Settings
This section focuses only on the optional
timing-optimization Fitter settings, which are the Optimize Hold Timing, Optimize Multi-Corner Timing,
and Fitter Aggressive Routability Optimizations settings.
The settings that best optimize different designs might vary. The group of settings that
work best for one design does not necessarily produce the best result for another
design.
The Optimize Hold Timing option directs the
Quartus® Prime software to optimize minimum delay timing constraints.
When you turn on Optimize Hold Timing in the Advanced Fitter Settings dialog
box, the Quartus® Prime software adds delay to paths to
ensure that your design meets the minimum delay requirements. If you select I/O Paths and Minimum TPD
Paths, the Fitter works to meet the following criteria:
Hold times (tH) from the
device input pins to the registers
Minimum delays from I/O pins to I/O registers or from I/O registers to I/O pins
Minimum clock-to-out time (tCO) from registers to output pins
If you select All Paths, the Fitter also works to meet hold
requirements from registers to registers, as highlighted in blue in the figure, in which
a derived clock generated with logic causes a hold time problem on another register.
Figure 30. Optimize Hold Timing Option Fixing an
Internal Hold Time Violation
However, if your design still has internal hold
time violations between registers, you can manually add delays by instantiating LCELL
primitives, or by making changes to your design, such as using a clock enable signal
instead of a derived or gated clock.
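The option can also be set in the .qsf file. The fragment below is a sketch that assumes the All Paths choice.

```tcl
# QSF fragment: ask the Fitter to fix hold violations on all paths.
# The other documented choice is "IO PATHS AND MINIMUM TPD PATHS".
set_global_assignment -name OPTIMIZE_HOLD_TIMING "ALL PATHS"
```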
The Fitter Aggressive Routability
Optimizations logic option allows you to specify whether the Fitter
aggressively optimizes for routability. Performing aggressive routability
optimizations may decrease design speed, but may also reduce routing wire usage
and routing time.
This option is useful if routing resources are
resulting in no-fit errors, and you want to reduce routing wire use.
The table lists the settings for the Fitter Aggressive Routability
Optimizations logic option.
Always: The Fitter always performs
aggressive routability optimizations. If you set the Fitter
Aggressive Routability Optimizations logic option to
Always, reducing wire utilization may affect
the performance of your design.
Never: The Fitter never performs
aggressive routability optimizations. If improving timing is more
important than reducing wire usage, then set this option to
Never.
Automatically: The Fitter performs aggressive
routability optimizations automatically, based on the routability and
timing requirements of the design. If improving timing is more important
than reducing wire usage, then set this option to
Automatically or Never.
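As a sketch, the corresponding .qsf assignment for a design where timing matters more than wire usage could look like this.

```tcl
# QSF fragment: accepted values are ALWAYS, AUTOMATICALLY, and NEVER.
set_global_assignment -name FITTER_AGGRESSIVE_ROUTABILITY_OPTIMIZATION NEVER
```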
I/O Timing Optimization Techniques
This stage of design optimization focuses
on I/O timing, including setup delay (tSU), hold time (tH),
and clock-to-output (tCO) times.
Before proceeding with I/O timing
optimization, ensure that:
The design's assignments follow the suggestions in the Initial Compilation: Required Settings section of
the Design Optimization Overview chapter.
Resource utilization is satisfactory.
Complete this stage before proceeding to the register-to-register timing optimization stage.
Changes to the I/O paths affect the internal register-to-register timing.
Summary of Techniques for Improving Setup and Clock-to-Output Times
The table lists the recommended order of techniques to reduce tSU and tCO times. Reducing tSU times increases hold (tH) times.
Note: Verify which options are available to each device family.
Table 14. Improving Setup and Clock-to-Output Times
1. Ensure that the appropriate constraints are set for the failing I/Os (refer to Initial Compilation: Required Settings)
2. Use timing-driven compilation for I/O (refer to Fast Input, Output, and Output Enable Registers)
3. Use fast input register (refer to Programmable Delays)
4. Use fast output register, fast output enable register, and fast OCT register (refer to Programmable Delays)
5. Decrease the value of Input Delay from Pin to Input Register or set Decrease Input Delay to Input Register = ON
6. Decrease the value of Input Delay from Pin to Internal Cells or set Decrease Input Delay to Internal Cells = ON
7. Decrease the value of Delay from Output Register to Output Pin or set Increase Delay to Output Pin = OFF (refer to Fast Input, Output, and Output Enable Registers)
8. Increase the value of Input Delay from Dual-Purpose Clock Pin to Fan-Out Destinations (refer to Fast Input, Output, and Output Enable Registers)
9. Use PLLs to shift clock edges
10. Increase the value of Delay to output enable pin or set Increase delay to output enable pin (refer to Use PLLs to Shift Clock Edges)
Optimize IOC Register Placement for Timing Logic Option
This option moves registers into I/O
elements to meet tSU or tCO assignments, duplicating the
register if necessary (as in the case in which a register fans out to multiple output
pins).
This option is turned on by default and is a global
setting.
The Optimize IOC Register Placement for Timing
logic option affects only pins that have a tSU or tCO requirement. Using the I/O register is possible only if the register
directly feeds a pin or is fed directly by a pin. Therefore, this logic option does not
affect registers with any of the following characteristics:
Have combinational logic between the register and the pin
Are part of a carry chain
Have an overriding location assignment
Use the asynchronous load port and the value is not 1
(in device families where the port is available)
Note: To optimize registers with these characteristics, use
Quartus® Prime Fitter optimizations.
You can place
individual registers in I/O cells manually by making fast I/O assignments with the
Assignment Editor.
By default, with correct timing assignments, the Fitter
places the I/O registers in the correct I/O cell or in the core, to meet the performance
requirement.
If the fast I/O setting is on, the register is
always placed in the I/O element. If the fast I/O setting is off, the register is never
placed in the I/O element. This is true even if the Optimize IOC Register Placement for Timing option
is turned on. If there is no fast I/O assignment, the
Quartus® Prime software determines whether to place registers in I/O elements if the
Optimize IOC Register Placement for
Timing option is turned on.
You can also use the four fast I/O options
(Fast Input Register,
Fast Output Register,
Fast Output Enable
Register, and Fast OCT
Register) to override the location of a register that is in a Logic Lock region and force it into an I/O cell. If you apply
this assignment to a register that feeds multiple pins, the Fitter duplicates the register
and places it in all relevant I/O elements.
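A sketch of fast I/O assignments in the .qsf file. The register names are hypothetical.

```tcl
# QSF fragment: force hypothetical registers into I/O cells.
set_instance_assignment -name FAST_INPUT_REGISTER ON -to rx_data_reg
set_instance_assignment -name FAST_OUTPUT_REGISTER ON -to tx_data_reg
set_instance_assignment -name FAST_OUTPUT_ENABLE_REGISTER ON -to tx_oe_reg

# Setting the value to OFF keeps a register out of the I/O element.
set_instance_assignment -name FAST_INPUT_REGISTER OFF -to slow_status_reg
```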
For more information about the Fast Input Register option,
Fast Output Register
option, Fast Output Enable
Register option, and Fast OCT (on-chip termination) Register option, refer to
Quartus® Prime Help.
You can use various programmable delay
options to minimize the tSU and
tCO times.
Programmable delays are advanced options that you use only after you compile a
project, check the I/O timing, and determine that the timing is unsatisfactory.
The Quartus® Prime software automatically
adjusts the applicable programmable delays to help meet timing requirements. For
detailed information about the effect of these options, refer to the device family
handbook or data sheet.
After you have made a programmable delay
assignment and compiled the design, you can view the implemented delay values for every
delay chain and every I/O pin in the Delay Chain Summary section of the Compilation Report.
You can assign programmable delay options to
supported nodes with the Assignment Editor. You can also view and modify the delay chain
setting for the target device with the Chip Planner and Resource Property Editor. When
you use the Resource Property Editor to make changes after performing a full
compilation, recompiling the entire design is not necessary; you can save changes
directly to the netlist. Because these changes are made directly to the netlist, the
changes are not made again automatically when you recompile the design. The change
management features allow you to reapply the changes on subsequent compilations.
Although the programmable delays in newer
devices are user-controllable, Intel
recommends their use for advanced users only. However, the
Quartus® Prime software might use the programmable delays internally during the
Fitter phase.
For details about the programmable delay logic
options available for Intel devices, refer to
Quartus® Prime Help topics:
Using a PLL typically improves I/O timing.
If the timing requirements are still not met, most
devices allow the PLL output to be phase shifted to change the I/O timing.
Shifting the clock backwards gives a better tH at the expense of tSU, while shifting it forward gives a better
tSU at the expense of tH. You can use this technique only in devices
that offer PLLs with the phase shift option.
Figure 31. Shift Clock Edges Forward to Improve
tSU at the Expense of tH
You can achieve the same type of effect in
certain devices by using the programmable delay called
Input Delay from Dual Purpose Clock
Pin to Fan-Out Destinations.
Use Fast Regional Clock Networks and Regional Clock Networks
Regional clocks provide the lowest clock delay and skew for logic contained in a single quadrant.
In general, fast regional clocks have less delay to I/O elements than regional and
global clocks, and are used for high fan-out control signals. Placing clocks on these low-skew
and low-delay clock nets provides better tCO performance.
Intel devices have a variety of hierarchical clock
structures. These include dedicated global clock networks, regional clock networks, fast
regional clock networks, and periphery clock networks. The available resources differ
between the various Intel device families.
For the number of clocking resources available in
your target device, refer to the appropriate device handbook.
Spine Clock Limitations
In projects with
high clock routing demands, limitations in the
Quartus® Prime software can cause spine clock errors.
These errors are often seen with designs
using multiple memory interfaces and high-speed serial interface (HSSI) channels, especially
in PMA Direct mode.
Global clock networks, regional clock networks, and periphery clock networks have an additional level of
clock hierarchy known as spine clocks. Spine clocks drive the final row and column clocks
to their registers; thus, the clock to every register in the chip is reached through spine
clocks. Spine clocks are not directly user controllable.
To reduce these spine clock errors, constrain your
design to use your regional clock resources better:
If your design does not use Logic Lock regions, or if the Logic Lock regions are not aligned to your clock region boundaries, create
additional Logic Lock regions and further constrain the logic.
Periphery features can ignore Logic Lock region assignments, possibly because the global
promotion process is not functioning properly. To ensure that the global promotion
process uses the correct locations, assign specific pins to the I/Os using these
periphery features.
By default, some
Intel® FPGA IP functions apply a global signal assignment with a value of
dual-regional clock. If you constrain your logic to a regional clock region and set the
global signal assignment to Regional instead of Dual-Regional, you can reduce clock resource contention.
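A hedged sketch of such an override in the .qsf file. The clock name mem_if_clk is hypothetical, and the value string is an assumption about how the Regional setting is spelled in your software version; confirm it in the Assignment Editor before use.

```tcl
# QSF fragment: request a regional clock for a hypothetical IP clock
# instead of the dual-regional default applied by the IP.
set_instance_assignment -name GLOBAL_SIGNAL "REGIONAL CLOCK" -to mem_if_clk
```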
The next stage of design optimization
seeks to improve register-to-register (fMAX) timing.
The following sections provide available options if
the design does not meet timing requirements after compilation.
Coding style affects the performance of a design
to a greater extent than other changes in settings. Always evaluate the code and make
sure to use synchronous design practices.
Note: In the context of the Timing Analyzer, register-to-register timing optimization is the same as maximizing the slack on the clock domains in a design. The techniques in this section can improve the slack on different timing paths in the design.
Before performing design optimizations, understand the structure of the design as well as the effects of techniques in different types of logic. Techniques that do not benefit the logic structure can decrease performance.
In many cases, optimizing the design’s
source code can have a very significant effect on your design performance. In
fact, optimizing your source code is typically the most effective technique for
improving the quality of your results and is often a better choice than using
Logic Lock or location assignments.
Be aware of the number of logic levels needed to
implement your logic while you are coding. Too many levels of logic between registers
might result in critical paths failing timing. Try restructuring the design to use
pipelining or more efficient coding techniques. Also, try limiting high fan-out signals
in the source code. When possible, duplicate and pipeline control signals. Make sure the
duplicate registers are protected by a preserve attribute, to avoid merging during
synthesis.
If the critical path in
your design involves memory or DSP functions, check whether you have code
blocks in your design that describe memory or functions that are not being
inferred and placed in dedicated logic. You might be able to modify your source
code to cause these functions to be placed into high-performance dedicated
memory or resources in the target device. When using RAM/DSP blocks, enable the
optional input and output registers.
Ensure that your state machines are recognized
as state machine logic and optimized appropriately in your synthesis tool.
State machines that are recognized are generally optimized better than if the
synthesis tool treats them as generic logic. In the
Quartus® Prime software, you
can check the State Machine report under
Analysis & Synthesis
in the Compilation Report. This report provides details, including state
encoding for each state machine that was recognized during compilation. If your
state machine is not recognized, you might have to change your source code to
enable it to be recognized.
The choice of options and settings to
improve the timing margin (slack) or to improve register-to-register timing depends on
the failing paths in the design.
To achieve the results that best
approximate your performance requirements, apply the following techniques and compile the
design after each step:
Ensure that your timing assignments
are complete and correct. For details, refer to the Initial Compilation: Required Settings section in the Design Optimization Overview chapter.
Review all warning messages from your
initial compilation and check for ignored timing assignments.
Apply netlist synthesis optimization options.
To optimize for speed, apply the
following synthesis options:
Optimize Synthesis for Speed, Not Area
Flatten the Hierarchy During Synthesis
Set the Synthesis Effort to High
Prevent Shift Register Inference
Use Other Synthesis Options Available in Your Synthesis Tool
To optimize for performance, turn on Advanced Physical Optimization.
Try different Fitter seeds. If only a
small number of paths are failing by small negative slack, try
a different seed to find a fit that meets the constraints.
Note: Omit this step if a large number of
critical paths are failing, or if the paths are failing by a large margin.
To control placement, make Logic Lock assignments.
Modify your design source code to fix
areas of the design that still fail timing requirements by significant
margins.
Make location assignments, or as a
last resort, perform manual placement by back-annotating the design.
You can use Design Space Explorer II (DSE) to automate the
process of running different compilations with different settings.
If these techniques do not achieve performance requirements,
additional design source code modifications might be required.
The Quartus® Prime software offers physical synthesis optimizations that can help improve design performance regardless of the synthesis tool.
You can apply physical synthesis optimizations both during synthesis and during fitting.
During the synthesis stage of the
Quartus® Prime compilation, physical synthesis optimizations operate either on
the output from another EDA synthesis tool, or as an intermediate step in synthesis.
These optimizations modify the synthesis netlist to improve either area or speed,
depending on the technique and effort level you select.
To view and modify the synthesis netlist optimization options, click
Assignments > Settings > Compiler Settings > Advanced Settings (Fitter).
If you use a third-party EDA synthesis tool and want to determine if the
Quartus® Prime software can remap the circuit to improve performance, use the Perform WYSIWYG Primitive Resynthesis option. This option directs the
Quartus® Prime software to un-map the LEs in an atom netlist to logic gates, and then map the gates back to Intel-specific primitives. Intel-specific primitives enable the Fitter to remap the circuits using architecture-specific techniques.
The Quartus® Prime technology mapper
optimizes the design to achieve maximum speed performance or minimum area usage, or to
balance high performance and minimal logic usage, according to the setting of the
Optimization Technique option. Set this
option to Speed or Balanced.
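The two synthesis options above can also be set from a script. The following is a minimal sketch, assuming that these .qsf variable names correspond to the Perform WYSIWYG Primitive Resynthesis and Optimization Technique options in your software version; confirm the names and legal values in the settings reference before use.

```tcl
# Re-map a third-party atom netlist to device primitives
# (assumed to correspond to Perform WYSIWYG Primitive Resynthesis).
set_global_assignment -name ADV_NETLIST_OPT_SYNTH_WYSIWYG_REMAP ON

# Direct the technology mapper toward speed; BALANCED is the alternative
# that this section recommends.
set_global_assignment -name OPTIMIZATION_TECHNIQUE SPEED
```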
During the Fitter stage of the
Quartus® Prime compilation, physical synthesis optimizations make placement-specific changes to the
netlist that improve speed performance results for the specific Intel device.
Design performance varies depending on coding
style, synthesis tool used, and options you specify when synthesizing. Change your
synthesis options if a large number of paths are failing, or if specific paths fail by a
great margin and have many levels of logic.
Identify the default optimization targets of your
synthesis tool, and set your device and timing constraints accordingly. For example, if
you do not specify a target frequency, some synthesis tools optimize for area.
You can specify logic options for specific modules in
your design with the Assignment Editor while leaving the default Optimization Technique setting
at Balanced (for the
best trade-off between area and speed for certain device families) or Area (if area is an important
concern). You can also use the Speed Optimization Technique for Clock Domains option in the Assignment
Editor to specify that all combinational logic in or between the specified clock domains
is optimized for speed.
To achieve best performance with push-button compilation, follow the
recommendations in the following sections for other synthesis settings. You can use DSE
II to experiment with different
Quartus® Prime synthesis options to optimize your design for
the best performance.
Synthesis tools typically let you
preserve hierarchical boundaries, which can be useful for verification or other
purposes. However, the best optimization results generally occur when the
synthesis tool optimizes across hierarchical boundaries, because doing so often allows the
synthesis tool to perform the most logic minimization, which can improve performance.
Whenever possible, flatten your design hierarchy to achieve the best results.
Set the Synthesis Effort to High
Synthesis tools offer varying synthesis effort
levels to trade off compilation time with synthesis results. Set the synthesis effort to
high to achieve best
results when applicable.
Duplicate Registers for Fan-Out Control
Often, timing failures can occur due to the
influence of signals that are not directly involved in the failing transfers. This
condition tends to manifest when off-critical nets, most commonly with a high fan-out,
span a large distance and consequently warp the optimization of other paths around
them. Duplicating the sources of these globally influential signals can help to disperse
them across many hops, or even across many clock cycles, and focus more on local
optimization.
For example, by duplicating a high fan-out signal in the form of a tree of registers, you can
disperse the signal over several clock cycles. As the signal progresses down the tree,
it progressively feeds more into local copies of the original registers, such that any
individual register's destinations are well-localized and its influence on register
optimization is minimal. The key to this optimization is to determine how to assign the
original signal’s fan-outs among the duplicates. If any individual register must drive
signals over a large distance, the benefit of the tree can be lost.
You can manually create a register tree and group the endpoints in the RTL by leveraging your
system-level knowledge about how best to disperse the signal throughout your design, but
it can be time consuming and have a widespread impact. For more information about
manually creating a register tree, refer to Manual Register Duplication.
You can create register trees automatically in one of the following ways. Each method has
its own methodology to determine the number of duplicates to create and how to assign
the fan-outs between the duplicates:
Synthesis tools support options or attributes that
specify the maximum fan-out of a register. When using
Quartus® Prime synthesis, you can set the Maximum Fan-Out logic option in the
Assignment Editor to control the number of destinations for a node so that the fan-out count
does not exceed a specified value. You can also use the maxfan attribute in your HDL code. The software
duplicates the node as required to achieve the specified maximum fan-out.
Logic duplication using Maximum Fan-Out assignments normally increases
resource utilization, and can potentially increase compilation time, depending on the
placement and the total resource usage within the selected device.
The improvement in timing performance that results
from Maximum Fan-Out
assignments is design-specific. This is because when you use the Maximum Fan-Out assignment, the Fitter duplicates the
source logic to limit the fan-out, but does not control the destinations that each of the
duplicated sources drive. Therefore, it is possible for duplicated source logic to be driving
logic located all around the device. To avoid this situation, you can use the Manual Logic Duplication logic
option.
If you are using Maximum Fan-Out assignments, benchmark your design
with and without these assignments to evaluate whether they give the expected improvement in
timing performance. Use the assignments only when you get improved results.
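For benchmarking, a Maximum Fan-Out assignment can be toggled from a script. This is a minimal sketch, assuming MAX_FANOUT is the .qsf variable behind the Maximum Fan-Out logic option; the register path and the limit of 64 are hypothetical.

```tcl
# Limit a hypothetical high fan-out register to at most 64 destinations.
# The software duplicates the node as needed to respect the limit.
set_instance_assignment -name MAX_FANOUT 64 -to "top|u_ctrl|enable_reg"
```

Compile once with and once without the assignment, and keep it only if the timing results improve.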
You can manually duplicate registers in the
Quartus® Prime software regardless of the synthesis tool used.
To duplicate a register, apply the Manual
Logic Duplication logic option to the register with the Assignment Editor.
Fitter optimizations may cause a small violation to the Maximum Fan-Out assignments to improve timing.
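The parameter descriptions that follow refer to the DUPLICATE_REGISTER instance assignment. The sketch below shows its likely form, assuming the same argument order as the DUPLICATE_HIERARCHY_DEPTH examples in this section. The register path and the count of 8 are hypothetical.

```tcl
# General form (angle-bracket fields are placeholders):
#   set_instance_assignment -name DUPLICATE_REGISTER -to <register_name> <num_duplicates>

# Hypothetical example: split one reset register into 8 copies,
# counting the original register as one of the 8.
set_instance_assignment -name DUPLICATE_REGISTER -to "top|u_reset|rst_q" 8
```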
register_name is the register to duplicate. To create a register tree
from a chain, create a unique assignment for each register in the chain.
DUPLICATE_REGISTER assignments are processed in the appropriate order if
they apply to registers that drive each other in a chain.
num_duplicates is the number of duplicates of the register to create
(including the original). If the original signal has M fan-outs and num_duplicates is N, the average
fan-out of each duplicate is M/N, but any individual duplicate may
have more or fewer, at the discretion of the algorithm.
The DUPLICATE_REGISTER assignment is processed during the Fitter stage. It
is necessary to create the duplicates and assign fan-outs between the duplicates based on
early estimates of physical proximity to maximize the amount of time spent optimizing the
design post-duplication. However, this renders fine-grained assignment decisions imprecise.
The DUPLICATE_REGISTER assignment is best used when the number of duplicates
is small (under 100) and the groups created are coarse-grained enough to allow for flexibility
during optimization after the duplicates are created.
The Fitter Duplication Summary panel of the Fit report details the
DUPLICATE_REGISTER assignments picked up by
Quartus® Prime Pro Edition. It also summarizes any registered signal with greater than 1000
fan-outs, as they could be reasonable candidates for DUPLICATE_REGISTER
assignments in the future.
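The DUPLICATE_HIERARCHY_DEPTH assignment described next takes a target register and a level count. The sketch below shows the general form, inferred from the regZ examples later in this section. The register path and the value 2 are hypothetical.

```tcl
# General form (angle-bracket fields are placeholders):
#   set_instance_assignment -name DUPLICATE_HIERARCHY_DEPTH -to <register_name> <num_levels>

# Hypothetical example: target the last register of a reset pipeline and
# allow up to 2 registers of the chain to be duplicated down the hierarchy.
set_instance_assignment -name DUPLICATE_HIERARCHY_DEPTH -to "top|u_reset|pipe_last" 2
```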
register_name is the last register in a chain that fans out to multiple
hierarchies. To create a register tree, ensure that there are sufficient simple registers
behind the node and those simple registers are automatically pulled into the tree.
num_levels corresponds to the upper bound of the number of registers that
exist in the chain to use for duplicating down the hierarchies.
The DUPLICATE_HIERARCHY_DEPTH assignment is
processed during the Synthesis stage. It is common for high-fanout signals to go through a
pipeline of registers and drive into a sub-hierarchy of modules. For example, a system-wide
reset can be propagated over several clock cycles and driven into many modules across the
design. In several scenarios, it is useful to take advantage of the structure of this
sub-hierarchy to infer the structure of the register tree to be created, such that endpoints
within similar hierarchies are assigned the same copy of the signal, and branches in the
design hierarchy dictate where to place branches in the register tree.
Consider the following example illustration of a netlist with a register
chain and hierarchical organization of the endpoints it drives. The DUPLICATE_HIERARCHY_DEPTH assignment duplicates the pipeline registers across
hierarchies, as shown in Figure 35.
Figure 32. Original Diagram Showing Four Pipeline Registers Connected to
In this case, regZ is the appropriate
assignment target as it is the endpoint in a chain of four registers. There is a maximum of
three duplication candidates in this example (regZ, regY, and regX), so the
assignment value can be anywhere between 1 and 3. regA is
not pulled into the hierarchy to preserve the timing and optimization of paths that precede
it. The DUPLICATE_HIERARCHY_DEPTH assignment is best used
when a signal must be duplicated to more than 100 duplicates and the sub-hierarchy below the
chain is deep and meaningful enough to guide the structure of the tree required.
Figure 33. Netlist After Duplicating regZ to Hierarchy Level One
set_instance_assignment -name DUPLICATE_HIERARCHY_DEPTH -to regZ
When num_levels is set to 1, only regZ is pulled out of
the chain and pushed down one hierarchy level into its fan-out tree.
Figure 34. Netlist After Duplicating regZ to Hierarchy Level Two
set_instance_assignment -name DUPLICATE_HIERARCHY_DEPTH -to regZ
When num_levels is set to 2, both regY and regZ are pulled out of
the chain. regZ ends up at a maximum hierarchy depth two
and regY ends up at hierarchy depth one.
When num_levels is set to 3, all three registers (regZ,
regY, and regX) are pulled out of the chain and pushed to
a maximum hierarchy depth of three, two, and one levels, respectively.
The Hierarchical Tree Duplication Summary panel in the Synthesis
report provides information on the registers specified by the
DUPLICATE_HIERARCHY_DEPTH assignment. It also includes a reason for the
chain length that can be used as a starting point for further improvements with the
assignment. The Synthesis report also provides a panel named Hierarchical Tree
Duplication Details, which provides information about the individual registers
in the chain that can be used to better understand the structure of the implemented
register tree.
Prevent Shift Register Inference
Turning off the inference of shift registers
can increase performance.
This setting forces the software to use logic cells to
implement the shift register, instead of using the ALTSHIFT_TAPS IP core to implement the registers in memory blocks. If you implement shift registers in logic cells instead
of memory, logic utilization increases.
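Shift register inference can be turned off from a script. This is a minimal sketch, assuming AUTO_SHIFT_REGISTER_RECOGNITION is the .qsf variable that controls this inference in your software version; the instance path is hypothetical.

```tcl
# Turn off shift register inference for the whole design.
set_global_assignment -name AUTO_SHIFT_REGISTER_RECOGNITION OFF

# Alternatively, restrict the change to one hypothetical block so that
# logic utilization grows only where the timing benefit is needed.
set_instance_assignment -name AUTO_SHIFT_REGISTER_RECOGNITION OFF -to "top|u_delay_line"
```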
Use Other Synthesis Options Available in Your Synthesis Tool
With your synthesis tool, experiment with the following options if they are available:
Turn on register balancing or retiming
Turn on register pipelining
Turn off resource sharing
These options can increase performance, but typically increase the resource utilization of your design.
The Fitter seed affects the initial
placement configuration of the design.
Any change in the initial conditions
changes the Fitter results; accordingly, each seed value results in a somewhat different
fit. You can experiment with different seeds to attempt to obtain better fitting results and
timing performance.
Changes in the design impact performance between compilations. This random
variation is inherent in placement and routing algorithms; it is impossible to try all
seeds and get the absolute best result.
Any design change that directly or indirectly affects the Fitter has the same type of random
effect as changing the seed value. This includes any change in source files, Compiler Settings or Timing Analyzer Settings. The same effect can
appear if you use a different computer processor type or different operating system,
because different systems can change the way floating point numbers are calculated in
the Fitter.
If a change in optimization settings marginally
affects the register-to-register timing or number of failing paths, you cannot always be
certain that your change caused the improvement or degradation, or whether it is due to
random effects in the Fitter. If your design is still changing, running a seed sweep
(compiling your design with multiple seeds) determines whether the average result
improved after an optimization change, and whether a setting that increases compilation
time has benefits worth the increased time, such as with physical synthesis settings.
The sweep also shows the amount of random variation to expect for your design.
If your design is finalized you can compile
your design with different seeds to obtain one optimal result. However, if you
subsequently make any changes to your design, you might need to perform a seed sweep
again.
Click Assignments > Compiler Settings to control the initial placement with the seed. You can use the DSE II to
perform a seed sweep easily.
You can also specify the Fitter seed with a Tcl assignment.
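The following sketch shows one way to script a small seed sweep with the Tcl flow package. It assumes SEED is the .qsf variable for the Fitter seed. The project name, revision name, script name, and seed values are hypothetical placeholders.

```tcl
# Hypothetical seed sweep; run with: quartus_sh -t seed_sweep.tcl
load_package flow

# Project and revision names are placeholders.
project_open my_project -revision my_revision

foreach seed {1 2 3 4 5} {
    # Set the Fitter seed for this compilation.
    set_global_assignment -name SEED $seed
    export_assignments
    execute_flow -compile
    # Each compilation overwrites the previous reports, so save any
    # reports you want to compare before the next iteration.
}

project_close
```

For anything beyond a quick experiment, DSE II handles result collection and comparison for you.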
To improve routability in designs where
the router did not pick up the optimal routing lines, set the Router Timing Optimization Level to Maximum.
This setting determines how aggressively the router tries to meet the timing requirements.
Setting this option to Maximum can marginally increase design speed at the cost of increased
compilation time. Setting this option to Minimum can reduce compilation time at the cost of marginally reduced
design speed. The default value is Normal.
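As a scripted equivalent, the following sketch assumes ROUTER_TIMING_OPTIMIZATION_LEVEL is the .qsf variable behind the Router Timing Optimization Level option:

```tcl
# Ask the router to work harder on timing, at the cost of compilation time.
# Per the text above, the other settings are Normal (default) and Minimum.
set_global_assignment -name ROUTER_TIMING_OPTIMIZATION_LEVEL MAXIMUM
```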
If a small number of
paths are failing to meet their timing requirements, you can use hard location assignments
to optimize placement.
Location assignments are less flexible for the
Quartus® Prime Fitter
than Logic Lock assignments. Additionally, if you are
familiar with your design, you can enter location constraints in a way that produces better
results.
Note: Improving fitting results, especially for larger devices, such as
Stratix® series devices, can be
difficult. Location assignments do not always improve the performance of the design. In
many cases, you cannot improve upon the results from the Fitter by making location
assignments.
Metastability Analysis and Optimization Techniques
Metastability problems can occur when a
signal is transferred between circuitry in unrelated or asynchronous clock domains,
because the designer cannot guarantee that the signal meets its setup and hold time
requirements.
The mean time between failures (MTBF) is an estimate of the
average time between instances when metastability could cause a design failure.
You can use the
Quartus® Prime software to analyze the average MTBF due to metastability when a
design synchronizes asynchronous signals and to optimize the design to improve the MTBF.
These metastability features are supported only for designs constrained with the
Timing Analyzer, and for select device families.
If the MTBF of your design is low, refer to the
Metastability Optimization section in the Timing Optimization Advisor, which suggests
various settings that can help optimize your design in terms of metastability.
This chapter describes how to enable
metastability analysis and identify the register synchronization chains in your design,
provides details about metastability reports, and provides additional guidelines for
managing metastability.
Note: This section applies only to designs targeting the
Stratix® 10 family of devices. Other families do not have the
capabilities described in this section.
In traditional FPGA timing closure flows, the starting point for most
design analysis is the critical path. Due to the nature of
Stratix® 10 devices and the availability of the Hyper Retimer, it is best to
start your timing closure activities from the Retiming Limit Report. You want to give the
tool as many optimization opportunities as possible before having to look into more time-intensive
and potentially manual timing closure techniques.
Use the Retiming Limit Details report to get specific information
on what is currently limiting the Hyper Retimer from performing more
optimizations.
The Retiming Limit Details report specifies:
Clock Transfer: Clock domain, or the clock domain transfer for which
the critical chain applies
Limiting Reason: Design conditions which prevent further
optimizations from happening.
Critical Chain Details: Timing paths associated with the timing
Using the Retiming Limit Details Report
To access the Retiming Limit Details report:
In the Reports tab,
double-click Retiming Limit Details under
Fitter > Retime Stage.
To locate the critical chain in the Technology Map Viewer,
right-click any path and click Locate Critical Chain
in Technology Map Viewer.
The Technology Map Viewer displays a schematic representation
of the complete critical chain after place, route and register retiming.
Figure 36. Critical Chain in Technology Map Viewer
Fast Forward Timing Closure Recommendations
When running Fast Forward compilation, the Compiler removes signals from
registers to allow mobility within the netlist for subsequent retiming. Fast Forward
compilation generates design-specific timing closure recommendations, and predicts maximum
performance with removal of all timing restrictions.
After you complete Fast Forward explorations, you can determine which
recommendations to implement to provide the most benefit. Implement appropriate
recommendations in your RTL, and recompile the design to achieve the performance levels that
Fast Forward reports.
The Fast Forward Details Report provides the following information:
Table 15. Fast Forward Details Report Information
Displays the various Fast Forward optimization steps, starting
from the pre-optimization base compilation.
Each step comes with its associated critical chain.
Each step corresponds to a new optimization cumulative to the
Fast Forward Optimization
Analyzed Summary of the optimizations necessary to implement each
Estimated fMAX performance after you
implement the recommendations for this step in your design. This is cumulative, and
step n represents the potential fMAX after
implementing all previous steps.
(cumulative) List of all the consecutive optimization steps
Recommendation for Critical Chain
Lists recommended changes to your designs. These recommendations
are geared towards removing retiming limitations, and allowing register
Generating Fast Forward Timing Closure Recommendations
To generate Fast Forward timing closure recommendations:
On the Compilation Dashboard, click Fast Forward Timing Closure Recommendations.
The Compiler runs prerequisite synthesis or Fitter
stages as needed, and generates timing closure recommendations in the
Compilation Report.
View timing closure recommendations in the Compilation Report to
evaluate design performance, and implement key RTL performance
improvements.
The Quartus® Prime Pro Edition software allows you to automate or refine Fast Forward analysis:
To run Fast Forward compilation during each full compilation,
click Assignments > Settings > Compiler Settings > HyperFlex, and turn on Run Fast Forward Timing
Closure Recommendations during compilation.
To modify how Fast Forward compilation interprets specific I/O
and block types, click Assignments > Settings > Compiler Settings > HyperFlex Advanced Settings.
Implementing Fast Forward Recommendations
After implementing timing closure recommendations in your design, you can
rerun the Retime stage to obtain the predicted performance gains.
You can continue exploring performance and implementing RTL changes to your
code until you reach the desired performance target. Once you have completed all the
modifications you want to do, continue your timing closure activities with the traditional
techniques explained in this document.
For more information about implementing Fast Forward timing closure
recommendations in your design, refer to the Implement Fast Forward
Recommendations section of the
Stratix® 10 High-Performance Design Handbook
Periphery to Core Register Placement and Routing Optimization
The Periphery to Core Register Placement and Routing Optimization
(P2C) option specifies whether the Fitter performs targeted placement and routing optimization
on direct connections between periphery logic and registers in the FPGA core.
P2C is an
optional pre-routing-aware placement optimization stage that enables you to more reliably
achieve timing closure.
Note: The Periphery to Core Register Placement and Routing Optimization
option applies in both directions, periphery to core and core to periphery.
Transfers between external interfaces (for example, high-speed I/O or serial
interfaces) and the FPGA often require routing many connections with tight setup and hold
timing requirements. When this option is turned on, the Fitter performs P2C placement and
routing decisions before those for core placement and routing. This reserves the necessary
resources to ensure that your design achieves its timing requirements and avoids routing
congestion for transfers with external interfaces.
This option is available as a global assignment, or can be applied to specific
instances within your design.
Figure 37. Periphery to Core Register Placement and Routing Optimization (P2C) Flow
P2C runs after periphery placement, and generates placement for core registers on
corresponding P2C/C2P paths, and core routing to and from these core registers.
Setting Periphery to Core Optimizations in the Advanced Fitter Setting Dialog Box
The Periphery to Core Placement and Routing
Optimization setting specifies whether the Fitter optimizes targeted
placement and routing on direct connections between periphery logic and registers in the
FPGA core.
You can optionally perform periphery to core optimizations by instance with
settings in the Assignment Editor.
In the Quartus® Prime software,
click Assignments > Settings > Compiler Settings > Advanced Settings (Fitter).
In the Advanced Fitter Settings dialog box, for the
Periphery to Core Placement and Routing
Optimization option, select one of the following options
depending on how you want to direct periphery to core optimizations in your
design:
Select Auto to
direct the software to automatically identify transfers with tight
timing windows, place the core registers, and route all connections to
or from the periphery.
Select On to
direct the software to globally optimize all transfers between the
periphery and core registers, regardless of timing requirements.
Note: Setting this option to On in the
Advanced Fitter Settings is not
recommended. The intended use for this setting is in the Assignment
Editor to force optimization for a targeted set of nodes or
instances.
Select Off to disable periphery to
core path optimization in your design.
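The Auto, On, and Off choices above can be sketched as script assignments. The .qsf variable name below is an assumption based on the option name; confirm it in the settings reference for your software version. The node path is hypothetical.

```tcl
# Global setting: let the Fitter identify tight periphery/core transfers itself.
# Assumed variable name; verify before use.
set_global_assignment -name PERIPHERY_TO_CORE_PLACEMENT_AND_ROUTING_OPTIMIZATION AUTO

# Instance setting: force optimization only for one hypothetical core register,
# which is the intended use of On according to the note above.
set_instance_assignment -name PERIPHERY_TO_CORE_PLACEMENT_AND_ROUTING_OPTIMIZATION ON -to "top|u_rx|capture_reg[0]"
```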
Setting Periphery to Core Optimizations in the Assignment Editor
When you turn on the Periphery to Core
Placement and Routing Optimization (P2C/C2P) setting in the Assignment
Editor, the Quartus® Prime software performs periphery to
core, or core to periphery optimizations on selected instances in your design.
You can optionally perform periphery to core optimizations by instance with
settings in the Advanced Fitter Settings dialog box.
In the Quartus® Prime software,
click Assignments > Assignment Editor.
For the selected path, double-click the Assignment Name column, and then click the
Periphery to core register placement and routing
optimization option in the drop-down list.
In the To column, choose either a periphery node or
core register node on a P2C/C2P path you want to optimize. Leave the
From column empty.
For paths to appear in the Assignment Editor, you must first run
Analysis & Synthesis on your design.
Viewing Periphery to Core Optimizations in the Fitter Report
The Quartus® Prime software generates a
periphery to core placement and routing optimization summary in the Fitter (Place & Route) report after
compilation.
In the Tasks pane, select Compilation.
Under Fitter (Place & Route), double-click
In the Fitter folder, expand the Place
Double-click Periphery to Core Transfer Optimization
Table 16. Fitter Report - Periphery to Core Transfer
Optimization (P2C) Summary
Placed and Routed: Core register is locked. Periphery to
core/core to periphery routing is committed.
Placed but not Routed: Core register is locked. Routing is
not committed. This occurs when P2C is not able to optimize
all targeted paths within a single group, for example, the
same delay/wire requirement, or the same control signals.
Partial P2C routing commitments may cause unresolvable
routing congestion.
Not Optimized: This occurs when P2C is set to
Auto and the path
is not optimized due to one of the following issues:
The delay requirement is impossible to meet.
The minimum delay requirement (for
hold timing) is too large. The P2C algorithm cannot
efficiently handle cases when many wires need to be
added to meet hold timing.
P2C encountered unresolvable routing
congestion for this particular path.
You can run procedures and make settings described in this manual
in a Tcl script. You can also run procedures at a command prompt.
For detailed information about scripting command options, refer to the
Quartus® Prime command-line and Tcl API Help browser. To run the Help browser, type
quartus_sh --qhelp at the command prompt.
You can specify many of the options described in this section either in an instance, or at a
global level, or both.
Use a Tcl assignment command to make a global or instance assignment. Specify the
Quartus® Prime Settings File (.qsf) variable name in the Tcl assignment to make the setting along with
the appropriate value.
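The two assignment forms can be sketched as follows. The angle-bracket fields are placeholders, and the concrete variable names, values, and instance path are illustrative assumptions rather than recommended settings.

```tcl
# Global form:
#   set_global_assignment -name <.qsf variable name> <value>
# Instance form:
#   set_instance_assignment -name <.qsf variable name> <value> -to <instance name>

# Illustrative global assignment; quote values that contain spaces.
set_global_assignment -name OPTIMIZATION_MODE "HIGH PERFORMANCE EFFORT"

# Illustrative instance assignment on a hypothetical register.
set_instance_assignment -name MAX_FANOUT 32 -to "top|u_core|busy_reg"
```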
The Type column indicates whether the setting is
supported as a global setting, an instance setting, or both.
The top table lists the .qsf variable name and applicable values
for the settings described in the Initial Compilation: Required
Settings section in the Design Optimization Overview
chapter. The bottom table lists the advanced compilation settings.
The following revision
history applies to this chapter:
Added more information about register duplication methods in
Duplicate Logic for Fan-out Control topic.
Moved content related to manual register duplication from
Duplicate Logic for Fan-out Control topic to a
newly created sub-topic Manually Adding Duplicate
Added Automatic Register Duplication:
Estimated Physical Proximity and Automatic
Register Duplication: Hierarchical Proximity as new sub-topics under
Duplicate Logic for Fan-out Control to describe
automatic register duplication process.
Updated "Placement Effort Multiplier" figure and text
descriptions in "Adjust Placement Effort" topic.
Updated "Fitter Effort" figure and text descriptions in
"Adjust Fitter Effort" topic.
Updated "Optimize Hold Timing Option" screenshot in "Wires Added for Hold"
Removed duplicated topic: Resource
Utilization Optimization Techniques. The topic is now in the Area Optimization chapter.
Removed reference to unsupported CARRY and
CASCADE buffers from "Optimize IOC Register Placement for Timing Logic Option"
Added support for
Stratix® 10 Hyper-Retiming, Fast Forward compilation, and Fast Forward
Added topics: Critical Chains, Viewing Critical Chains,
Intel Stratix 10 Timing Closure Recommendations, Retiming Limit Details
Report, Using the Retiming Limit Details Report, Fast Forward Timing Closure
Recommendations, Generating Fast Forward Timing Closure Recommendations,
Implementing Fast Forward Recommendations.
Added topic about using partitions to
achieve timing closure.
Moved Topic: Design Evaluation for Timing Closure after
Initial Compilation: Optional Fitter Settings.
Removed statement about applying physical
synthesis optimizations in a portion of a design.
Removed references to optimizing hold
timing for selected paths.
Updated logic options about resource utilization optimization
Added topic: Critical Paths.
Timing and renamed to Register-to-Register Timing
Renamed topic: Timing Analysis with the
Timing Analyzer to Displaying Path Reports with
the Timing Analyzer.
Removed (LUT-Based Devices) remark from topic titles.
Renamed topic: Optimizing Timing
(LUT-Based Devices) to Timing
Renamed topic: Debugging Timing Failures
in the Timing Analyzer to Displaying Timing
Closure Recommendations for Failing Paths.
Multi-Corner Timing and Fitter Aggressive Routability Optimization.
Analysis with the Timing Analyzer to show how to access the Report All Summaries command.
Timing Constraints to include a help link to Fitter Summary
Reports with the Ignored Assignment
Renamed chapter title
from Area and Timing Optimization to Timing Closure and Optimization.
Removed design and
area/resources optimization information.
Added the following
Fitter Aggressive Routability Optimization.
Tips for Analyzing Paths from/to the Source and Destination
of Critical Path.
Tips for Locating Multiple Paths to the Chip Planner.
Tips for Creating a .tcl Script to Monitor Critical Paths
Compilation: Optional Fitter Settings” on page 13–2, “I/O Assignments” on
page 13–2, “Initial Compilation: Optional Fitter Settings” on page 13–2,
“Resource Utilization” on page 13–9, “Routing” on page 13–21, and “Resolving
Resource Utilization Problems” on page 13–43.
Multi-Corner Timing” on page 13–6, “Resource Utilization” on page 13–10, “Timing
Analysis with the Timing Analyzer” on page 13–12, “Using the Resource
Optimization Advisor” on page 13–15, “Increase Placement Effort Multiplier” on
page 13–22, “Increase Router Effort Multiplier” on page 13–22 and “Debugging
Timing Failures in the Timing Analyzer” on page 13–24.
Minor text edits
throughout the chapter.
Updated the “Timing
Requirement Settings”, “Standard Fit”, “Fast Fit”, “Optimize Multi-Corner
Timing”, “Timing Analysis with the Timing Analyzer”, “Debugging Timing Failures
in the Timing Analyzer”, “LogicLock Assignments”, “Tips for Analyzing Failing
Clock Paths that Cross Clock Domains”, “Flatten the Hierarchy During Synthesis”,
“Fast Input, Output, and Output Enable Registers”, and “Hierarchy Assignments”
Added the “Spine
Clock Limitations” section
Removed the Change
State Machine Encoding section from page 19
Minor text edits
throughout the chapter
in “Initial Compilation: Optional Fitter Settings” section
information to “Resource Utilization” section
information to “Duplicate Logic for Fan-Out Control” section
Added links to
Additional edits and
updates throughout chapter
Added links to
Added “Debugging Timing Failures in the Timing
Timing Analyzer references
Time Optimization Techniques section to new Reducing
Compilation Time chapter
to Timing Closure Floorplan
Compilation Setting and Early Timing Estimation sections to new Reducing Compilation Time chapter
As FPGA designs grow larger in density, the ability to analyze the
design for performance, routing congestion, and logic placement is critical to meet the
design requirements. This chapter
discusses how the Chip Planner and Logic Lock regions
help you improve your design's floorplan.
Design floorplan analysis helps you close
timing and ensures optimal performance in highly complex designs. With its analysis
capabilities, the Quartus® Prime Chip Planner helps you
close timing quickly on your designs. You can use the Chip Planner together with
Logic Lock regions to compile your designs
hierarchically and assist with floorplanning. Additionally, use partitions to
preserve placement and routing results from individual compilation runs.
You can perform design analysis, as well as
create and optimize the design floorplan with the Chip Planner. To make I/O
assignments, use the Pin Planner.
Note: As a best practice, define
resource placement with iterative design flows. Use techniques like the Early Place
Flow to guide your floorplanning decisions before setting hard placement
constraints.
For information about the Early Place Flow, refer to the
Quartus® Prime Pro Edition User Guide: Compiler.
For information about floorplanning a Partial Reconfiguration
design, refer to the
Quartus® Prime Pro Edition User Guide: Partial Reconfiguration.
The Chip Planner simplifies floorplan
analysis by providing visual display of chip resources.
With the Chip
Planner, you can view post-compilation placement, connections, and routing paths. You can also
make assignment changes, such as creating and deleting resource
assignments.
The Chip Planner showcases:
Relative resource usage
Detailed routing information
Fan-in and fan-out connections between nodes
Timing paths between registers
Delay estimates for paths
Routing congestion information
Starting the Chip Planner
To start the Chip Planner, select
Tools > Chip Planner.
You can also start the Chip Planner by the following
methods:
Click the Chip Planner icon on the
Quartus® Prime software
toolbar.
In the
following tools, right-click any chip resource and select Locate > Locate in Chip
Planner:
Logic Lock Regions Window
Technology Map Viewer
Report Timing panel of the Timing
Analyzer
Chip Planner GUI Components
Chip Planner Toolbar
The Chip Planner toolbar provides
powerful tools for visual design analysis.
You can access Chip Planner
commands either from the View
menu or by clicking the icons in the toolbars.
The Chip Planner allows you to control the
display of resources.
Layers Settings Pane
With the Layers Settings pane, you can manage the
graphic elements that the Chip Planner displays.
You open the Layers Settings pane by clicking View > Layers Settings. The Layers
Settings pane offers layer presets, which group resources that are often
used together. The Basic, Detailed, and Floorplan
Editing default presets are useful for general assignment-related
activities. You can also create custom presets tailored to your
needs.
As you optimize your design floorplan, you
might have to locate a path or node in the Chip Planner more than once.
The Locate History window lists all the nodes and paths you
have displayed using a Locate in Chip Planner command,
providing easy access to the nodes and paths of interest to you.
If you locate a required path from the Timing Analyzer Report Timing pane, the Locate History window displays the required clock path. If you locate an
arrival path from the Timing Analyzer Report Timing
pane, the Locate History window displays the path from
the arrival clock to the arrival data. Double-clicking a node or path in the Locate History window displays the selected node or path in the
Chip Planner.
Chip Planner Floorplan Views
The Chip Planner uses a hierarchical
zoom viewer that shows various abstraction levels of the targeted Intel device.
As you zoom in, the level
of abstraction decreases, revealing more details about your design.
Bird’s Eye View
The Bird’s Eye View displays a high-level
picture of resource usage for the entire chip and provides a fast and efficient way
to navigate between areas of interest in the Chip Planner.
The Bird’s Eye View is particularly useful when the
parts of your design that you want to view are at opposite ends of the
chip, and you want to quickly navigate between resource elements without losing your frame
of reference.
The Properties window displays detailed properties of the objects (such
as atoms, paths, Logic Lock regions, or routing
elements) currently selected in the Chip Planner. To display the Properties window, right-click the object and select
View > Properties.
The Chip Planner allows you to locate and report details on various elements of your
design, such as viewing available clock networks, routing congestion, I/O banks, and
high-speed serial interfaces in the floorplan.
The following sections describe how to view various design elements in the Chip
Planner.
Viewing Architecture-Specific Design Information
The Chip Planner allows you to view
architecture-specific information related to your design.
By enabling the
options in the Layers
Settings pane, you can view:
Device routing resources used by your design—View how blocks
are connected, as well as the signal routing that connects the blocks.
LE configuration—View
logic element (LE) configuration in your design. For example, you can view which
LE inputs are used; whether the LE utilizes the register, the look-up table
(LUT), or both; as well as the signal flow through the LE.
ALM configuration—View ALM configuration in your design. For
example, you can view which ALM inputs are used; whether the ALM utilizes the
registers, the upper LUT, the lower LUT, or all of them. You can also view the
signal flow through the ALM.
I/O configuration—View device I/O resource usage. For example, you
can view which components of the I/O resources are used, whether the delay chain
settings are enabled, which I/O standards are set, and the signal flow through
the I/O.
PLL configuration—View phase-locked loop (PLL) configuration in
your design. For example, you can view which control signals of the PLL are used
with the settings for your PLL.
Timing—View the delay between the inputs and outputs of FPGA
elements. For example, you can analyze the timing of the DATAB input to the COMBOUT output.
When you enable a clock region layer in the Layers Settings pane, you display the areas of the chip
that are driven by global and regional clock networks.
When the selected device does
not contain a given clock region, the option for that category is unavailable in the
Figure 38. Clock Regions
Depending on the clock layers that you activate in the
Layers Settings pane, the Chip
Planner displays regional and global clock regions in the device, and the
connectivity between clock regions, pins, and PLLs.
Clock regions appear as rectangular overlay boxes with labels
indicating the clock type and index. Select a clock network region by clicking
the clock region. The clock-shaped icon at the top-left corner indicates that
the region represents a clock network region.
Spine/sector clock regions have a dotted vertical line in the
middle. This dotted line indicates where two columns of row clocks meet in a
To change the color in which the Chip Planner displays clock
regions, select Tools > Options > Colors > Clock Regions.
The Chip Planner provides a visual representation of a design's clock sector utilization.
To generate the report in the Chip Planner:
In the Tasks pane, double-click Report Clock Sector Utilization to open the Report Clock Sector Utilization dialog box.
If you want the report to include the source nodes, turn on Report source nodes.
The equivalent Tcl command appears at the bottom of the dialog box.
The report output shows the most used clock sectors.
The Report pane displays a list of clock sectors, with colors according to utilization. The clock sector with the highest utilization appears in red, and the sector with the lowest utilization appears in blue.
You can turn on or off the sector visibility from the Report pane. You can also highlight nodes, if applicable.
The Report Routing
Utilization task allows you to determine the percentage of routing
resources in use following a compilation.
This feature identifies areas
that lack routing resources, helping you make design changes to meet routing
congestion requirements.
To view the routing congestion in the Chip Planner:
In the Tasks pane,
double-click the Report Routing
Utilization command to launch the Report Routing Utilization dialog box.
Click Preview in the Report Routing Utilization dialog
box to preview the default congestion display.
Change the Routing Utilization
Type to display congestion for specific resources.
The default display uses dark blue for 0% congestion (blue indicates zero
utilization) and red for 100%. You can adjust the slider for Threshold percentage to change the congestion
threshold level.
The congestion map helps you determine whether you can modify the floorplan,
or modify the RTL to reduce routing congestion. Consider:
The routing congestion map uses the color and shading of logic resources to
indicate relative resource utilization; darker shading represents a greater
utilization of routing resources. Areas where routing utilization exceeds the
threshold value that you specify in the Report
Routing Utilization dialog box appear in red.
To identify a lack of routing resources, you must investigate each routing
interconnect type separately by selecting each interconnect type in turn in the
Routing Utilization Settings dialog
box.
The Compiler's messages contain information about average and peak
interconnect usage. Peak interconnect usage over 75%, or average interconnect
usage over 60%, can indicate difficulty fitting your design. Similarly, peak
interconnect usage over 90%, or average interconnect usage over 75%, indicates an
increased chance of not getting a valid fit.
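The thresholds above can be captured in a small helper when you post-process compilation results in a script. The following is a minimal sketch in plain Tcl; the procedure name and the returned labels are assumptions made for illustration, and the code does not read any Quartus® Prime report itself. You supply the average and peak percentages from the Compiler's messages.

```tcl
# Classify fit risk from average and peak interconnect usage (percent).
# Thresholds follow the guidance in the text; names and labels are illustrative.
proc classify_interconnect_usage {average peak} {
    if {$peak > 90 || $average > 75} {
        return "high risk: increased chance of no valid fit"
    }
    if {$peak > 75 || $average > 60} {
        return "warning: design may be difficult to fit"
    }
    return "ok"
}

# Example calls with made-up usage numbers.
puts [classify_interconnect_usage 45 70]
puts [classify_interconnect_usage 62 70]
puts [classify_interconnect_usage 62 93]
```

The stricter condition is checked first so that a design exceeding both sets of thresholds is reported at the higher risk level.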
To view the I/O bank map of the device in
the Chip Planner, double-click Report All I/O Banks in
the Tasks pane.
Viewing High-Speed Serial Interfaces (HSSI)
The Chip Planner
displays a detailed block view of the receiver and transmitter channels of the
high-speed serial interfaces. To display the HSSI block view, double-click Report HSSI Block Connectivity in the Tasks pane.
Arria® 10 HSSI
Viewing the Source and Destination of Placed Nodes
The Chip Planner allows you to view the registered fan-ins or fan-outs of nodes in compiled designs with the Report Registered Connections task. Unlike the Generate Fanin/Fanout connections report, this report shows the source and destination nodes without connection lines, which can obscure the view.
In the Chip Planner, select one or more nodes.
In the Tasks pane,
double-click Report Registered
Connections.
Select the options from the dialog box, and click OK.
The Reports pane displays the registered source and destination nodes. Turn the nodes on or off to switch their visibility in the graphic view.
The Chip Planner can display the immediate fan-in or fan-out
connections for the selected atom.
For example, when you view the immediate fan-in for a logic resource, you see
the routing resource that drives the logic resource. You can generate immediate
fan-ins and fan-outs for all logic resources and routing resources.
To display the immediate fan-in or fan-out connections, click View > Generate Immediate Fan-In Connections or View > Generate Immediate Fan-Out Connections.
To remove the connections displayed, use the Clear Unselected Connections icon in the Chip Planner toolbar.
Viewing Selected Contents
You can view a detailed report of the contents of any area that
you select in the Chip Planner. When you view the contents of a selected area, Chip
Planner generates a hierarchical, color coded list of the design elements in the
selection. This functionality allows you to quickly determine where the Compiler places
specific modules of the design.
Follow these steps to view selected contents in the Chip Planner:
In the Tasks pane,
double-click Report Selection Contents.
The Report Selection Contents dialog box
appears.
Under Report design instances in
selection, turn on or off Show
registers names and Show
combinational names to display names of those types in the
report.
Figure 42. Report Selection
Contents Dialog Box
Click OK. The report
generates and displays the list of selected elements in the Reports pane.
Figure 43. Viewing Selected Contents
To customize the color coding of report folders, right-click any report, and
then click Properties. You can customize the
Report Name, Report Color, and
the Highlighted Area Minimum Size for the report.
Figure 44. Selected Entities Report Properties
Exploring Paths in the Chip Planner
Use the Chip Planner to explore paths between
logic elements. The following examples use the Chip Planner to traverse paths from the
Timing Analysis report.
Analyzing Connections for a Path
To determine the elements forming a selected
path or connection in the Chip Planner, click the Expand
Connections icon in the Chip Planner toolbar.
With the Show Delays feature,
you can view timing delays for paths
in Timing Analyzer reports.
To access this feature,
click View > Show Delays in the main menu. Alternatively, click the Show Delays icon in the Chip Planner toolbar. To see the partial delays on the selected
path, click the “+” sign next to the path delay displayed in the
Locate History window.
For example, you can view the delay between two logic resources or
between a logic resource and a routing resource.
Figure 46. Show Delays Associated in a
Timing Analyzer Path
Viewing Routing Resources
With the Chip Planner and the Locate History window, you can view the routing
resources that a path or connection uses.
You can also select and display
the Arrival Data path and the Arrival Clock path.
Figure 47. Show Physical Routing
In the Locate History window,
right-click a path and select Show Physical
Routing to display the physical path. To adjust the display,
right-click and select Zoom to
Figure 48. Highlight Routing
To see the rows and columns where the Fitter routed the path,
right-click a path and select Highlight Routing.
You can view location assignments in the
Chip Planner by
selecting the appropriate layer, or any custom
preset that displays block utilization in the Layers
Settings pane.
The Chip Planner displays assigned resources in a predefined color (gray, by
default).
Figure 49. Viewing Assignments in the Chip
Planner
To create or move an assignment, or to
make node and pin location assignments to Logic Lock regions, drag the selected resource to a new location.
The Fitter applies the assignments that you create during the next
compilation.
Viewing High-Speed and Low-Power Tiles in the Chip Planner
Some Intel devices have ALMs that can operate in either high-speed mode
or low-power mode.
The power mode is set during the fitting process in the
Quartus® Prime software. These ALMs are grouped together to
form larger blocks, called
tiles.
To view a power map, double-click Tasks > Core
Reports > Report High-Speed/Low-Power Tiles after running the Fitter. The Chip Planner displays low-power and
high-speed tiles in contrasting colors; yellow tiles operate in a high-speed mode,
while blue tiles operate in a low-power
mode.
Figure 50. High-Speed and Low Power Tiles in an
Arria® 10 Device
Creating Partitions and Logic Lock Regions with the Design Partition Planner and the Chip Planner
Using Logic Lock regions with design partitions allows you to preserve the
location of a block while the Fitter works in other portions of the design.
When you use the Design Partition Planner with the Chip Planner, you can create
partitions and Logic Lock regions in a way that benefits
both the connectivity and physical locations of entities.
To use this technique in a
Quartus® Prime Pro Edition design:
Compile the design.
Open the Chip Planner and the Design Partition Planner.
Click Tools > Chip Planner
Click Tools > Design Partition Planner
In the Chip Planner
window, go to the Tasks pane, and
double-click Report Design
Partitions.
The Report Design Partitions task causes the Chip Planner to
display the physical locations of design entities using the same colors that the
entities displayed in the Design Partition Planner.
In the Chip Planner, click View > Bird's Eye View
The Bird's Eye View
In the Design Partition Planner, drag all the larger entities
out from their parents.
Alternatively, you can right-click the entity and click
Extract from Parent.
The Chip Planner displays the physical placement of the
entities shown in the Design Partition Planner, with consistent colors between
the two tools. You can view physical placement in the Chip Planner and
connectivity in the Design Partition Planner.
Identify entities that are unsuitable to place in Logic Lock regions:
The Chip Planner shows an entity to be physically
dispersed over noncontiguous areas of the device
The Design Partition Planner shows an entity to have a
large number of connections to other entities.
Return entities that are unsuitable for Logic Lock regions to their parent by dragging
them into the parent entity.
Alternatively, right-click the entity and click Collapse to Parent
Create a partition for each remaining entity by right-clicking
the entity, and then clicking Create Design
Partition.
Create a Logic Lock region
for each partition by right-clicking the partition, and then clicking Create Logic Lock
Region.
By default, when you open a compiled design, the Design Partition Planner
displays the design as a single top-level entity, containing lower-level entities.
If the Design Partition Planner has opened the design previously, the design appears
in its last state.
Figure 51. Top-Level Entity in the Design Partition Planner
To show connectivity between entities, extract entities from the
top-level entity by dragging them into the surrounding white space, or by
right-clicking an entity and clicking Extract from
Parent on the shortcut menu.
When you extract entities, Design Partition Planner
draws the connection bundles between entities, showing the number of connections
between pairs of entities.
Figure 52. Partitioned Design with Connection Bundles
To customize the appearance of connection bundles or to set
thresholds for connection counts, click View > Bundle Configuration, and set the necessary options in the Bundle Configuration dialog box.
To see bundles containing failing paths, open the Timing Analyzer, and then click View > Show Timing Data in the Design Partition Planner. Bundles containing failing paths
are displayed in red, as are entities having nodes that reside on failing
paths.
To see detailed information about the connections in a bundle,
right-click the bundle, and then click Bundle
Properties to open the Bundle Properties
dialog box.
To switch between connectivity display mode and hierarchical
display mode, click View > Hierarchy Display. Alternatively, click and hold the hierarchy icon
in the top-left corner of any entity to switch temporarily to a hierarchy
display.
You can run procedures and specify the
settings described in this chapter in a Tcl script. You can also run some
procedures at a command prompt.
You can use the same command format to modify an existing assignment.
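For illustration, a scripted placement region assignment might look like the sketch below, which reuses the PLACE_REGION form that appears in the reserved-region example in this chapter. The instance path a|b|c and the coordinates are placeholders, not values from a real design.

```tcl
# Create a placement region for a hypothetical instance a|b|c.
set_instance_assignment -name PLACE_REGION -to a|b|c "X10 Y10 X20 Y20"

# Later, the same format with different coordinates to change the region.
set_instance_assignment -name PLACE_REGION -to a|b|c "X10 Y10 X30 Y20"
```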
All instances with a routing region assignment must have a respective placement
region; the routing region must fully contain the placement region.
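The containment rule can be checked in a script before you write the assignments. The following is a minimal sketch in plain Tcl, assuming each region is a single rectangle given as a list of corner coordinates {x0 y0 x1 y1}. The procedure name is hypothetical, and it is a standalone helper, not a Quartus® Prime command.

```tcl
# Return 1 if rectangle "outer" fully contains rectangle "inner", else 0.
# Each rectangle is a list {x0 y0 x1 y1} with x0 <= x1 and y0 <= y1.
proc region_contains {outer inner} {
    lassign $outer ox0 oy0 ox1 oy1
    lassign $inner ix0 iy0 ix1 iy1
    expr {$ox0 <= $ix0 && $oy0 <= $iy0 && $ox1 >= $ix1 && $oy1 >= $iy1}
}

# Illustrative values: routing region {5 5 30 30}, placement region {10 10 20 20}.
set routing   {5 5 30 30}
set placement {10 10 20 20}
if {![region_contains $routing $placement]} {
    puts "Routing region does not fully contain the placement region"
}
```

A region made of several rectangles would need a more involved check; this sketch covers only the single-rectangle case.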
Specify a Region as Reserved
The following assignment reserves an existing region:
set_instance_assignment -name RESERVE_PLACE_REGION -to <node names> ON
You can only reserve placement regions.
Specify a Region as Core Only
By default, the
Quartus® Prime Pro Edition software includes pins in Logic Lock assignments. To specify a region as core only (that is, periphery logic in the instance is not
constrained), use the following assignment:
set_instance_assignment -name CORE_ONLY_PLACE_REGION -to <node names> ON
By default, the
Quartus® Prime software
constrains every child instance to the Logic Lock region of its parent. Any constraint to a child
instance intersects with the constraint of its ancestors. In the
following example, all logic beneath “a|b|c|d” is constrained to the box
(10,10), (15,15), and not (0,0), (15,15). This
result occurs because the child constraint intersects with the parent
constraint.
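The intersection behavior can be reproduced with a few lines of plain Tcl. The sketch below assumes, for illustration only, that the parent region is the box (10,10), (20,20) and that the child a|b|c|d is assigned the box (0,0), (15,15). The procedure name is hypothetical, and the code models only the rectangle arithmetic, not the Fitter.

```tcl
# Intersect two rectangles given as lists {x0 y0 x1 y1}.
# Returns the overlapping rectangle, or an empty list if they do not overlap.
proc intersect_regions {a b} {
    lassign $a ax0 ay0 ax1 ay1
    lassign $b bx0 by0 bx1 by1
    set x0 [expr {max($ax0, $bx0)}]
    set y0 [expr {max($ay0, $by0)}]
    set x1 [expr {min($ax1, $bx1)}]
    set y1 [expr {min($ay1, $by1)}]
    if {$x0 > $x1 || $y0 > $y1} {
        return {}
    }
    return [list $x0 $y0 $x1 $y1]
}

# Assumed parent box (10,10),(20,20) and child box (0,0),(15,15).
puts [intersect_regions {10 10 20 20} {0 0 15 15}]  ;# prints: 10 10 15 15
```

An empty result corresponds to a child constraint that lies entirely outside its parent's region.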
By default, a Logic Lock region
constraint allows logic from other instances to share the same region. These
assignments place instance c and instance g in the
same location. This strategy is useful if instance c and instance
g are heavily interacting.
Optionally reserve an entire Logic Lock
region for one instance and any of its subordinate instances.
set_instance_assignment -name PLACE_REGION -to a|b|c "X10 Y10 X20 Y20"
set_instance_assignment -name RESERVE_PLACE_REGION -to a|b|c ON
# The following assignment causes an error. The logic in e|f|g is not
# legally placeable anywhere:
# set_instance_assignment -name PLACE_REGION -to e|f|g "X10 Y10 X20 Y20"
# The following assignment does *not* cause an error, but is effectively
# constrained to the box (20,10), (30,20), since the (10,10),(20,20) box is reserved
# for a|b|c
set_instance_assignment -name PLACE_REGION -to e|f|g "X10 Y10 X30 Y20"
Analyzing and Optimizing the Design Floorplan Revision History
revision history applies to this chapter:
Table 21. Document Revision History
Added new "Viewing Selected Contents" topic that describes a
new report listing selected design elements.
Added topic: Viewing Clock Sector Utilization
Added topic: Viewing the Source and Destination of Placed
Generating Fan-In and Fan-Out Connections to Viewing Fan-In
and Fan-Out Connections of Placed Resources.
recommendations for using iterative methods for floorplanning.
Changed instances of LogicLock Plus to Logic Lock.
Added support for auto-sized Logic Lock
Added support for empty Logic Lock
Added topics: Considerations on Using Auto
Sized Regions, Creating Partitions and Logic Lock Regions with the Design
Partition Planner and Chip Planner.
Chapter reorganization and content update.
Added figures: Clock Regions, Path List in the Locate History Window, Show Physical
Routing, Using the Add Rectangle Feature, Using the Subtract Rectangle
Feature, Creating a Hole in a LogicLock Region, Noncontiguous LogicLock
Region, Routing Regions, Logic Placed Outside of an Empty Region.
Updated figures: HSSI Channel Blocks,
Highlight Routing, High-Speed and Low Power Tiles in an Arria 10 Device, Show
Delays Highlight Routing, Viewing Assignments in the Chip Planner, LogicLock
Plus Regions Window, Using the Merge LogicLock Plus Region Command.
Created topics: Adding
Rectangle to a LogicLock Plus Region, Subtracting
Rectangle from a LogicLock Plus Region.
Moved topic: Viewing Critical
Paths to Timing Closure and Optimization
chapter and renamed to Critical Paths.
Renamed topic: Creating Non-Rectangular
LogicLock Plus Regions to Merging LogicLock Plus
Renamed topic: Chip Planner
Overview to Design Floorplan Analysis in the Chip
Renamed chapter from Analyzing and
Optimizing the Design Floorplan with the Chip Planner to Analyzing and Optimizing the Design Floorplan.
Implemented Intel rebranding.
Added topic describing how to create a hole in a LogicLock
Updated information on creating LogicLock Plus
Changed instances of Quartus II to Quartus Prime.
Added information on how to use LogicLock
Added information about color coding of
Updated description of Virtual Pins assignment
to clarify that assigned input is not available.
Removed HardCopy device information.
Updated “Viewing Routing Congestion” section
Updated references to Quartus UI controls for the Chip
Removed survey link.
Updated for the 11.0
Edited “LogicLock Regions”
Updated “Viewing Routing Congestion”
Updated Figures 15-4, 15-9, 15-10, and
Added Figure 15-6
Updated for the 10.1
to Timing Closure Floorplan; removed “Design Analysis Using the Timing Closure
Added links to
online Help topics
LogicLock Regions with the Design Partition Planner” section
Critical Paths” section
Updated format of
Document revision History table
device information throughout
sections related to the Timing Closure Floorplan for older device families. (For
information on using the Timing Closure Floorplan with older device families,
refer to previous versions of the Quartus Prime Handbook, available in the