ID 683353
Date 10/04/2021
Public

## 4.1.4. Step 4: Optimize Short Path and Long Path Conditions

After removing asynchronous registers and adding pipeline stages, the Fast Forward Details report suggests that short path and long path conditions limit further optimization. In this example, the longest path limits the fMAX for this specific clock domain. To increase the performance, follow these steps to reduce the length of the longest path for this clock domain.
1. To view the long path information, click the Critical Chain Details tab in the Fast Forward Details report. Review the structure of the logic around this path, and consider the associated RTL code. This path involves the node module of the node.v file. The critical path relates to the computation of registers data_hi and data_lo, which are part of several comparators.

The following shows the original RTL for this path:

    always @(*)
    begin : comparator
        if(data_a < data_b) begin
            sel0 = 1'b0; // data_a : lo / data_b : hi
        end else begin
            sel0 = 1'b1; // data_b : lo / data_a : hi
        end
    end

    always @(*)
    begin : mux_lo_hi
        case (sel0)
            1'b0 :
            begin
                if(LOW_MUX == 1)
                    data_lo = data_a;
                if(HI_MUX == 1)
                    data_hi = data_b;
            end
            1'b1 :
            begin
                if(LOW_MUX == 1)
                    data_lo = data_b;
                if(HI_MUX == 1)
                    data_hi = data_a;
            end
            default :
            begin
                data_lo = {DATA_WIDTH{1'b0}};
                data_hi = {DATA_WIDTH{1'b0}};
            end
        endcase
    end

The Compiler infers the following logic from this RTL:

• A comparator that creates the sel0 signal
• A pair of muxes that create the data_hi and data_lo signals, as the following figure shows:
Figure 99. Node Component Connections
2. Review the pixel_network.v file that instantiates the node module. Outputs of the node module that you do not use are left unconnected, so the corresponding LOW_MUX or HI_MUX code is never used. Rather than inferring muxes, use bitwise logic operations to compute the values of the data_hi and data_lo signals, as the following example shows:
    reg [DATA_WIDTH-1:0] sel0;

    always @(*)
    begin : comparator
        if(data_a < data_b) begin
            sel0 = {DATA_WIDTH{1'b0}}; // data_a : lo / data_b : hi
        end else begin
            sel0 = {DATA_WIDTH{1'b1}}; // data_b : lo / data_a : hi
        end

        // sel0 is an all-zeros or all-ones mask; the inverted mask passes
        // the other operand so each output carries exactly one input
        data_lo = (data_b & sel0) | (data_a & ~sel0);
        data_hi = (data_a & sel0) | (data_b & ~sel0);
    end
3. Recompile the design and view the Fast Forward Details report. The performance increase is similar to the estimates, and short path and long path conditions no longer limit further performance. After this step, only a logical loop limits further performance.
Figure 100. Short Path and Long Path Conditions Optimized
Note: As an alternative to completing the preceding steps, you can open and compile the Median_filter_<version>/Final/median.qpf project file that already includes these changes, and then observe the results.
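
Before recompiling, it can be useful to confirm in simulation that the bitwise form in step 2 selects the same values as the original comparator and muxes. The following is a minimal sketch of a self-checking testbench, not part of the Median filter design files. The module name tb_node_compare, the 8-bit DATA_WIDTH, and the signal names sel_bit, mask, ref_lo, and ref_hi are assumptions made for illustration. The sketch also assumes that both outputs are in use (LOW_MUX and HI_MUX both equal to 1) and that the second term of each bitwise expression uses the inverted mask.

```verilog
// Hypothetical self-checking testbench. All names and the 8-bit width are
// illustrative assumptions; they do not come from the design files.
`timescale 1ns/1ps

module tb_node_compare;
    localparam DATA_WIDTH = 8;   // assumed width, for illustration only

    reg  [DATA_WIDTH-1:0] data_a, data_b;

    // Reference model: comparator followed by two 2:1 muxes
    // sel_bit = 0 -> data_a : lo / data_b : hi
    // sel_bit = 1 -> data_b : lo / data_a : hi
    wire                  sel_bit = !(data_a < data_b);
    wire [DATA_WIDTH-1:0] ref_lo  = sel_bit ? data_b : data_a;
    wire [DATA_WIDTH-1:0] ref_hi  = sel_bit ? data_a : data_b;

    // Candidate: select bit replicated into a full-width mask, then AND/OR
    wire [DATA_WIDTH-1:0] mask    = {DATA_WIDTH{sel_bit}};
    wire [DATA_WIDTH-1:0] data_lo = (data_b & mask) | (data_a & ~mask);
    wire [DATA_WIDTH-1:0] data_hi = (data_a & mask) | (data_b & ~mask);

    integer i, j, errors;

    initial begin
        errors = 0;
        // Exhaustive sweep over every input pair (256 x 256 at 8 bits)
        for (i = 0; i < (1 << DATA_WIDTH); i = i + 1) begin
            for (j = 0; j < (1 << DATA_WIDTH); j = j + 1) begin
                data_a = i;   // assignment truncates to DATA_WIDTH bits
                data_b = j;
                #1;           // let the combinational logic settle
                if (data_lo !== ref_lo || data_hi !== ref_hi) begin
                    errors = errors + 1;
                    $display("MISMATCH a=%0d b=%0d lo=%0d hi=%0d",
                             data_a, data_b, data_lo, data_hi);
                end
            end
        end
        if (errors == 0)
            $display("PASS: mask form matches mux form");
        else
            $display("FAIL: %0d mismatches", errors);
        $finish;
    end
endmodule
```

You can run this sketch in any Verilog-2001 simulator. If the second term of either expression uses sel0 instead of its inverse, both outputs collapse to (data_a | data_b) & sel0, and the sweep reports mismatches.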