4.1.2. Step 2: Add Pipeline Stages and Remove Asynchronous Resets
To add pipeline stages and remove asynchronous resets from the design:
- Open the Median_filter_<version>/Step_1/rtl/hyper_pipe.sv file. This file defines a parameterizable hyper_pipe pipeline component that you can reuse in any design. The following shows this component's code with parameterizable width (WIDTH) and depth (NUM_PIPES):
    module hyper_pipe #(
        parameter WIDTH = 1,
        parameter NUM_PIPES = 1)
    (
        input clk,
        input [WIDTH-1:0] din,
        output [WIDTH-1:0] dout);

        reg [WIDTH-1:0] hp [NUM_PIPES-1:0];
        genvar i;
        generate
            if (NUM_PIPES == 0) begin
                assign dout = din;
            end
            else begin
                always @ (posedge clk)
                    hp[0] <= din;
                for (i=1;i < NUM_PIPES;i++) begin : hregs
                    always @ ( posedge clk) begin
                        hp[i] <= hp[i-1];
                    end
                end
                assign dout = hp[NUM_PIPES-1];
            end
        endgenerate
    endmodule
- Use the parameterizable module to add some levels of pipeline stages to the locations that Fast Forward recommends. The following example shows how to add latency before the q output of the dff_3_pipe module:
    . . .
    hyper_pipe #(
        .WIDTH (DATA_WIDTH),
        .NUM_PIPES(4)
    ) hp_d0 (
        .clk(clk),
        .din(d0),
        .dout(q0_int)
    );
    . . .
    always @(posedge clk)
    begin : register_bank_3u
        if(~rst_n) begin
            q0 <= {DATA_WIDTH{1'b0}};
            q1 <= {DATA_WIDTH{1'b0}};
            q2 <= {DATA_WIDTH{1'b0}};
        end else begin
            q0 <= q0_int;
            q1 <= q1_int;
            q2 <= q2_int;
        end
    end
- Remove the asynchronous resets inside the dff_3_pipe module by simply changing the registers to synchronous registers, as shown below. Refer to Reset Strategies for general examples of efficient reset implementations.
    always @(posedge clk or negedge rst_n) // Asynchronous reset
    begin : register_bank_3u
        if(~rst_n) begin
            q0 <= {DATA_WIDTH{1'b0}};
            q1 <= {DATA_WIDTH{1'b0}};
            q2 <= {DATA_WIDTH{1'b0}};
        end else begin
            q0_reg <= d0;
            q1_reg <= d1;
            q2_reg <= d2;
            q0 <= q0_reg;
            q1 <= q1_reg;
            q2 <= q2_reg;
        end
    end

    always @(posedge clk)
    begin : register_bank_3u
        if(~rst_n_int) begin // Synchronous reset
            q0 <= {DATA_WIDTH{1'b0}};
            q1 <= {DATA_WIDTH{1'b0}};
            q2 <= {DATA_WIDTH{1'b0}};
        end else begin
            q0 <= q0_int;
            q1 <= q1_int;
            q2 <= q2_int;
        end
    end
These RTL changes add five levels of pipeline to the inputs of the median_wrapper design (word0, word1, and word2 buses), and five levels of pipeline into the dff_3_pipe module. The following steps show the results of these changes.
- To implement the changes, save all design changes and click Compile Design on the Compilation Dashboard.
- Following compilation, once again view the compilation results for the Clk clock domain in the Fast Forward Details report.
The report shows the effect of the RTL changes on the Base Performance fMAX of the design. The design performance now increases to 495 MHz.
The report indicates that you can achieve further performance improvement by removing more asynchronous registers, adding more pipeline registers, and addressing the short path and long path optimization limits. The following steps describe implementation of these recommendations in the design RTL.
Note: As an alternative to completing the preceding steps, you can open and compile the Median_filter_<version>/Step_1/median.qpf project file that already includes these changes, and then observe the results.
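
Before recompiling, it can help to confirm in simulation how much latency a hyper_pipe instance adds. The following is a minimal sketch, not part of the design example: the hyper_pipe_tb testbench, its 8-bit width, the 8'hA5 marker value, and the 10-time-unit clock period are all assumptions chosen for illustration. The sketch includes a copy of the hyper_pipe component from this step so that it is self-contained, and checks that a value driven on din reaches dout after exactly NUM_PIPES clock cycles.

```systemverilog
// Copy of the hyper_pipe component shown in this step (included so the
// sketch is self-contained).
module hyper_pipe #(
    parameter WIDTH     = 1,
    parameter NUM_PIPES = 1
) (
    input              clk,
    input  [WIDTH-1:0] din,
    output [WIDTH-1:0] dout
);
    reg [WIDTH-1:0] hp [NUM_PIPES-1:0];
    genvar i;
    generate
        if (NUM_PIPES == 0) begin
            assign dout = din;
        end else begin
            always @(posedge clk)
                hp[0] <= din;
            for (i = 1; i < NUM_PIPES; i++) begin : hregs
                always @(posedge clk) begin
                    hp[i] <= hp[i-1];
                end
            end
            assign dout = hp[NUM_PIPES-1];
        end
    endgenerate
endmodule

// Hypothetical testbench: names and values below are illustrative only.
module hyper_pipe_tb;
    localparam WIDTH     = 8;
    localparam NUM_PIPES = 4;   // same depth as the hp_d0 instance above

    logic             clk = 1'b0;
    logic [WIDTH-1:0] din = '0;
    wire  [WIDTH-1:0] dout;

    hyper_pipe #(.WIDTH(WIDTH), .NUM_PIPES(NUM_PIPES)) dut (
        .clk(clk), .din(din), .dout(dout)
    );

    always #5 clk = ~clk;       // free-running clock, period of 10 time units

    initial begin
        // Drive the marker on a falling edge, away from the sampling edge.
        @(negedge clk);
        din = 8'hA5;

        // One cycle short of NUM_PIPES, the marker must not be visible yet.
        repeat (NUM_PIPES - 1) @(posedge clk);
        @(negedge clk);
        if (dout === 8'hA5)
            $error("marker reached dout too early");

        // After the final rising edge, the marker must appear on dout.
        @(posedge clk);
        @(negedge clk);
        if (dout !== 8'hA5)
            $error("expected 8'hA5 after %0d cycles, got %h", NUM_PIPES, dout);
        else
            $display("marker reached dout after %0d clock cycles", NUM_PIPES);
        $finish;
    end
endmodule
```

Because the pipeline registers have no reset, dout is unknown (X) until the first value has propagated through every stage, which is why the checks use the case-equality operators === and !==. Changing NUM_PIPES in the testbench shows how the depth parameter maps directly to added cycles of latency.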