37.3.3. Warp IP Latency
The Warp IP latency is:
- A quarter of a frame up to a whole frame, if you lock the input and output frame rates.
- Between one and two frames if you turn on Use Easy warp, or for normal latency behavior when you turn on Use single memory bounce.
- Between two and three frames for normal latency behavior when you turn off Use single memory bounce.
The IP does not control the relative timings of the input and output processes. Their timings can be asynchronous relative to each other. The flow control from the video pipeline that is in place downstream of the output determines the exact timings of the output video data from the IP.
In a simple implementation, the output frames in the system have no timing relationship with the input frames. The IP automatically adapts to their different frame rates by dropping and repeating frames of video data as necessary.
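The drop-and-repeat behavior can be illustrated with a small idealized model. The sketch below assumes that both streams start at time zero, ignores the processing latency of the IP, and assumes that each output frame shows the newest input frame that has arrived completely. The function name and this selection policy are illustrative assumptions, not the documented algorithm of the IP. Rates are expressed in hundredths of a hertz.

```c
#include <stdint.h>

/*
 * Hypothetical model, not part of the Warp IP API.
 * Returns the index of the newest input frame that has arrived completely
 * when output frame out_index starts, or -1 if none has arrived yet.
 * Both rates are in hundredths of a hertz and must be positive.
 */
int64_t newest_complete_input_frame(int64_t out_index,
                                    int64_t in_rate_chz,
                                    int64_t out_rate_chz)
{
    if (out_index < 0 || in_rate_chz <= 0 || out_rate_chz <= 0) {
        return -1;
    }
    /* Output frame k starts at t = k / out_rate. Input frame n is complete
     * at t = (n + 1) / in_rate, so floor(t * in_rate) frames are complete. */
    int64_t complete_frames = (out_index * in_rate_chz) / out_rate_chz;
    return complete_frames - 1;
}
```

In this model, a 60 Hz input feeding a 30 Hz output maps output frames 1, 2, 3 to input frames 1, 3, 5, so every other input frame is dropped. A 30 Hz input feeding a 60 Hz output maps output frames 2, 3, 4, 5 to input frames 0, 0, 1, 1, so each input frame is repeated.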
In a sophisticated implementation, you can lock the input and output frame rates so that a fixed phase relationship exists between the input and output frame timings. You must implement this frame rate locking as part of the system-level design, for example with an external genlock.
The figures show only the vertical blanking periods in active frames of video data. Assume any horizontal blanking periods are within the active video periods.
The figure shows the buffering and processing of a video frame (frame 1) through the Warp IP from the input, through the engines, to the output when Use single memory bounce is off. The latency is between two and three frames. The end of active video at the output synchronizes the start of the processing for the next frame.
When the IP receives the whole of the input frame (frame 1), the internal processing engines can read the buffered frame from memory. Because the exact timing of the internal processing is synchronized with the timing of the output frames, the IP may see a delay of up to one frame before the internal processing engines use frame 1. When the IP completes the processing of the frame, the output process begins producing the resultant video data.
Warp IP with Low Latency
The figure shows the low latency behavior when the input and output video frame rates and the input to output phase offset are locked.
The figure shows the buffering and processing of frame 1 through the Warp IP from the input, through the engines, to the output. The latency in this example can range from approximately a quarter of a frame to a whole frame, depending on the warp transform that you apply.
Locking the input and output frame rates and their phase offset allows the IP to use video data as it buffers in memory, before it stores the whole frame. The figure shows a fixed relationship between the phasing of the input and output video frames. The end of active video at the output is no longer coincident with the start of the internal processing.
The engines begin processing the data from frame 1 as soon as the IP stores sufficient input data, while it is still receiving the frame. Similarly, the output process begins producing the video data associated with frame 1 while the IP writes it to memory after processing by the engines. The delay from the start of the internal processing to the point where the output process can start reading the data from memory is a fixed amount based on the resolution of the output image the IP processes. However, the delay from the start of the input frame to the point where the internal processing can start using the frame data is a function of the warp transform. For this reason, the software API provides support for calculating these delays.
Setting for Low Latency
For low latency behavior, program the correct settings in the IP and ensure the input and output frames have a fixed offset. The video system outside of the IP controls the fixed offset. This offset has a minimum setting based on the warp transform that you require. The GenerateLatencyParams() call, in the IP’s software API, generates the values for the two settings for low-latency behavior. One of the settings for the IP programs the delay between the internal processing and the generation of active video at the output. The other setting is the delay, in clock cycles, between the input and output frames.
In addition to the required warp mesh, the GenerateLatencyParams() call requires the following information to calculate the settings:
System_clock – the frequency of core_clock, in Hz.
Video_clock – the frequency of axi4s_vid_out_0_clock, in Hz.
Full_height – the full height, in lines, of the resolution being processed including the vertical blanking.
Frame_rate – the output frame rate, in hundredths of a hertz.
Knowing the clock rates, the frame rate and the full height of the video frames enables the IP to calculate the number of clocks for each line of video data. Then, by examining the maximum line offsets implied by the required warp transform, the IP calculates how many clock cycles of delay it requires.
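The clocks-per-line arithmetic described above can be sketched as follows. This is a minimal illustration, not the implementation inside GenerateLatencyParams(). Both function names are hypothetical, and converting a line offset to a delay by simple multiplication is an assumption about how the calculation proceeds.

```c
#include <stdint.h>

/*
 * Clock cycles per video line, including blanking.
 *   clock_hz        clock frequency in Hz (System_clock or Video_clock)
 *   full_height     lines per frame, including vertical blanking
 *   frame_rate_chz  frame rate in hundredths of a hertz
 * Returns 0 if the height or frame rate is zero. The result is truncated.
 */
uint64_t clocks_per_line(uint64_t clock_hz, uint32_t full_height,
                         uint32_t frame_rate_chz)
{
    /* full_height * frame_rate_chz is the number of lines per 100 seconds. */
    uint64_t lines_per_100s = (uint64_t)full_height * frame_rate_chz;
    if (lines_per_100s == 0) {
        return 0;
    }
    return (clock_hz * 100u) / lines_per_100s;
}

/*
 * Assumed model: a maximum line offset implied by the warp transform
 * costs that many whole lines of delay, expressed in clock cycles.
 */
uint64_t line_offset_to_clocks(uint32_t max_line_offset,
                               uint64_t clocks_line)
{
    return (uint64_t)max_line_offset * clocks_line;
}
```

For example, a 148.5 MHz video clock with Full_height of 1125 lines and Frame_rate of 6000 (60 Hz) gives 2200 clocks per line. Under the assumed model, a hypothetical maximum line offset of 100 lines then corresponds to 220000 clock cycles.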
The GenerateLatencyParams() call returns two values via the structure WarpLatencyParams. These two values are:
_total_latency – the minimum delay, in axi4s_vid_out_0_clock clock cycles, between the input and output frames.
_output_latency – the value to apply to the IP using the intel_vvp_warp_set_output_latency() call. For more information, refer to Warp IP Software Code Examples.
Calling intel_vvp_warp_set_output_latency() with a non-zero clock_offset value switches the Warp IP into the low latency mode of operation. Calling intel_vvp_warp_set_output_latency() with a clock_offset value of zero puts the Warp IP into the normal latency mode.
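The two rules above (a non-zero clock_offset selects low latency, and _total_latency is the minimum input-to-output offset) can be sketched as simple checks. The types and function names below are hypothetical stand-ins written for illustration. They do not reproduce the real WarpLatencyParams layout or the signature of intel_vvp_warp_set_output_latency(), neither of which this section shows, and the 32-bit field widths are an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the two values the text names; field widths are assumed. */
typedef struct {
    uint32_t _total_latency;   /* minimum input-to-output delay, video clocks */
    uint32_t _output_latency;  /* value to program into the IP */
} latency_params_sketch_t;

typedef enum {
    WARP_MODE_NORMAL_LATENCY,
    WARP_MODE_LOW_LATENCY
} warp_mode_sketch_t;

/* Mode that results from a given clock_offset value, per the text:
 * zero selects normal latency, any non-zero value selects low latency. */
warp_mode_sketch_t mode_for_clock_offset(uint32_t clock_offset)
{
    return (clock_offset != 0) ? WARP_MODE_LOW_LATENCY
                               : WARP_MODE_NORMAL_LATENCY;
}

/* True if the offset that the external video system provides between
 * input and output frames meets the minimum the warp transform needs. */
bool system_offset_is_sufficient(const latency_params_sketch_t *params,
                                 uint32_t system_offset_clocks)
{
    return system_offset_clocks >= params->_total_latency;
}
```

A system integrator might run a check like system_offset_is_sufficient() before programming _output_latency into the IP, because the fixed offset is controlled outside the IP and the IP cannot correct an offset that is too small.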
Easy Warp IP Latency
When the IP receives the whole of the first input frame (frame 0), the output process of the Warp IP can read the buffered data from memory, which gives one frame of latency. If you do not lock the input and output frame rates, you can see a delay of up to one further frame before the IP produces the output video data, which gives a maximum total delay of two frames.