Video and Vision Processing Suite Intel® FPGA IP User Guide

ID 683329
Date 8/08/2022



3. Video and Vision Processing IPs Functional Description

Video and vision processing IPs conform to the Intel FPGA streaming video protocol.

Reset Behavior

The IPs employ a synchronous reset. System resets must have a minimum duration of 256 clock cycles. In accordance with the AXI specification, all TVALID and TREADY signals from components drive low during reset and for at least one cycle after you deassert reset.
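The reset rules above can be sketched as a small trace checker, for example for use on samples captured from a simulation. This is only an illustrative Python model: the function name `check_reset_trace`, the list-based trace format, and the error strings are assumptions, not part of any Intel tool.

```python
MIN_RESET_CYCLES = 256  # minimum reset duration stated in the text


def check_reset_trace(reset, tvalid, tready):
    """Check per-cycle 0/1 samples against the reset rules.

    Each argument is a list with one sample per clock cycle
    (hypothetical trace format). Returns a list of error strings.
    """
    errors = []
    run = 0  # length of the reset pulse currently being measured
    for cycle, rst in enumerate(reset):
        prev_rst = reset[cycle - 1] if cycle > 0 else 0
        if rst:
            run += 1
        elif prev_rst:
            # Reset has just been deasserted: check the pulse length.
            if run < MIN_RESET_CYCLES:
                errors.append(f"cycle {cycle}: reset pulse of {run} cycles is too short")
            run = 0
        # TVALID and TREADY must be low during reset and for one cycle after.
        if (rst or prev_rst) and (tvalid[cycle] or tready[cycle]):
            errors.append(f"cycle {cycle}: TVALID or TREADY high during or just after reset")
    return errors


# A 256-cycle reset followed by one quiet cycle passes the check.
good_reset = [1] * 256 + [0] * 4
good_handshake = [0] * 257 + [1] * 3
print(check_reset_trace(good_reset, good_handshake, good_handshake))
```

The checker only judges a reset pulse once it ends, so a trace that stops while reset is still asserted is not flagged as too short.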

TUSER usage

The protocol specifies a TUSER width of TDATA/8, where the TDATA width is at least 16 bits and is always divisible by 8. The two LSBs of TUSER carry protocol information: bit 1 indicates whether a packet is a control packet (full variants only) or a data packet, and bit 0 indicates the start of a new field of video. Intel video and vision processing IPs do not drive the unused bits (bit 2 upwards), so Intel Quartus Prime optimizes them away during synthesis. The IPs ignore and do not propagate any data you drive on bits 2 and upwards of TUSER.
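As a rough illustration of the TUSER rules, the following Python sketch computes the TUSER width for a given TDATA width and decodes the two LSBs. The function names are hypothetical. The polarity of bit 1 (1 meaning control packet) is an assumption that this section does not state, so check it against the protocol specification.

```python
def tuser_width(tdata_width):
    """Return the TUSER width in bits for a TDATA width in bits."""
    if tdata_width < 16 or tdata_width % 8 != 0:
        raise ValueError("TDATA width must be at least 16 bits and divisible by 8")
    return tdata_width // 8


def decode_tuser(tuser):
    """Decode the two LSBs of a TUSER value; bits 2 and upwards are ignored."""
    return {
        "start_of_field": bool(tuser & 0b01),  # bit 0
        # ASSUMPTION: bit 1 set means control packet, clear means data packet.
        "control_packet": bool(tuser & 0b10),  # bit 1
    }


print(tuser_width(24))        # a 24-bit TDATA bus gives a 3-bit TUSER
print(decode_tuser(0b101))    # bit 2 is set here but has no effect
```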


The protocol specifies that an input interface can wait for TVALID to be asserted before asserting the corresponding TREADY. However, Intel video and vision processing IP sinks assert TREADY independently of whether the input TVALID is asserted. As a result, if a third-party IP that drives a video and vision processing IP sink does not respect the AXI rule that sources must not wait for TREADY before asserting TVALID, the video pipeline still operates correctly.
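A toy cycle-based model can show why this sink behavior is robust. The Python sketch below is a deliberately simplified assumption, not a model of any specific IP: both TVALID and TREADY are treated as registered signals, and the function `simulate` and its parameters are hypothetical.

```python
def simulate(source_waits_for_tready, sink_waits_for_tvalid, beats=4, max_cycles=32):
    """Return how many beats transfer within max_cycles in a toy handshake model."""
    tvalid = tready = 0
    sent = 0
    for _ in range(max_cycles):
        # A beat transfers in any cycle where TVALID and TREADY are both high.
        if tvalid and tready:
            sent += 1
            if sent == beats:
                return sent
        has_data = sent < beats
        # A non-compliant source only asserts TVALID after it sees TREADY.
        next_tvalid = int(has_data and (tready or not source_waits_for_tready))
        # A sink may wait for TVALID; the sinks described here do not.
        next_tready = int(tvalid or not sink_waits_for_tvalid)
        tvalid, tready = next_tvalid, next_tready
    return sent


# Non-compliant source with a sink that asserts TREADY independently: data flows.
print(simulate(source_waits_for_tready=True, sink_waits_for_tvalid=False))
# Non-compliant source with a sink that waits for TVALID: deadlock, nothing transfers.
print(simulate(source_waits_for_tready=True, sink_waits_for_tvalid=True))
```

In this model the only combination that deadlocks is a source and a sink that each wait for the other, which a sink that asserts TREADY independently avoids.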

Figure 2. Example video processing pipeline

The figure shows a typical video processing pipeline comprising video ingress and egress over HDMI, frame storage to DDR, and various video processing functions controlled by a processor.

If you turn off Lite mode for the IPs, the pipeline includes protocol converters to convert from lite mode. Otherwise, the pipeline does not require them.

Video data passes along the pipeline in different formats in different places. The HDMI in connectivity IP passes clocked video to the clocked video to full-raster converter IP. That IP outputs a streaming full-raster format.

The full-raster to streaming video converter IP converts streaming full-raster data to Intel FPGA streaming video data packets. Optionally, the protocol converter IP then converts these packets to the full variant, which adds metapackets. The video data remains in this format until the end of the pipeline, where the IPs perform the reverse conversions.

Figure 3. Conversion of Intel clocked video to full-raster video data
Figure 4. Conversion of full-raster to streaming video data

Figure 5. Conversion of lite to full Intel FPGA streaming video data

The figure shows the optional conversion from lite to full variants as performed by the protocol converter.

The Intel FPGA streaming video protocol states that IPs transmit video fields in packets of pixel data. One packet carries each line of video, with the start of field indicated by tuser[0]. The protocol converter supplements the pixel data packets with image information packets and end of field packets. The protocol converter takes the contents of the image information packets from its control registers.
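The lite to full conversion described above can be sketched at the packet level. This Python model is illustrative only: the `Packet` class, the packet kind labels, and the placement of the image information packet before the first line and the end of field packet after the last line are assumptions, and the `width` and `height` arguments stand in for the protocol converter's control registers.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    kind: str                     # "line", "image_info", or "end_of_field" (hypothetical labels)
    start_of_field: bool = False  # models tuser[0] on a line packet
    payload: object = None


def lite_field(lines):
    """Build a lite field: one packet per video line, first line flagged."""
    return [Packet("line", start_of_field=(i == 0), payload=line)
            for i, line in enumerate(lines)]


def lite_to_full(packets, width, height):
    """Add image information and end of field packets around each field."""
    out = []
    lines_seen = 0
    for packet in packets:
        if packet.start_of_field:
            # ASSUMPTION: the image information packet precedes the field.
            out.append(Packet("image_info", payload={"width": width, "height": height}))
            lines_seen = 0
        out.append(packet)
        lines_seen += 1
        if lines_seen == height:
            # ASSUMPTION: the end of field packet follows the last line.
            out.append(Packet("end_of_field"))
    return out


field = lite_field([[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]])
print([p.kind for p in lite_to_full(field, width=4, height=3)])
```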

Most IPs update their behavior after the end of the current field, switching to any new control settings if required. Full variant IPs detect the end of the current field by the presence of the end of field packet. Lite variant IPs, without the benefit of metapackets, need to count the number of lines and compare this count with the value in the IMG_INFO_HEIGHT register. Alternatively, the IPs wait until they detect tuser[0], marking the start of the next field.
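The two lite variant strategies can be compared with a short Python sketch. It assumes a hypothetical input format: a list holding the tuser[0] flag of each line packet in arrival order. The function names are illustrative, and `img_info_height` stands in for the value of the IMG_INFO_HEIGHT register.

```python
def field_ends_by_line_count(sof_flags, img_info_height):
    """Return indices of lines identified as the last line of a field by counting."""
    ends = []
    count = 0
    for index, start_of_field in enumerate(sof_flags):
        if start_of_field:
            count = 0  # restart the count on the first line of a field
        count += 1
        if count == img_info_height:
            ends.append(index)
    return ends


def field_ends_by_next_start(sof_flags):
    """Return indices of last lines found by waiting for the next tuser[0]."""
    return [index - 1
            for index, start_of_field in enumerate(sof_flags)
            if start_of_field and index > 0]


# Two fields of three lines each.
flags = [True, False, False, True, False, False]
print(field_ends_by_line_count(flags, img_info_height=3))
print(field_ends_by_next_start(flags))
```

In this sketch, line counting identifies the end of a field as soon as its last line arrives, while waiting for tuser[0] cannot identify the end of the final field until another field starts.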