8.4.1. The streaming_inference_app Application
The streaming_inference_app application is an OpenVINO™-based application. It loads a given precompiled ResNet50 network, then creates inference requests that are executed asynchronously by the Intel® FPGA AI Suite IP.
The resulting tensors are captured from the EMIF using the mSGDMA controller. Postprocessing in software involves converting the output tensors to floating point, mapping the values to the appropriate image classification labels, sorting the results, and selecting the top five classification results.
For each inference, the result is displayed on the terminal. The results of the first 1000 inferences are also logged to a results.txt file in the application folder.
To run inferences, you also need a compiled network binary file and an .arch file (which describes the Intel® FPGA AI Suite IP parameterization). Both files have been copied to the /home/root/resnet-50-tf directory.
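You can verify that both files are present before running the application. The listing below assumes the file names used in the usage example that follows:

# ls /home/root/resnet-50-tf
A10_Performance.arch  RN50_Performance_no_folding.bin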
root@arria10-1ac87246f24f:~# cd /home/root/app
root@arria10-1ac87246f24f:~# export LD_LIBRARY_PATH=.
# ./streaming_inference_app -help
Usage: streaming_inference_app -model=<model> -arch=<arch> -device=<device>
Where:
  <model> is the compiled model binary file, eg /home/root/resnet-50-tf/RN50_Performance_no_folding.bin
  <arch> is the architecture file, eg /home/root/resnet-50-tf/A10_Performance.arch
  <device> is the OpenVINO device ID, eg HETERO:FPGA or HETERO:FPGA,CPU
# ./streaming_inference_app \
    -model=/home/root/resnet-50-tf/RN50_Performance_no_folding.bin \
    -arch=/home/root/resnet-50-tf/A10_Performance.arch \
    -device=HETERO:FPGA
The distribution includes a shell script utility, run_inference_stream.sh, that runs the command shown above; a sketch of such a wrapper follows.
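The following is a minimal sketch of what a wrapper like run_inference_stream.sh might contain, assuming the paths and options from the example above; the script shipped with the distribution may differ:

#!/bin/sh
# Illustrative wrapper for streaming_inference_app (the distributed
# run_inference_stream.sh may differ). Uses the example model,
# architecture file, and device from this section.
cd /home/root/app
export LD_LIBRARY_PATH=.
./streaming_inference_app \
    -model=/home/root/resnet-50-tf/RN50_Performance_no_folding.bin \
    -arch=/home/root/resnet-50-tf/A10_Performance.arch \
    -device=HETERO:FPGA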
Note that the layout transform IP core does not support folding on the input buffer. For streaming, you must use models that have been compiled with the dla_compiler command and the --ffolding-option=0 command line option.
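For reference, a compilation command of this form might be used to produce such a binary. Except for --ffolding-option=0, which this section requires, the option names and paths below are illustrative assumptions and should be checked against the dla_compiler documentation for your release:

dla_compiler \
    --march /home/root/resnet-50-tf/A10_Performance.arch \
    --network-file <path to the ResNet50 OpenVINO IR .xml file> \
    --foutput-format=open_vino_hetero \
    --ffolding-option=0 \
    --o RN50_Performance_no_folding.bin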