6.7. Performing Inference on the PCIe-Based Example Design
Monitoring Temperature on the Intel® PAC with Intel® Arria® 10 GX FPGA
As described in the "Additional Hardware Requirements for the Intel® PAC with Intel® Arria® 10 GX FPGA" section of PCIe-based Design Example Hardware Prerequisites, the Intel® PAC with Intel® Arria® 10 GX FPGA requires supplementary cooling. If the supplementary cooling is insufficient, the Intel® PAC with Intel® Arria® 10 GX FPGA hangs during inference.
Until you are certain that your cooling solution is sufficient, monitor the temperature with the sudo fpgainfo temp command.
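If you want to sample the temperature on a fixed interval rather than run the command by hand, a small polling loop can help. The `poll` helper below and its defaults are illustrative only, not part of the Intel FPGA AI Suite or OPAE tooling; on the board you would pass `sudo fpgainfo temp` as the command to run.

```shell
# Illustrative helper: run a command repeatedly at a fixed interval.
# Usage: poll INTERVAL COUNT CMD...   (COUNT=0 means run until interrupted)
poll() {
  local interval=$1 count=$2
  shift 2
  local i=0
  while [ "$count" -eq 0 ] || [ "$i" -lt "$count" ]; do
    "$@"
    i=$((i + 1))
    sleep "$interval"
  done
}

# On the Intel PAC card (runs until you press Ctrl-C):
#   poll 10 0 sudo fpgainfo temp
```

Keeping the interval at 10 seconds or less gives you a chance to stop inference before the card overheats and hangs.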
Performing Inference Using JIT Mode
In JIT (just-in-time) mode, the dla_benchmark demonstration application invokes the dla_compiler command at run time to compile the neural network graph.
If you do not have images and ground truth files, you can omit the optional -i and -groundtruth_loc parameters in the command that follows. If you omit these parameters, the dla_benchmark demonstration application generates randomized image data.
The value for $curarch must match the bitstream that you programmed in Programming the FPGA Device.
imagedir=$COREDLA_WORK/demo/sample_images
xmldir=$COREDLA_WORK/demo/models/public/
$COREDLA_WORK/runtime/build_Release/dla_benchmark/dla_benchmark \
  -b=1 \
  -m $xmldir/resnet-50-tf/FP32/resnet-50-tf.xml \
  -d=HETERO:FPGA,CPU \
  -niter=8 \
  -plugins_xml_file $COREDLA_WORK/runtime/plugins.xml \
  -arch_file $curarch \
  -api=async \
  -perf_est \
  -nireq=4 \
  -bgr \
  -i $imagedir \
  -groundtruth_loc $imagedir/TF_ground_truth.txt
On some systems, the dla_benchmark demonstration application fails with an error similar to the following:

[Step 7/12] Loading the model to the device
Error initializing DMA: invalid parameter
Error initializing mmd dma eventfd : Bad file descriptor
terminate called after throwing an instance of 'std::system_error'
  what(): No such process
Aborted (core dumped)
To fix this error, reboot your system, rerun the environment setup script, and then run the dla_benchmark command again. The script is responsible for setting up huge pages, among other things.
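Because the runtime depends on huge pages, a quick sanity check after rebooting is to confirm that the kernel has pages reserved. The commands below are a generic Linux check, not specific to the Intel FPGA AI Suite, and the page count of 1024 is only an illustrative example, not a suite requirement.

```shell
# Show the kernel's current huge page counters (2 MB pages by default).
# HugePages_Total should be nonzero after the setup script has run.
grep HugePages /proc/meminfo

# To reserve pages manually, you could run something like
# (the count 1024 is an assumption, not a documented requirement):
#   sudo sh -c 'echo 1024 > /proc/sys/vm/nr_hugepages'
```

If HugePages_Total is 0, the setup script has not run since the reboot, and the DMA initialization error is likely to recur.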
Performing Inference Using AOT Mode
In AOT (ahead-of-time) mode, the dla_benchmark demonstration application uses a compiled network that was previously produced by the dla_compiler command when you followed the steps in Running the Graph Compiler.
To use AOT mode instead of JIT mode:
- Add the -cm argument to specify the file that contains the compiled network.
- Remove the -perf_est flag. The dla_benchmark demonstration application does not support performance estimation in AOT mode.
If you omit -i and -groundtruth_loc arguments, the dla_benchmark demonstration application generates random input data that is useful only for performance benchmarking.
gt_file=$COREDLA_WORK/demo/sample_images/TF_ground_truth.txt
$COREDLA_WORK/runtime/build_Release/dla_benchmark/dla_benchmark \
  -b=1 \
  -cm $COREDLA_WORK/demo/RN50_Performance_b1.bin \
  -d=HETERO:FPGA,CPU \
  -niter=8 \
  -plugins_xml_file $COREDLA_WORK/runtime/plugins.xml \
  -arch_file $curarch \
  -api=async \
  -nireq=4 \
  -bgr \
  -i $COREDLA_WORK/demo/sample_images/ \
  -groundtruth_loc $gt_file
The -cm argument points to the .bin file that you created in Running the Graph Compiler.
Inference APIs
The easiest way to evaluate the ability of the Intel® FPGA AI Suite to perform inference is to use the dla_benchmark demonstration application that is included in the example runtime and is built as part of the steps described in Programming the FPGA Device.
The example runtime also includes instructions on how to use the OpenVINO™ Python API to execute inference using the JIT style described in Performing Inference Using JIT Mode.
These instructions are located in $COREDLA_WORK/runtime/python_demos/README.md.