Intel® FPGA AI Suite: PCIe-based Design Example User Guide

ID 768977
Date 9/06/2023
Public

5.6.3. Additional dla_benchmark Options

The dla_benchmark tool is part of the example design, and the distributed runtime includes the full source code for the tool.
Table 3.  Command Line dla_benchmark Options
Command Description
-nireq=<N> This controls the number of simultaneous inference requests that are sent to the FPGA.

Typically, this should be at least twice the number of IP instances. This ensures that each IP instance can execute one inference request while dla_benchmark loads the feature data for a second inference request into the FPGA-attached DDR memory.

-b=<N>

--batch-size=<N>

This controls the batch size.

A batch size greater than 1 is created by repeating configuration data for multiple copies of the graph.

A batch size of 1 is typically best.

-niter=<N> Number of images to process in each batch.
-d=<STRING> Using -d=HETERO:FPGA,CPU causes dla_benchmark to use the OpenVINO™ heterogeneous plugin to execute inference on the FPGA, with fallback to the CPU for any layers that cannot go to the FPGA.

Using -d=HETERO:CPU or -d=CPU executes inference on the CPU, which may be useful for testing the flow when an FPGA is not available. Using -d=HETERO:FPGA may be useful for ensuring that all graph layers are accelerated on the FPGA (and an error is issued if this is not possible).

-arch_file=<FILE>

--arch=<FILE>

This specifies the location of the .arch file that was used to configure the IP on the FPGA. The dla_benchmark tool issues an error if this file does not match the .arch file used to generate the IP on the FPGA.
-m=<FILE>

--network_file=<FILE>

This points to the XML file from OpenVINO™ Model Optimizer that describes the graph. The BIN file from Model Optimizer must be in the same directory as the XML file and have the same filename (except for the file extension).
-i=<DIRECTORY> This points to the directory containing the input images. Each input file corresponds to one inference request. The files are read in order sorted by filename; set the environment variable VERBOSE=1 to see details describing the file order.
-api=[sync|async] The -api=async option allows dla_benchmark to fully take advantage of multithreading to improve performance. The -api=sync option may be used during debug.
-groundtruth_loc=<FILE> Location of the file with ground truth data. If this option is not provided, dla_benchmark does not evaluate accuracy. The file may contain classification data or object detection data, depending on the graph.
-yolo_version=<STRING> This option is used when evaluating the accuracy of a YOLOv3 or TinyYOLOv3 object detection graph. The options are yolo-v3-tf and yolo-v3-tiny-tf.
-enable_object_detection_ap This option may be used with an object detection graph (YOLOv3 or TinyYOLOv3) to calculate the object detection accuracy.
-bgr When used, this flag indicates that the graph expects input image channel data to use BGR order.
-plugins_xml_file=<FILE> This option specifies the location of the file specifying the OpenVINO™ plugins to use. This should be set to $COREDLA_ROOT/runtime/plugins.xml in most cases. If you are porting the design to a new host or doing other development, it may be necessary to use a different value.
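
As a rough illustration of how several of the options in Table 3 fit together, the following Python sketch assembles a dla_benchmark argument list. The helper names (recommended_nireq, expected_bin_path, build_dla_benchmark_args) and their parameters are hypothetical and are not part of the FPGA AI Suite. The sketch only applies the guidance in the table: a batch size of 1, at least two inference requests per IP instance, a BIN file next to the XML file, and the HETERO device string with or without CPU fallback.

```python
from pathlib import Path


def recommended_nireq(num_ip_instances: int) -> int:
    """Return the lower bound suggested in Table 3: two requests per IP instance."""
    if num_ip_instances < 1:
        raise ValueError("num_ip_instances must be at least 1")
    return 2 * num_ip_instances


def expected_bin_path(xml_path: str) -> Path:
    """Return where the BIN file must sit: same directory and base name as the XML file."""
    return Path(xml_path).with_suffix(".bin")


def build_dla_benchmark_args(
    xml_path: str,
    arch_path: str,
    image_dir: str,
    plugins_xml: str,
    num_ip_instances: int = 1,
    cpu_fallback: bool = True,
) -> list:
    """Assemble an argument list using only options described in Table 3.

    All parameters are illustrative; the values are passed through unchanged.
    """
    # HETERO:FPGA,CPU falls back to the CPU for unsupported layers;
    # HETERO:FPGA produces an error if any layer cannot run on the FPGA.
    device = "HETERO:FPGA,CPU" if cpu_fallback else "HETERO:FPGA"
    return [
        "dla_benchmark",
        "-b=1",  # a batch size of 1 is typically best
        f"-nireq={recommended_nireq(num_ip_instances)}",
        f"-d={device}",
        f"-arch_file={arch_path}",
        f"-m={xml_path}",
        f"-i={image_dir}",
        "-api=async",
        f"-plugins_xml_file={plugins_xml}",
    ]
```

The returned list can be passed to a process launcher such as subprocess.run, or joined with shlex.join for display. Checking that expected_bin_path(xml_path) exists before launching is one way to catch a misplaced BIN file early.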