FPGA AI Suite: Design Examples User Guide

ID 848957
Date 4/30/2025
Public

19.8.1. [SOC] Running the M2M Mode Demonstration Application

The M2M dataflow model uses the dla_benchmark demonstration application. The S2M bitstream supports both the M2M dataflow model and the S2M dataflow model.

You must know the host name of the SoC FPGA development kit. If you do not know the development kit host name, go back to [SOC] Determining the SoC FPGA Development Kit IP Address before continuing here.

To run inference on the SoC FPGA development kit:
  1. Open an SSH connection to the SoC FPGA development kit:
    1. Start a new terminal session.
    2. Run the following command:
      build-host:$ ssh <devkit_hostname>

      Where <devkit_hostname> is the host name you determined in [SOC] Determining the SoC FPGA Development Kit IP Address.

      Continuing the example from [SOC] Determining the SoC FPGA Development Kit IP Address, the following command would open an SSH connection:
      build-host:$ ssh arria10-62747948036a.local
  2. In the SSH terminal, run the following commands:
    export compiled_model=~/resnet-50-tf/RN50_Performance_b1.bin
    export imgdir=~/resnet-50-tf/sample_images
    export archfile=~/resnet-50-tf/<architecture file>
    cd ~/app
    export COREDLA_ROOT=/home/root/app
    export LD_LIBRARY_PATH=.
    
    ./dla_benchmark \
       -b=1 \
       -cm $compiled_model \
       -d=HETERO:FPGA,CPU \
       -i $imgdir \
       -niter=8 \
       -plugins ./plugins.xml \
       -arch_file $archfile \
       -api=async \
       -groundtruth_loc $imgdir/TF_ground_truth.txt \
       -perf_est \
       -nireq=4 \
       -bgr
    Where <architecture file> is one of the following files, depending on your development kit:
    • Agilex™ 5 FPGA E-Series 065B Modular Development Kit
      AGX5_Performance.arch
    • Agilex™ 7 FPGA I-Series Transceiver-SoC Development Kit
      AGX7_Performance_LayoutTransform.arch
    • Arria® 10 SX SoC FPGA Development Kit
      A10_Performance.arch
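
The architecture file choice and the paths exported in step 2 can be checked before launching dla_benchmark. The following is a minimal sketch only: the helper names arch_for_kit and check_inputs and the short kit keys (agx5, agx7, a10) are hypothetical and are not part of the FPGA AI Suite. The file names are the ones listed above.

```shell
# Hypothetical helpers; neither is part of the FPGA AI Suite.

# Print the architecture file name for a kit. The short kit keys
# (agx5, agx7, a10) are invented for this sketch.
arch_for_kit() {
    case "$1" in
        agx5) echo "AGX5_Performance.arch" ;;
        agx7) echo "AGX7_Performance_LayoutTransform.arch" ;;
        a10)  echo "A10_Performance.arch" ;;
        *)    echo "unknown kit: $1" >&2; return 1 ;;
    esac
}

# Return non-zero and name each path that does not exist.
check_inputs() {
    missing=0
    for item in "$@"; do
        if [ ! -e "$item" ]; then
            echo "missing: $item" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example use on the development kit before running dla_benchmark:
#   export archfile="$HOME/resnet-50-tf/$(arch_for_kit agx7)"
#   check_inputs "$compiled_model" "$imgdir" "$archfile" \
#       "$imgdir/TF_ground_truth.txt" && echo "inputs OK"
```

A check like this reports a mistyped path up front, before dla_benchmark is started.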
The dla_benchmark command prints output for each of its steps. The following example, generated using an Agilex™ 7 FPGA I-Series Transceiver-SoC Development Kit, shows output similar to what you should see for the final two steps.
[Step 11/12] Dumping statistics report
count:             8 iterations
system duration:   286.0659 ms
IP duration:       64.9427 ms
latency:           138.7106 ms
system throughput: 27.9656 FPS
number of hardware instances: 1
number of network instances: 1
IP throughput per instance: 123.1856 FPS
IP throughput per fmax per instance: 0.3080 FPS/MHz
IP clock frequency measurement: 400.0000 MHz
estimated IP throughput per instance: 137.6405 FPS (500 MHz assumed)
estimated IP throughput per fmax per instance: 0.2753 FPS/MHz
[Step 12/12] Dumping the output values
[ INFO ] Comparing ground truth file /home/root/resnet-50-tf/sample_images/TF_ground_truth.txt with network Graph_0
top1 accuracy: 100 %
top5 accuracy: 100 %
[ INFO ] Get top results for "Graph_0" graph passed
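
The throughput figures in the sample report above are arithmetically consistent with each other, and a quick cross-check can help when reading your own results. The sketch below assumes relationships inferred from the sample numbers, not from a documented definition: throughput is the iteration count divided by the duration (with -b=1, one iteration is one frame), and the per-fmax figures are throughput divided by the clock frequency in MHz. The helper names fps and fps_per_mhz are hypothetical.

```shell
# Hypothetical helpers for cross-checking the sample report;
# they are not part of dla_benchmark.

# Frames per second from a frame count and a duration in milliseconds.
fps() {
    awk -v n="$1" -v ms="$2" 'BEGIN { printf "%.2f\n", n * 1000 / ms }'
}

# Throughput normalized by clock frequency (FPS per MHz).
fps_per_mhz() {
    awk -v f="$1" -v mhz="$2" 'BEGIN { printf "%.4f\n", f / mhz }'
}

fps 8 286.0659              # compare with "system throughput"
fps 8 64.9427               # compare with "IP throughput per instance"
fps_per_mhz 123.1856 400    # compare with "IP throughput per fmax per instance"
fps_per_mhz 137.6405 500    # compare with the estimated per-fmax figure
```

Because the report prints rounded durations, values recomputed this way can differ from the reported ones in the last decimal place. That is why the throughput helper prints only two decimals.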