OpenVINO™ Benchmarking Tool
This tutorial shows you how to run the benchmark application on an 11th Generation Intel® Core™ processor with an integrated GPU. The application uses asynchronous mode to estimate deep learning inference performance and latency.
Start Docker* Container
Check if your installation has the eiforamr-full-flavour-sdk Docker* image.
docker images | grep eiforamr-full-flavour-sdk
# If the image is installed, the output contains: eiforamr-full-flavour-sdk
NOTE: If the image is not installed, continuing with these steps triggers a build that takes longer than an hour (sometimes much longer, depending on system resources and internet connection). In that case, Intel recommends installing the Robot Complete Kit with the Get Started Guide for Robots instead.
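If you prefer to script this check, the following minimal sketch (not part of the SDK) exits with a warning instead of silently triggering the long build:

if ! docker images | grep -q eiforamr-full-flavour-sdk; then
    echo "eiforamr-full-flavour-sdk image not found; continuing would trigger a build of an hour or more." >&2
    exit 1
fi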
Go to the AMR_containers folder:
cd <edge_insights_for_amr_path>/Edge_Insights_for_Autonomous_Mobile_Robots_<version>/AMR_containers
Start the Docker* container as root:
./run_interactive_docker.sh eiforamr-full-flavour-sdk:<TAG> root
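For example, with a hypothetical tag of 2021.3 (use the tag shown by docker images for your installation):

./run_interactive_docker.sh eiforamr-full-flavour-sdk:2021.3 root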
Set Environment Variables
The environment variables must be set before you can compile and run OpenVINO™ applications.
Run the following script:
source /opt/intel/openvino/bin/setupvars.sh
# or
source <OPENVINO_INSTALL_DIR>/bin/setupvars.sh
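To verify that the script ran, you can check one of the variables it exports. INTEL_OPENVINO_DIR is used here as an assumption; the exact variable names can differ between OpenVINO™ versions:

echo $INTEL_OPENVINO_DIR
# Expected: the OpenVINO installation root, for example /opt/intel/openvino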
Build Benchmark Application
Change to the samples directory and build the benchmark application with the provided build script:
cd /opt/intel/openvino/inference_engine/samples/cpp
./build_samples.sh
Once the build is successful, access the benchmark application in the following directory:
cd /root/inference_engine_cpp_samples_build/intel64/Release
# or
cd <INSTALL_DIR>/inference_engine_cpp_samples_build/intel64/Release
The benchmark_app application is available inside the Release folder.
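As a quick sanity check that the build produced the binary, you can list it and print the first lines of its help text:

cd /root/inference_engine_cpp_samples_build/intel64/Release
ls -l benchmark_app && ./benchmark_app -h | head -n 5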
Input File
Select an image file or a sample video file to provide as input to the benchmark application from the following directory:
cd /root/inference_engine_cpp_samples_build/intel64/Release
Application Syntax and Options
The benchmark application syntax is as follows:
./benchmark_app [OPTION]
In this tutorial, we recommend you select the following options:
./benchmark_app -m <model> -i <input> -d <device> -nireq <num_reqs> -nthreads <num_threads> -b <batch>

where:
<model>         Complete path to the model .xml file
<input>         Path to the folder containing an image or a sample video file
<device>        Device type to run inference on, for example GPU or CPU
<num_reqs>      Number of parallel inference requests
<num_threads>   Number of threads to use for inference on the CPU (throughput mode)
<batch>         Batch size
For complete details on the available options, run the following command:
./benchmark_app -h
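If you plan to run the benchmark repeatedly, a small wrapper script keeps the options in one place. This is a sketch, not part of the SDK; <model> and <input> are the same placeholders described above:

#!/bin/bash
# run_benchmark.sh - thin wrapper around benchmark_app (sketch)
MODEL=<model>        # complete path to the model .xml file
INPUT=<input>        # folder containing an image or a sample video file
DEVICE=${1:-GPU}     # device passed as the first argument; GPU by default
./benchmark_app -m "$MODEL" -i "$INPUT" -d "$DEVICE" -nireq 8 -nthreads 8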
Run the Application
The benchmark application is executed as shown below. This tutorial uses the following settings:
The benchmark application runs on the frozen_inference_graph model.
The number of parallel inference requests is set to 8.
The number of CPU threads to use for inference is set to 8.
The device type is GPU.
./benchmark_app -d GPU -i ~/<dir>/input/ -m /home/eiforamr/workspace/object_detection/src/object_detection/models/ssd_mobilenet_v2_coco/frozen_inference_graph.xml -nireq 8 -nthreads 8

./benchmark_app -d GPU -i /home/eiforamr/data_samples/media_samples/plates_720.mp4 -m /home/eiforamr/workspace/object_detection/src/object_detection/models/ssd_mobilenet_v2_coco/frozen_inference_graph.xml -nireq 8 -nthreads 8
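To compare devices with otherwise identical settings, you can loop over the -d values. This is a sketch; which devices are available depends on your hardware and installed plugins:

for dev in CPU GPU; do
    echo "=== $dev ==="
    ./benchmark_app -d $dev \
        -i /home/eiforamr/data_samples/media_samples/plates_720.mp4 \
        -m /home/eiforamr/workspace/object_detection/src/object_detection/models/ssd_mobilenet_v2_coco/frozen_inference_graph.xml \
        -nireq 8 -nthreads 8
done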
Expected output:
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[ INFO ] Files were added: 1
[ INFO ] /home/eiforamr/data_samples/media_samples/plates_720.mp4
[Step 2/11] Loading Inference Engine
[ INFO ] InferenceEngine:
         API version ............ 2.1
         Build .................. 2021.2.0-1877-176bdf51370-releases/2021/2
         Description ....... API
[ INFO ] Device info:
         GPU
         clDNNPlugin version ......... 2.1
         Build ........... 2021.2.0-1877-176bdf51370-releases/2021/2
[Step 3/11] Setting device configuration
[ WARNING ] -nstreams default value is determined automatically for GPU device. Although the automatic selection usually provides a reasonable performance, but it still may be non-optimal for some cases, for more information look at README.
[Step 4/11] Reading network files
[ INFO ] Loading network files
[ INFO ] Read network took 89.49 ms
[Step 5/11] Resizing network to match image sizes and given batch
[ INFO ] Network batch size: 1
[Step 6/11] Configuring input of the model
[Step 7/11] Loading the model to the device
[ INFO ] Load network took 44714.68 ms
[Step 8/11] Setting optimal runtime parameters
[Step 9/11] Creating infer requests and filling input blobs with images
[ INFO ] Network input 'image_tensor' precision U8, dimensions (NCHW): 1 3 300 300
[ WARNING ] No supported image inputs found! Please check your file extensions: bmp, dib, jpeg, jpg, jpe, jp2, png, pbm, pgm, ppm, sr, ras, tiff, tif
[ INFO ] Infer Request 0 filling
[ INFO ] Fill input 'image_tensor' with random values (image is expected)
[ INFO ] Infer Request 1 filling
[ INFO ] Fill input 'image_tensor' with random values (image is expected)
[ INFO ] Infer Request 2 filling
[ INFO ] Fill input 'image_tensor' with random values (image is expected)
[ INFO ] Infer Request 3 filling
[ INFO ] Fill input 'image_tensor' with random values (image is expected)
[ INFO ] Infer Request 4 filling
[ INFO ] Fill input 'image_tensor' with random values (image is expected)
[ INFO ] Infer Request 5 filling
[ INFO ] Fill input 'image_tensor' with random values (image is expected)
[ INFO ] Infer Request 6 filling
[ INFO ] Fill input 'image_tensor' with random values (image is expected)
[ INFO ] Infer Request 7 filling
[ INFO ] Fill input 'image_tensor' with random values (image is expected)
[Step 10/11] Measuring performance (Start inference asynchronously, 8 inference requests using 2 streams for GPU, limits: 60000 ms duration)
[ INFO ] First inference took 10.01 ms
[Step 11/11] Dumping statistics report
Count:      9456 iterations
Duration:   60066.11 ms
Latency:    51.33 ms
Throughput: 157.43 FPS
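To collect just the summary metrics from a run, you can capture the output in a log file (benchmark.log is a hypothetical name) and filter it. The exact labels may vary between OpenVINO™ versions:

./benchmark_app -d GPU -i <input> -m <model> -nireq 8 -nthreads 8 | tee benchmark.log
grep -E 'Count:|Duration:|Latency:|Throughput:' benchmark.log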
Benchmark Report
The following sample execution results were collected on an 11th Gen Intel® Core™ i7-1185GRE processor @ 2.80 GHz.
Parameter                  | Value
Read network time (ms)     | 89.49
Load network time (ms)     | 44714.68
First inference time (ms)  | 10.01
Total execution time (ms)  | 60066.11
Total number of iterations | 9456
Latency (ms)               | 51.33
Throughput (FPS)           | 157.43
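The reported numbers are internally consistent: throughput equals the iteration count divided by the total duration, and with 8 requests in flight it is roughly -nireq divided by the average latency. A quick check with awk, using the values from the table above:

awk 'BEGIN {
    # iterations / duration in seconds -> 157.43 FPS
    printf "Throughput: %.2f FPS\n", 9456 / (60066.11 / 1000)
    # nireq / latency in seconds -> ~155.9 FPS, close to the measured value
    printf "Estimate:   %.2f FPS\n", 8 / (51.33 / 1000)
}'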
Troubleshooting
For general robot issues, go to: Troubleshooting for Robot Tutorials.