Commands used:
$ pwd
/home/centos/inference_engine_cpp_samples_build/intel64/Release
$ /opt/intel/oneapi/inspector/2021.3.0/bin64/inspxe-cl -c mi3 ./classification_sample_async -m /opt/intel/openvino_2021.3.394/deployment_tools/open_model_zoo/tools/downloader/intel/image-retrieval-0001/FP16-INT8/image-retrieval-0001.xml -i /home/centos/images -nt 8
$ /opt/intel/oneapi/inspector/2021.3.0/bin64/inspxe-cl -report observations
The Benchmark App was used as a reference for maximizing inference performance. It contains code that collects latency statistics: the InferRequestsQueue class has a private std::vector<double> _latencies member that stores every latency value in order to calculate the median latency.
Because this vector keeps growing for as long as the application runs, using benchmark_app for stress testing is not recommended.
Removing the latency-related snippets from the code makes the memory usage stable.
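To illustrate why the memory grows, here is a minimal sketch of the pattern. It is not the actual benchmark_app source: the class names UnboundedLatencies and BoundedLatencies and the helper median_of are hypothetical, and the fixed-size window is only one possible way to cap memory if you still want an approximate median during a long run.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Median of a copy of the samples (0.0 for an empty set).
inline double median_of(std::vector<double> v) {
    if (v.empty()) return 0.0;
    std::sort(v.begin(), v.end());
    const std::size_t mid = v.size() / 2;
    return v.size() % 2 == 1 ? v[mid] : (v[mid - 1] + v[mid]) / 2.0;
}

// Hypothetical stand-in for the pattern described above: every latency
// sample is kept, so memory grows with the number of inferences.
class UnboundedLatencies {
public:
    void add(double ms) { samples_.push_back(ms); }
    std::size_t size() const { return samples_.size(); }
    double median() const { return median_of(samples_); }
private:
    std::vector<double> samples_;
};

// Hypothetical alternative: keep only the most recent `capacity` samples
// in a ring buffer, so memory stays constant during a long run.
class BoundedLatencies {
public:
    explicit BoundedLatencies(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
        samples_.reserve(capacity);
    }
    void add(double ms) {
        if (samples_.size() < capacity_) {
            samples_.push_back(ms);   // still filling the window
        } else {
            samples_[next_] = ms;     // overwrite the oldest sample
        }
        next_ = (next_ + 1) % capacity_;
    }
    std::size_t size() const { return samples_.size(); }
    double median() const { return median_of(samples_); }
private:
    std::size_t capacity_;
    std::size_t next_ = 0;
    std::vector<double> samples_;
};
```

With the bounded version the reported median covers only the most recent window, not the whole run, so it is a trade-off and not a drop-in replacement.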