
OpenVINO™ Inference Time Increases When Running Multiple Processes

Content Type: Product Information & Documentation   |   Article ID: 000058227   |   Last Reviewed: 03/06/2026

Environment

Ubuntu* with OpenVINO™ 2024.0

Description

The inference time doubles when two processes run inference on the same model.

  • Using OpenVINO™ to run inference on a model.
  • Inference time is about 300 ms when running a single process.
  • When two processes run concurrently, the inference time for each process increases to about 600 ms.

Resolution

  1. Add the following configuration to your application.
    Python*: core.set_property("CPU", {ov.properties.hint.enable_cpu_pinning: False})
    C++ : core.set_property("CPU", ov::hint::enable_cpu_pinning(false));
  2. Rebuild and run the application.
Note: The ov::hint::enable_cpu_pinning property replaced the legacy CONFIG_KEY(CPU_BIND_THREAD) parameter starting from OpenVINO™ 2024.0.

Additional information

  • The default value of the enable_cpu_pinning property is enabled (true or YES) on Linux* and disabled (false or NO) on Windows* and macOS*.
  • Because pinning is enabled by default on Linux, multiple processes can end up bound to the same CPU cores, competing for them and potentially doubling inference time.
  • Setting the property to false unbinds the inference threads from specific CPU cores and lets the operating system schedule them, which can help applications that run several parallel workloads.
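To illustrate what pinning means at the operating-system level, the following stdlib-only sketch (Linux-specific, not part of the OpenVINO™ API) pins the current process to a single core and then restores the original affinity mask:

```python
import os

# Save the current affinity mask (the set of cores this process may use).
original = os.sched_getaffinity(0)

# "Pin" the process: restrict it to a single core. If two processes were
# both pinned to the same core, they would have to share that core's time,
# which is how pinned inference processes can slow each other down.
os.sched_setaffinity(0, {min(original)})
pinned = os.sched_getaffinity(0)

# "Unpin": restore the full mask so the scheduler may use any core again.
os.sched_setaffinity(0, original)
restored = os.sched_getaffinity(0)
print(pinned, restored == original)
```

OpenVINO™ manages thread affinity internally; this sketch only demonstrates the core-binding concept that enable_cpu_pinning controls.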

Refer to Performance Hints and Thread Scheduling for more information.
