
OpenVINO™ Inference Time Increases When Running Multiple Processes

Content Type: Product Information & Documentation   |   Article ID: 000058227   |   Last Reviewed: 03/06/2026

Environment

Ubuntu* with OpenVINO™ 2024.0

Description

The inference time doubles when two processes run inference on the same model.

  • Using OpenVINO™ to run inference on a model.
  • Inference time is about 300 ms when running a single process.
  • When two processes run concurrently, the inference time for each process increases to about 600 ms.

Resolution

  1. Add the following configuration to your application.
    Python*: core.set_property("CPU", {ov.properties.hint.enable_cpu_pinning: False})
    C++ : core.set_property("CPU", ov::hint::enable_cpu_pinning(false));
  2. Rebuild and run the application.
Note: The ov::hint::enable_cpu_pinning property replaced the legacy CONFIG_KEY(CPU_BIND_THREAD) parameter starting from OpenVINO™ 2024.0.

Additional information

  • The default value of the enable_cpu_pinning property is enabled (true or YES) on Linux* and disabled (false or NO) on Windows* and macOS*.
  • Because pinning is enabled by default on Linux, multiple processes can end up bound to the same CPU cores, competing for them and potentially doubling inference time.
  • Setting the property to false unbinds the inference threads from specific CPU cores and lets the operating system schedule them, which can help applications that run several parallel workloads.
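To illustrate what pinning means at the operating-system level, the following stdlib-only sketch (Linux-specific, not part of the OpenVINO™ API) pins the current process to a single core and then restores the original affinity mask:

```python
import os

# Save the current affinity mask (the set of cores this process may use).
original = os.sched_getaffinity(0)

# "Pin" the process: restrict it to a single core. If two processes were
# both pinned to the same core, they would have to share that core's time,
# which is how pinned inference processes can slow each other down.
os.sched_setaffinity(0, {min(original)})
pinned = os.sched_getaffinity(0)

# "Unpin": restore the full mask so the scheduler may use any core again.
os.sched_setaffinity(0, original)
restored = os.sched_getaffinity(0)
print(pinned, restored == original)
```

OpenVINO™ manages thread affinity internally; this sketch only demonstrates the core-binding concept that enable_cpu_pinning controls.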

Refer to Performance Hints and Thread Scheduling for more information.
