Unable to Set the Number of Streams in OpenVINO™ Inference to More than One for NPU Devices
Content Type: Product Information & Documentation | Article ID: 000098287 | Last Reviewed: 04/11/2024
Unable to set the number of streams for the NPU device to more than one in OpenVINO™ inference.
Inference executed through the NPU plugin is offloaded entirely to the NPU device; no processing occurs on the CPU.
The Meteor Lake (MTL) firmware does not support true hardware concurrency, that is, executing multiple inferences in parallel on different tiles. Therefore, the NPU plugin forces NPU_STREAMS=1.
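To check the stream count the plugin actually reports on a given system, a minimal sketch along the following lines may help. It assumes the OpenVINO™ Python API (`openvino.Core`, `available_devices`, `get_property`), the device name "NPU", and the generic "NUM_STREAMS" property key. None of these names appear in this article, which refers only to NPU_STREAMS, so they may differ between releases. The helper `query_npu_streams` is hypothetical and returns None when OpenVINO™ or the device is not present.

```python
def query_npu_streams(device="NPU", prop="NUM_STREAMS"):
    """Return the stream count reported for `device`, or None if unavailable.

    Assumptions (not stated in the article): the device is listed as "NPU"
    and the plugin exposes the generic "NUM_STREAMS" property key.
    """
    try:
        import openvino as ov  # imported lazily so the helper degrades gracefully
    except ImportError:
        return None  # OpenVINO is not installed in this environment

    core = ov.Core()
    if device not in core.available_devices:
        return None  # no such device (or its driver) on this machine

    try:
        return core.get_property(device, prop)
    except RuntimeError:
        return None  # this plugin version does not expose the property


if __name__ == "__main__":
    streams = query_npu_streams()
    if streams is None:
        print("NPU stream count could not be queried on this system.")
    else:
        print(f"NPU plugin reports streams: {streams}")
```

Given the behavior described above, the reported value on an NPU is expected to be 1. Throughput on the NPU therefore cannot be scaled by raising the stream count.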